feat(aws_tools): add S3 File Uploader and S3 File Download tools#168
Open
leoou331 wants to merge 2 commits into
Add two new builtin tools to the aws_tools plugin so that Dify workflows can move file objects (not just text) between workflow nodes and S3:

- s3_file_uploader: takes a file variable from an upstream node and uploads it to a configurable bucket/key, optionally returning a presigned URL.
- s3_file_download: takes an s3://bucket/key URI and emits a Dify file (via create_blob_message) plus structured metadata for downstream consumption.

Why
---
The existing s3_operator only handles text payloads (text_content in, UTF-8 text out), so it can't be wired directly to a Start node 'file' input or to any tool that emits binary file variables. These two tools close that gap with the same UX and parameter conventions as s3_operator.

Implementation notes
--------------------
- Both tools are self-contained (credential-resolution helpers are inlined) so this PR does not introduce a shared utils/ module.
- They reuse the existing aws_tools provider's credentials_for_provider schema (Access Key / Secret Key / Region) and additionally accept a per-invocation aws_session_token for STS / role-assumption use cases.
- Three-language labels (en_US / zh_Hans / pt_BR) match the rest of the plugin's tools.

Validation
----------
- Static: yaml.safe_load all touched yaml files; py_compile both .py files; verified extra.python.source paths resolve correctly.
- End-to-end: packaged plugins/aws_tools/ from this branch into a .difypkg, installed it on a self-hosted Dify 1.14.2 instance, and ran a workflow [Start -> S3 Upload -> S3 Download -> End] against cn-northwest-1. status=succeeded, total_steps=4, elapsed=1.0s. Object SHA-256 verified byte-for-byte identical between local file and the S3 round-trip.

Origin
------
Implementation derived from the public S3 file uploader/download tools in r3-yamauchi/dify-my-aws-tools-plugin (Apache-2.0).
Author confirmed he is happy for these two tools to be contributed upstream into aws-samples/dify-aws-tool with no attribution requirement; comments have been translated to English to match the surrounding files.
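The credential handling described in the implementation notes (provider-level credentials plus optional per-invocation overrides such as aws_session_token) could look roughly like the sketch below. The helper name `build_client_kwargs`, the assumption that provider credentials use the same key names as the tool parameters, and the "tool parameter wins" precedence are all illustrative assumptions, not the PR's actual code.

```python
from typing import Any, Dict, Mapping, Optional

# Hypothetical mapping from tool/provider field names to boto3.client keyword
# arguments. The provider-side key names are an assumption for illustration.
_FIELD_TO_BOTO_KWARG = {
    "aws_access_key_id": "aws_access_key_id",
    "aws_secret_access_key": "aws_secret_access_key",
    "aws_session_token": "aws_session_token",
    "aws_region": "region_name",
}


def _clean(value: Any) -> Optional[str]:
    """Return a stripped string, or None for missing/blank values."""
    if value is None:
        return None
    text = str(value).strip()
    return text or None


def build_client_kwargs(
    provider_credentials: Mapping[str, Any],
    tool_parameters: Mapping[str, Any],
) -> Dict[str, str]:
    """Merge credentials, letting per-invocation values override provider ones."""
    kwargs: Dict[str, str] = {}
    for field, boto_kwarg in _FIELD_TO_BOTO_KWARG.items():
        value = _clean(tool_parameters.get(field)) or _clean(provider_credentials.get(field))
        if value is not None:
            kwargs[boto_kwarg] = value
    return kwargs


# Intended use (requires boto3, not imported here):
#     s3 = boto3.client("s3", **build_client_kwargs(credentials, parameters))
```

Fields that are blank in both places are simply omitted, so boto3's default credential chain would still apply in that case.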
…ins#3273

Same set of fixes applied to the companion PR on the upstream langgenius/dify-official-plugins repo (#3273), surfaced by gemini-code-assist review:

1. Thread safety: replace cached self.s3_client with a local boto3 client created inside each _invoke. Drops the helper functions _reset_clients_on_credential_change and _credential_signature.
2. Standardised ClientError error-code matching for NoSuchBucket / NoSuchKey (no longer relies on the dropped instance-attribute exceptions namespace).
3. Tolerate trailing slashes in the S3 key when deriving filename.
4. Safe presign_expiry parsing (None / empty / non-numeric all fall back to 3600 instead of crashing with TypeError).

Re-validated end to end: TXT / PNG / PDF / presign URL / STS session token paths all succeed with byte-for-byte SHA-256 match.
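Item 2 above (matching on the error code carried by ClientError rather than on client-bound exception classes) can be illustrated with a small sketch. It operates on the `response` dictionary that botocore attaches to a `ClientError`; the function name `describe_s3_error` and the message wording are hypothetical, not taken from the PR.

```python
from typing import Any, Mapping


def describe_s3_error(error_response: Mapping[str, Any], bucket: str, key: str) -> str:
    """Turn a botocore-style error response into a readable message.

    `error_response` is expected to have the shape of ClientError.response,
    i.e. {"Error": {"Code": "...", "Message": "..."}, ...}.
    """
    error = error_response.get("Error") or {}
    code = str(error.get("Code", ""))
    if code == "NoSuchBucket":
        return f"Bucket not found: {bucket}"
    if code == "NoSuchKey":
        return f"Object not found: s3://{bucket}/{key}"
    # Anything else is reported generically with whatever detail is available.
    detail = error.get("Message") or "unknown error"
    return f"S3 request failed ({code or 'no code'}): {detail}"


# Intended use (requires boto3/botocore, not imported here):
#     try:
#         obj = s3.get_object(Bucket=bucket, Key=key)
#     except ClientError as exc:
#         message = describe_s3_error(exc.response, bucket, key)
```

Because the check only reads the response dictionary, it does not depend on any particular client instance being cached on the tool object.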
Pushed 546e2a3 to mirror review fixes from the upstream langgenius PR (langgenius/dify-official-plugins#3273):
Re-validated end to end on Dify 1.14.2: TXT / PNG / PDF / presign URL / STS
crazywoola pushed a commit to langgenius/dify-official-plugins that referenced this pull request Jun 11, 2026
* feat(aws_tools): add S3 File Uploader and S3 File Download tools

Add two new builtin tools to tools/aws so that Dify workflows can move file objects (not just text) between workflow nodes and S3:

- s3_file_uploader: takes a file variable from an upstream node and uploads it to a configurable bucket/key, optionally returning a presigned URL.
- s3_file_download: takes an s3://bucket/key URI and emits a Dify file (via create_blob_message) plus structured metadata for downstream consumption.

Why
---
The existing s3_operator only handles text payloads (text_content in, UTF-8 text out), so it can't be wired directly to a Start node 'file' input or to any tool that emits binary file variables. These two tools close that gap with the same UX and parameter conventions as s3_operator.

Implementation notes
--------------------
- Both tools are self-contained (credential-resolution helpers are inlined) so this PR does not introduce a shared utils/ module.
- They reuse the existing aws_tools provider's credentials_for_provider schema (Access Key / Secret Key / Region) and additionally accept a per-invocation aws_session_token for STS / role-assumption use cases.
- Three-language labels (en_US / zh_Hans / pt_BR) match the rest of the plugin's tools; identity.author follows existing convention (AWS).
- Bumped manifest.yaml version from 0.0.26 to 0.0.27.
- README.md Features section updated.
- Code formatted with black (-l 100); ruff check passes clean.
- No new dependencies (boto3/botocore already in pyproject.toml).

Validation
----------
Static:
- python -m py_compile on both .py files
- yaml.safe_load on all touched yaml files
- Verified extra.python.source paths resolve correctly
- black --check + ruff check both clean on the new files

End-to-end (real run, not dry validation):
- Built a .difypkg from tools/aws/ on this branch
- Installed it on a self-hosted Dify 1.14.2 Community Edition
- Imported a workflow [Start file -> s3_file_uploader -> s3_file_download -> End], pointed at S3 in cn-northwest-1
- Triggered the workflow via the Service API with text/PNG payloads
- Result: status=succeeded, total_steps=4, elapsed ~0.4-0.8s
- Pulled both objects back from S3 via aws s3 cp and SHA-256 verified byte-for-byte identical with the local source files
- (Companion regression run on aws-samples/dify-aws-tool#168 also covered PDF binary, generate_presign_url=true, and STS aws_session_token paths with the same code; all green and SHA-256 identical.)

Origin / attribution
--------------------
Implementation derived from the public s3_file_uploader.py / s3_file_download.py in r3-yamauchi/dify-my-aws-tools-plugin (Apache-2.0). The author has confirmed he is happy for these two tools to be contributed upstream to langgenius/dify-official-plugins with no attribution requirement; comments translated to English to match surrounding files. The companion aws-samples/dify-aws-tool PR #168 contains the same code.

* fix(s3 tools): address gemini-code-assist review

Apply concrete code-review feedback from gemini-code-assist on PR #3273:

1. Thread safety / credential leakage (high-priority)
   - Move boto3 client construction from cached `self.s3_client` to a local variable inside `_invoke`. Tool instances are reused by the plugin runtime across concurrent invocations, so a cached client tied to one tenant's credentials must never leak into another execution. Creating an S3 client is lightweight (no network I/O) so there is no real cost to building it per invocation.
   - Drop the now-unused `_reset_clients_on_credential_change` and `_credential_signature` helpers (and the `Iterable` import). They tried to address the same race but were inherently fragile under concurrency.
2. Standardised exception handling in s3_file_download
   - Switch from `self.s3_client.exceptions.NoSuchBucket` / `NoSuchKey` (which depended on the cached instance attribute) to standard `ClientError` error-code matching via `exc.response["Error"]["Code"]`.
3. Robust filename extraction in s3_file_download
   - Tolerate trailing slashes in the S3 key (e.g. `s3://bucket/foo/`) so the emitted Dify file's `filename` is never empty.
4. Safe presign_expiry parsing in s3_file_uploader
   - Extracted a small `_parse_presign_expiry` helper that tolerates None / empty string / non-numeric input and falls back to the default of 3600 seconds, instead of letting `int(None)` raise TypeError when the optional Dify number field is left blank.

Validation
----------
- black -l 100 + ruff check both clean.
- End-to-end re-validation on a fresh self-hosted Dify 1.14.2: built a .difypkg from this branch, installed it, and ran the regression matrix again - text/plain, image/png, application/pdf, generate_presign_url with curl-fetch, and STS aws_session_token via `aws sts get-session-token`. All six runs returned status=succeeded; SHA-256 byte-for-byte identical on every round-trip. Unit-tested `_parse_presign_expiry` against None / "" / 600 / "600" / "not a number" / 3.14 / custom-default; all 7 cases produce the expected fall-back behaviour.

Refs
----
PR review: #3273 (review)

---------

Co-authored-by: leoou331 <leoou@amazon.com>
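For readers who want a concrete picture of the tolerant expiry parsing described in item 4, here is a minimal sketch. It is not the PR's `_parse_presign_expiry`; the function name `parse_presign_expiry` is a stand-in, and the handling of floats (truncation), booleans, and non-positive values (both fall back to the default) are assumptions made for this example.

```python
from typing import Any

DEFAULT_PRESIGN_EXPIRY = 3600  # seconds; the default stated in the commit message


def parse_presign_expiry(value: Any, default: int = DEFAULT_PRESIGN_EXPIRY) -> int:
    """Parse an optional expiry value, falling back to `default` on bad input."""
    # None covers a blank optional number field; bools are rejected (assumption)
    # because int(True) == 1 is unlikely to be what the user meant.
    if value is None or isinstance(value, bool):
        return default
    if isinstance(value, str):
        value = value.strip()
        if not value:
            return default
    try:
        seconds = int(value)  # int("600") -> 600, int(3.14) -> 3
    except (TypeError, ValueError, OverflowError):
        return default
    # Assumption: zero or negative lifetimes are treated as invalid.
    return seconds if seconds > 0 else default
```

With this sketch, `parse_presign_expiry(None)` and `parse_presign_expiry("not a number")` both return 3600, while `parse_presign_expiry("600")` returns 600.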
What this PR does

Adds two new builtin tools to the `aws_tools` plugin:

- `s3_file_uploader`: takes a `file` variable from an upstream workflow node (Start file input, LLM output, another tool, etc.) and uploads it to a configurable S3 bucket/key. Optionally returns a presigned `GET` URL.
- `s3_file_download`: takes an `s3://bucket/key` URI and emits the object as a Dify file (via `create_blob_message`) plus structured metadata for downstream nodes.

Why
The existing `s3_operator` is text-only: its input is `text_content: string` and its output is a `string`.

It cannot be wired to a `Start` node's `file` input, nor consume the file output of nodes like `Frame Extractor` / `Nova Canvas`. These two new tools close that gap and keep the same UX as the rest of the plugin (provider-level credentials, optional per-tool overrides, three-language labels).

A typical workflow now looks like:
Files

Both Python modules are self-contained: credential-resolution helpers (`_resolve_aws_credentials`, `_build_boto3_client_kwargs`, `_reset_clients_on_credential_change`) are inlined so this PR does not introduce a shared `utils/` module that the rest of the repo doesn't use.

Tool surface
`s3_file_uploader` (form parameters)

| Parameter | Type | Notes |
| --- | --- | --- |
| `input_file` | `file` | |
| `bucket_name` | `string` | … `s3://`. |
| `key_prefix` | `string` | … `workflow-outputs`. |
| `object_key` | `string` | |
| `aws_region` | `string` | |
| `aws_access_key_id` / `aws_secret_access_key` / `aws_session_token` | `string` | |
| `generate_presign_url` | `boolean` | … `false`. |
| `presign_expiry` | `number` | … `3600` seconds. |

Outputs three messages: `text` = `s3://bucket/key` or presigned URL; `json` = `{bucket_name, object_key, s3_uri, presigned_url?, presign_expiry?}`; no `files`.

`s3_file_download`

| Parameter | Type | Notes |
| --- | --- | --- |
| `s3_uri` | `string` (LLM-fillable) | … `s3://bucket/key`. |
| `aws_region` / `aws_access_key_id` / `aws_secret_access_key` / `aws_session_token` | `string` | |

Outputs `files = [<Dify file>]`, `json = {bucket, key, content_type, content_length, etag, last_modified, s3_uri}`, and a key=value `text` block of the same metadata.

Validation
Static:

- `python -m py_compile` on both `.py` files ✅
- `yaml.safe_load` on both new yaml files and the modified `provider/aws_tools.yaml` ✅
- `extra.python.source` paths resolve to existing files ✅
- `label` / `description` languages match the rest of the plugin (`en_US` / `zh_Hans` / `pt_BR`) ✅

End-to-end (this is from a real run, not a dry validation):

- … `.difypkg` from `plugins/aws_tools/` on this branch.
- … `[Start (file input) -> s3_file_uploader -> s3_file_download -> End]`, pointed at S3 bucket in `cn-northwest-1`.
- … `hello.txt` (244 bytes).
- `status = succeeded`, `total_steps = 4`, `elapsed = 1.0s`
- `uploaded_uri = s3://dify-test-s3-function/pr-validation/hello.txt`
- `downloaded_meta` populated with `bucket / key / content_type / content_length=244 / etag / last_modified / s3_uri`
- … `aws s3 cp` and compared SHA-256: byte-for-byte identical with the local source.

Out of scope
- … `s3_operator` behavior
- … `utils/` module: left for a follow-up if more tools want the helpers
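As a reference for the `s3://bucket/key` input accepted by `s3_file_download`, the sketch below shows one way to split such a URI and derive a non-empty filename even when the key ends in a slash. The function names and the `"download"` fallback are hypothetical choices for this example and may differ from what the tool actually does.

```python
from typing import Tuple

S3_SCHEME = "s3://"


def parse_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key); raise ValueError if malformed."""
    text = uri.strip()
    if not text.startswith(S3_SCHEME):
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # Plain string splitting keeps characters such as '?' and '#' in the key,
    # which a generic URL parser would treat as query/fragment separators.
    bucket, _, key = text[len(S3_SCHEME):].partition("/")
    if not bucket or not key:
        raise ValueError(f"S3 URI must include both bucket and key: {uri!r}")
    return bucket, key


def filename_from_key(key: str, fallback: str = "download") -> str:
    """Return the last path segment of the key, ignoring trailing slashes."""
    name = key.rstrip("/").rsplit("/", 1)[-1]
    # `fallback` is an assumed default for keys made up only of slashes.
    return name or fallback
```

For example, `parse_s3_uri("s3://bucket/foo/")` yields the key `foo/`, and `filename_from_key("foo/")` returns `foo` rather than an empty string.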