2 changes: 2 additions & 0 deletions README.md
@@ -79,6 +79,8 @@ This repository provides the source code for three plugins in [Dify](https://git
| Transcribe Tool | SAAS | AWS transcribe service tool (ASR) | | [river xie](chuanxie@amazon.com) |
| Bedrock Retriever | PAAS | Amazon Bedrock knowledge base retrieval tool | | [ychchen](ychchen@amazon.com) |
| S3 Operator | SAAS | Read and write S3 bucket content, can return presigned URLs | | [ybalbert](ybalbert@amazon.com) |
| S3 File Uploader | SAAS | Upload a workflow file (file variable) to S3 and optionally return a presigned URL | | [leoou](leoou@amazon.com) |
| S3 File Download | SAAS | Download an S3 object as a Dify file variable for downstream nodes | | [leoou](leoou@amazon.com) |
| AWS Bedrock Nova Canvas | SAAS | Generate images based on Amazon Nova Canvas | | [alexwuu](alexwuu@amazon.com) |
| AWS Bedrock Nova Reel | SAAS | Generate videos based on Amazon Nova Reel | | [alexwuu](alexwuu@amazon.com) |
| OpenSearch Knn Retriever | PAAS | Retrieve data from OpenSearch using KNN method | [Notebook](https://github.com/aws-samples/dify-aws-tool/tree/main/notebook/search_img_by_img) | [ybalbert](ybalbert@amazon.com) |
2 changes: 2 additions & 0 deletions README_JA.md
@@ -79,6 +79,8 @@
| Transcribe Tool | SAAS | AWS transcribeサービスツール (ASR) | | [river xie](chuanxie@amazon.com) |
| Bedrock Retriever | PAAS | Amazon Bedrockナレッジベース検索ツール | | [ychchen](ychchen@amazon.com) |
| S3 Operator | SAAS | S3バケットのコンテンツの読み書き、署名付きURLの返却が可能 | | [ybalbert](ybalbert@amazon.com) |
| S3 File Uploader | SAAS | ワークフロー内の file 変数を S3 にアップロードし、必要に応じて署名付きURLを返却 | | [leoou](leoou@amazon.com) |
| S3 File Download | SAAS | S3 オブジェクトを Dify の file 変数として取得し、下流ノードに渡す | | [leoou](leoou@amazon.com) |
| AWS Bedrock Nova Canvas | SAAS | Amazon Nova Canvasに基づく画像生成 | | [alexwuu](alexwuu@amazon.com) |
| AWS Bedrock Nova Reel | SAAS | Amazon Nova Reelに基づく動画生成 | | [alexwuu](alexwuu@amazon.com) |
| OpenSearch Knn Retriever | PAAS | KNN手法を使用してOpenSearchからデータを検索 | [Notebook](https://github.com/aws-samples/dify-aws-tool/tree/main/notebook/search_img_by_img) | [ybalbert](ybalbert@amazon.com) |
2 changes: 2 additions & 0 deletions README_ZH.md
@@ -78,6 +78,8 @@
| Transcribe Tool | SAAS | AWS transcribe service tool (ASR) | | [river xie](chuanxie@amazon.com) |
| Bedrock Retriever | PAAS | Amazon Bedrock知识库检索工具 | | [ychchen](ychchen@amazon.com) |
| S3 Operator | SAAS | 读写S3中bucket的内容,可以返回presignURL | | [ybalbert](ybalbert@amazon.com) |
| S3 File Uploader | SAAS | 将工作流中的 file 变量上传到 S3, 可选返回 presignURL | | [leoou](leoou@amazon.com) |
| S3 File Download | SAAS | 从 S3 下载对象为 Dify file 变量, 供下游节点使用 | | [leoou](leoou@amazon.com) |
| AWS Bedrock Nova Canvas | SAAS | 基于Amazon Nova Canvas生成图像 | | [alexwuu](alexwuu@amazon.com) |
| AWS Bedrock Nova Reel | SAAS | 基于Amazon Nova Reel生成视频 | | [alexwuu](alexwuu@amazon.com) |
| OpenSearch Knn Retriever | PAAS | 用KNN方法从OpenSearch召回数据 | [Notebook](https://github.com/aws-samples/dify-aws-tool/tree/main/notebook/search_img_by_img) | [ybalbert](ybalbert@amazon.com) |
2 changes: 2 additions & 0 deletions plugins/aws_tools/provider/aws_tools.yaml
@@ -17,6 +17,8 @@ tools:
- tools/bedrock_retrieve_and_generate.yaml
- tools/opensearch_knn_search.yaml
- tools/s3_operator.yaml
- tools/s3_file_uploader.yaml
- tools/s3_file_download.yaml
- tools/apply_guardrail.yaml
- tools/nova_canvas.yaml
- tools/transcribe_asr.yaml
159 changes: 159 additions & 0 deletions plugins/aws_tools/tools/s3_file_download.py
@@ -0,0 +1,159 @@
"""
Location: tools/s3_file_download.py
Purpose: Download an S3 object as a Dify file variable for downstream workflow nodes.

This tool complements ``s3_file_uploader``. It accepts an ``s3://bucket/key`` URI,
fetches the object via boto3, and emits the binary as a Dify file plus metadata as
both JSON and a ``key: value`` text block, so downstream nodes can treat it as a file
input or read individual fields like ``content_length`` / ``etag``.
"""

from __future__ import annotations

from collections.abc import Generator
from typing import Any, Optional
from urllib.parse import urlparse

import boto3
from botocore.exceptions import ClientError

from dify_plugin import Tool
from dify_plugin.entities.tool import ToolInvokeMessage

# ---------------------------------------------------------------------------
# Inline credential helpers (kept self-contained on purpose so this tool does
# not rely on a shared utils/ module that the rest of this repo does not use).
# ---------------------------------------------------------------------------


def _resolve_aws_credentials(
    tool: Any, tool_parameters: dict[str, Any]
) -> dict[str, Optional[str]]:
    runtime_credentials = getattr(getattr(tool, "runtime", None), "credentials", {}) or {}

    aws_access_key_id = tool_parameters.get("aws_access_key_id") or runtime_credentials.get(
        "aws_access_key_id"
    )
    aws_secret_access_key = tool_parameters.get("aws_secret_access_key") or runtime_credentials.get(
        "aws_secret_access_key"
    )
    aws_session_token = tool_parameters.get("aws_session_token")
    aws_region = (
        tool_parameters.get("aws_region") or runtime_credentials.get("aws_region") or "us-east-1"
    )

    return {
        "aws_access_key_id": aws_access_key_id,
        "aws_secret_access_key": aws_secret_access_key,
        "aws_session_token": aws_session_token,
        "aws_region": aws_region,
    }


def _build_boto3_client_kwargs(credentials: dict[str, Optional[str]]) -> dict[str, Any]:
    kwargs: dict[str, Any] = {}
    if credentials.get("aws_region"):
        kwargs["region_name"] = credentials["aws_region"]
    if credentials.get("aws_access_key_id") and credentials.get("aws_secret_access_key"):
        kwargs["aws_access_key_id"] = credentials["aws_access_key_id"]
        kwargs["aws_secret_access_key"] = credentials["aws_secret_access_key"]
        if credentials.get("aws_session_token"):
            kwargs["aws_session_token"] = credentials["aws_session_token"]
    return kwargs


# ---------------------------------------------------------------------------
# Tool implementation
# ---------------------------------------------------------------------------


def _build_metadata_text(metadata: dict[str, Any]) -> str:
    """Render a simple ``key: value`` block for human-readable downstream display."""
    lines = []
    for key, value in metadata.items():
        if value is None:
            continue
        lines.append(f"{key}: {value}")
    return "\n".join(lines)


class S3FileDownload(Tool):
    """Download an S3 object as a Dify file variable.

    The boto3 client is created as a local variable inside ``_invoke`` (instead
    of cached on ``self``) to keep this tool safe across concurrent workflow
    executions: tool instances may be reused by the plugin runtime, and a
    cached client tied to one tenant's credentials must never leak into
    another invocation.
    """

    def _invoke(self, tool_parameters: dict[str, Any]) -> Generator[ToolInvokeMessage, None, None]:
        """Download an S3 object and emit it as a Dify file plus metadata."""
        try:
            credentials = _resolve_aws_credentials(self, tool_parameters)
            client_kwargs = _build_boto3_client_kwargs(credentials)
            s3_client = boto3.client("s3", **client_kwargs)
        except Exception as exc:  # pragma: no cover - boto3 init errors
            yield self.create_text_message(f"Failed to initialize AWS client: {exc}")
            return

        s3_uri = tool_parameters.get("s3_uri")
        if not s3_uri:
            yield self.create_text_message("s3_uri parameter is required")
            return

        parsed_uri = urlparse(s3_uri)
        bucket = parsed_uri.netloc
        key = parsed_uri.path.lstrip("/")
        # Reject URIs with no object key (e.g. s3://bucket/), whose path is just "/".
        if parsed_uri.scheme != "s3" or not bucket or not key:
            yield self.create_text_message("Invalid S3 URI format. Use s3://bucket/key")
            return

        try:
            response = s3_client.get_object(Bucket=bucket, Key=key)
            file_bytes = response["Body"].read()
        except ClientError as exc:
            error_code = exc.response.get("Error", {}).get("Code")
            if error_code == "NoSuchBucket":
                yield self.create_text_message(f"Bucket '{bucket}' does not exist")
                return
            if error_code == "NoSuchKey":
                yield self.create_text_message(
                    f"Object '{key}' does not exist in bucket '{bucket}'"
                )
                return
            error_message = exc.response.get("Error", {}).get("Message", str(exc))
            yield self.create_text_message(f"Failed to download S3 object: {error_message}")
            return
        except Exception as exc:
            yield self.create_text_message(f"Failed to download S3 object: {exc}")
            return

        # Tolerate trailing slashes in the key (e.g. s3://bucket/path/) so the
        # filename never ends up empty.
        filename = key.rstrip("/").split("/")[-1] if key else "downloaded_file"
        if not filename:
            filename = "downloaded_file"
        content_type = response.get("ContentType") or "application/octet-stream"
        metadata_dict = {
            "bucket": bucket,
            "key": key,
            "content_type": content_type,
            "content_length": response.get("ContentLength"),
            "etag": response.get("ETag"),
            "last_modified": (
                response.get("LastModified").isoformat() if response.get("LastModified") else None
            ),
            "s3_uri": s3_uri,
        }
        metadata_text = _build_metadata_text(metadata_dict)

        blob_meta = {
            "filename": filename,
            "mime_type": content_type,
            "s3_uri": s3_uri,
        }
        yield self.create_blob_message(file_bytes, meta=blob_meta)
        yield self.create_json_message(metadata_dict)
        yield self.create_text_message(metadata_text or f"bucket: {bucket}\nkey: {key}")
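
The URI handling in `_invoke` above can be exercised in isolation. The following is a minimal sketch, not code from this PR: the helper names `split_s3_uri` and `derive_filename` are hypothetical, and the sketch assumes that a URI without an object key (such as `s3://bucket/`) should be rejected before any S3 call is made.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri: str) -> tuple[str, str]:
    """Split ``s3://bucket/key`` into (bucket, key); raise ValueError if malformed.

    Hypothetical helper for illustration; it mirrors the checks in ``_invoke``.
    """
    parsed = urlparse(s3_uri)
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not parsed.netloc or not key:
        raise ValueError("Invalid S3 URI format. Use s3://bucket/key")
    return parsed.netloc, key


def derive_filename(key: str, default: str = "downloaded_file") -> str:
    """Use the last path segment of the key, tolerating trailing slashes."""
    return key.rstrip("/").split("/")[-1] or default


bucket, key = split_s3_uri("s3://my-bucket/reports/2024/summary.pdf")
filename = derive_filename(key)
```

One caveat of this approach: `urlparse` splits off query strings and fragments, so a key containing `?` or `#` would lose everything from that character onward.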
78 changes: 78 additions & 0 deletions plugins/aws_tools/tools/s3_file_download.yaml
@@ -0,0 +1,78 @@
identity:
  name: s3_file_download
  author: AWS
  label:
    en_US: AWS S3 File Download
    zh_Hans: AWS S3 文件下载
    pt_BR: AWS S3 File Download
description:
  human:
    en_US: Download files from Amazon S3.
    zh_Hans: 从 Amazon S3 下载文件。
    pt_BR: Faz download de arquivos do Amazon S3.
  llm: Download files from Amazon S3 so workflows can access the binary and metadata directly.
parameters:
  - name: aws_access_key_id
    type: string
    required: false
    label:
      en_US: AWS Access Key ID
      zh_Hans: AWS Access Key ID
      pt_BR: AWS Access Key ID
    human_description:
      en_US: Override the provider Access Key ID for this tool if needed.
      zh_Hans: 当需要覆盖默认的 Access Key ID 时使用。
      pt_BR: Sobrescreve o Access Key ID padrão da provider quando necessário.
    form: form
  - name: aws_secret_access_key
    type: string
    required: false
    label:
      en_US: AWS Secret Access Key
      zh_Hans: AWS Secret Access Key
      pt_BR: AWS Secret Access Key
    human_description:
      en_US: Override the provider Secret Access Key for this tool if needed.
      zh_Hans: 当需要覆盖默认的 Secret Access Key 时使用。
      pt_BR: Sobrescreve o Secret Access Key padrão da provider quando necessário.
    form: form
  - name: aws_session_token
    type: string
    required: false
    label:
      en_US: AWS Session Token
      zh_Hans: AWS Session Token
      pt_BR: AWS Session Token
    human_description:
      en_US: AWS session token for temporary credentials (STS). Only supported in tool parameters, not at provider level.
      zh_Hans: 临时凭证(STS)的 session token。仅支持在工具参数中传入,provider 级别不支持。
      pt_BR: Token de sessão da AWS para credenciais temporárias (STS). Só é suportado nos parâmetros da ferramenta, não no nível da provider.
    form: form
  - name: aws_region
    type: string
    required: false
    label:
      en_US: AWS Region
      zh_Hans: AWS 区域
      pt_BR: AWS Region
    human_description:
      en_US: Override the default AWS Region for this tool.
      zh_Hans: 为该工具指定 AWS 区域,覆盖 provider 默认值。
      pt_BR: Sobrescreve a região AWS padrão para esta ferramenta.
    form: form
  - name: s3_uri
    type: string
    required: true
    label:
      en_US: S3 URI
      zh_Hans: S3 URI
      pt_BR: S3 URI
    human_description:
      en_US: Target object in s3://bucket/key format.
      zh_Hans: 以 s3://bucket/key 形式指定下载对象。
      pt_BR: Objeto alvo no formato s3://bucket/key.
    llm_description: S3 URI of the object to read.
    form: llm
extra:
  python:
    source: tools/s3_file_download.py
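
The optional `aws_access_key_id`, `aws_secret_access_key`, and `aws_region` parameters above override the provider-level credentials, with `us-east-1` as the fallback region in `s3_file_download.py`. A minimal sketch of that precedence follows; the function name `resolve_setting` and the sample values are hypothetical and only illustrate the lookup order, not the plugin runtime itself.

```python
from typing import Any, Optional


def resolve_setting(
    name: str,
    tool_parameters: dict[str, Any],
    provider_credentials: dict[str, Any],
    default: Optional[str] = None,
) -> Optional[str]:
    """Return the tool-level value if set, else the provider-level value, else the default.

    Empty strings count as "not set", matching the ``or`` chaining used in the tool.
    """
    return tool_parameters.get(name) or provider_credentials.get(name) or default


# Hypothetical sample inputs: the provider configures a key and a region,
# while the tool invocation overrides only the region.
provider = {"aws_access_key_id": "PROVIDER_KEY", "aws_region": "eu-west-1"}
params = {"aws_region": "ap-northeast-1", "s3_uri": "s3://bucket/key"}

region = resolve_setting("aws_region", params, provider, default="us-east-1")
access_key = resolve_setting("aws_access_key_id", params, provider)
```

The session token is the exception: the tool reads `aws_session_token` from the tool parameters only, as its description states, so no provider-level fallback applies to it.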