This solution demonstrates AWS S3/Lambda integration with the OpenAI speech-to-text API (Whisper model) to transcribe or translate audio files into text. AWS resources are deployed with AWS CDK as infrastructure as code.
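
As a rough illustration of the integration described above, the sketch below shows how a Lambda handler might turn an S3 object key into a plan for the Whisper call. The endpoint paths and the `whisper-1` model name are those of the public OpenAI audio API. Everything else (the helper names `decodeS3Key` and `planWhisperRequest`, the `Mode` type, and the output-key naming scheme) is hypothetical and not taken from this repository.

```typescript
// Hypothetical helper types; not part of this repository.
type Mode = "transcription" | "translation";
type ResponseFormat = "json" | "text" | "srt" | "verbose_json" | "vtt";

interface WhisperRequestPlan {
  url: string;       // OpenAI audio endpoint to call
  model: string;     // Whisper model identifier
  outputKey: string; // key of the result object in the output bucket (assumed naming)
}

// S3 event notifications URL-encode object keys and use "+" for spaces.
function decodeS3Key(rawKey: string): string {
  return decodeURIComponent(rawKey.replace(/\+/g, " "));
}

// Assumed mapping from response format to output file extension.
const EXTENSIONS: Record<ResponseFormat, string> = {
  json: "json",
  verbose_json: "json",
  text: "txt",
  srt: "srt",
  vtt: "vtt",
};

function planWhisperRequest(
  rawKey: string,
  mode: Mode,
  format: ResponseFormat
): WhisperRequestPlan {
  const key = decodeS3Key(rawKey);
  // Strip the extension of the final path segment, if there is one.
  const base = key.replace(/\.[^./]+$/, "");
  const endpoint = mode === "translation" ? "translations" : "transcriptions";
  return {
    url: `https://api.openai.com/v1/audio/${endpoint}`,
    model: "whisper-1",
    outputKey: `${base}.${EXTENSIONS[format]}`,
  };
}

// Example: an audio file uploaded to the translation input bucket.
const plan = planWhisperRequest("recordings/my+audio.mp3", "translation", "json");
console.log(plan);
```

The planning step is kept free of AWS and HTTP calls so it can be checked in isolation; the actual download from S3, the multipart upload to OpenAI, and the write to the output bucket would wrap around it.
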
- Upload an audio file to the transcription input S3 bucket.

- The Lambda function sends the file to the OpenAI API (Whisper) to create a transcription and uploads the result as a text file to the transcription output S3 bucket.

- Transcription result:

{
  "text": "He doesn't belong to you, and I don't see how you have anything to do with what is be his power yet. He's heaped us all in that from the stage to you. Be fine."
}

- Upload an audio file to the translation input S3 bucket.

- The Lambda function sends the file to the OpenAI API (Whisper) to create a translation and uploads the result as a text file to the translation output S3 bucket.

- Translation result:

{
  "text": "Hi, I'm Ramón Langa, and I'm a beautiful person. In fact, I bring home everything I win. Besides, I'm the best speaker in the world. That's it."
}

This solution has the following configurable variables, defined in the .env file. Update them according to your needs.

| Name | Value | Description |
|---|---|---|
| REGION | us-east-1 | AWS region where your infrastructure is deployed |
| OPENAI_API_KEY (required) | your_api_key | OpenAI API key used to call the speech-to-text endpoints |
| OPENAI_API_RESPONSE_FORMAT | json | OpenAI API response format. Supported values: json, text, srt, verbose_json, vtt |
| INPUT_TRANSCRIPTION_BUCKET_NAME | your-transcription-input-bucket | S3 bucket where the source audio file is uploaded |
| OUTPUT_TRANSCRIPTION_BUCKET_NAME | your-transcription-output-bucket | S3 bucket where the result transcription text file is uploaded |
| INPUT_TRANSLATION_BUCKET_NAME | your-translation-input-bucket | S3 bucket where the source audio file is uploaded |
| OUTPUT_TRANSLATION_BUCKET_NAME | your-translation-output-bucket | S3 bucket where the result translation text file is uploaded |
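
The variables in the table above could be validated at startup with something like the following minimal sketch. The `WhisperConfig` shape and the `requireVar` and `loadConfig` helpers are hypothetical names introduced for illustration. The sketch assumes that the values shown for `REGION` and `OPENAI_API_RESPONSE_FORMAT` act as defaults, that all four bucket names must be set, and that the translation output bucket variable is named `OUTPUT_TRANSLATION_BUCKET_NAME`; none of this is confirmed by the repository.

```typescript
// Hypothetical config shape mirroring the .env variables in the table above.
interface WhisperConfig {
  region: string;
  openAiApiKey: string;
  responseFormat: string;
  inputTranscriptionBucket: string;
  outputTranscriptionBucket: string;
  inputTranslationBucket: string;
  outputTranslationBucket: string;
}

type Env = Record<string, string | undefined>;

// Response formats listed as supported in the table above.
const SUPPORTED_FORMATS = ["json", "text", "srt", "verbose_json", "vtt"];

// Return a non-empty variable or fail with a message that names it.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required variable: ${name}`);
  }
  return value;
}

function loadConfig(env: Env): WhisperConfig {
  // Assumed default: "json", the value shown in the table.
  const responseFormat = env["OPENAI_API_RESPONSE_FORMAT"] ?? "json";
  if (SUPPORTED_FORMATS.indexOf(responseFormat) === -1) {
    throw new Error(`Unsupported OPENAI_API_RESPONSE_FORMAT: ${responseFormat}`);
  }
  return {
    // Assumed default: "us-east-1", the value shown in the table.
    region: env["REGION"] ?? "us-east-1",
    openAiApiKey: requireVar(env, "OPENAI_API_KEY"),
    responseFormat,
    // Assumption: this sketch treats every bucket name as mandatory.
    inputTranscriptionBucket: requireVar(env, "INPUT_TRANSCRIPTION_BUCKET_NAME"),
    outputTranscriptionBucket: requireVar(env, "OUTPUT_TRANSCRIPTION_BUCKET_NAME"),
    inputTranslationBucket: requireVar(env, "INPUT_TRANSLATION_BUCKET_NAME"),
    outputTranslationBucket: requireVar(env, "OUTPUT_TRANSLATION_BUCKET_NAME"),
  };
}
```

In a Node.js process this could be called as `loadConfig(process.env)` once the .env file has been loaded, so that a missing key or a mistyped response format fails before any AWS resources are touched.
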

Prerequisites:

- Node.js
- AWS CDK CLI (run `npm install -g aws-cdk`)
- AWS credentials

If you are not familiar with AWS CDK, please check the getting started guide.

Run the following command to deploy AWS resources:

    cdk deploy

Run the following command to clean up AWS resources:

    cdk destroy

