Google Cloud TTS Setup Guide

Overview

Your language learning app now uses Google Cloud Text-to-Speech for realistic-sounding Spanish voices instead of the browser's robotic SpeechSynthesisUtterance API.

Setup Steps

1. Create Google Cloud Project

  1. Go to Google Cloud Console
  2. Create a new project or select an existing one
  3. Note your project ID (this project uses language-learning-app-476110)

2. Enable Text-to-Speech API

  1. In the Google Cloud Console, go to "APIs & Services" > "Library"
  2. Search for "Text-to-Speech API"
  3. Click on it and press "Enable"

3. Create Service Account

  1. Go to "IAM & Admin" > "Service Accounts"
  2. Click "Create Service Account"
  3. Give it a name like "tts-service" (the account email then looks like tts-service@language-learning-app-476110.iam.gserviceaccount.com)
  4. Click "Create and Continue"
  5. Grant it the "Cloud Text-to-Speech API User" role
  6. Click "Done"

4. Download Service Account Key

  1. Click on your newly created service account
  2. Go to the "Keys" tab
  3. Click "Add Key" > "Create new key"
  4. Choose "JSON" format
  5. Download the key file and save it securely

5. Set Environment Variables

Create a .env file in your backend directory with:

# Option 1: Use service account key file
GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/service-account-key.json

# Option 2: Use project ID (if using default credentials)
GOOGLE_CLOUD_PROJECT_ID=your-project-id
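
The two options above could be checked at backend startup with something like the following. This is a minimal sketch, not the app's actual code: the helper name resolveCredentialSource, the rule that the key file takes precedence, and the error message are all assumptions made for illustration.

```typescript
// Hypothetical helper: decides which credential option from the .env file is active.
type Env = Record<string, string | undefined>;

type CredentialSource =
  | { kind: "keyFile"; path: string }
  | { kind: "defaultCredentials"; projectId: string };

function resolveCredentialSource(env: Env): CredentialSource {
  const keyFile = env.GOOGLE_APPLICATION_CREDENTIALS?.trim();
  if (keyFile) {
    // Option 1: an explicit service account key file (assumed to take precedence).
    return { kind: "keyFile", path: keyFile };
  }

  const projectId = env.GOOGLE_CLOUD_PROJECT_ID?.trim();
  if (projectId) {
    // Option 2: default credentials, identified by project ID.
    return { kind: "defaultCredentials", projectId };
  }

  // Neither variable is set: fail early instead of at the first TTS request.
  throw new Error(
    "Set GOOGLE_APPLICATION_CREDENTIALS or GOOGLE_CLOUD_PROJECT_ID in the backend .env file"
  );
}
```

Calling resolveCredentialSource(process.env) once the .env file has been loaded surfaces a missing variable before any audio is requested.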

6. Test the Integration

  1. Start your backend: cd backend && npm run dev
  2. Start your frontend: cd frontend && npm run dev
  3. Go to the chat page and send a message
  4. You should hear a much more realistic Spanish voice!

Features

✅ What's New

  • Realistic Spanish voices from Google Cloud TTS
  • Automatic fallback to browser SpeechSynthesis if TTS API fails
  • Error handling with proper logging
  • Async audio playback that doesn't block the UI
  • Emoji filtering (same as before)
  • Configurable voice settings (rate, pitch, gender)

🔧 Configuration Options

The TTS service supports these parameters:

  • languageCode: Language (default: 'es-ES')
  • voiceName: Specific voice to use
  • ssmlGender: 'MALE', 'FEMALE', or 'NEUTRAL'
  • speakingRate: Speech speed (0.25 to 4.0, default: 0.9)
  • pitch: Voice pitch (-20.0 to 20.0, default: 0.0)
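
To make the defaults and ranges above concrete, here is a small sketch of how the parameters might be normalized before a request is sent. The names TtsOptions and resolveTtsOptions are hypothetical, and clamping out-of-range values (rather than rejecting them) is an assumption, not necessarily what the service does.

```typescript
type SsmlGender = "MALE" | "FEMALE" | "NEUTRAL";

// Caller-supplied options; every field is optional.
interface TtsOptions {
  languageCode?: string;
  voiceName?: string;
  ssmlGender?: SsmlGender;
  speakingRate?: number;
  pitch?: number;
}

// Options after defaults and range limits have been applied.
interface ResolvedTtsOptions {
  languageCode: string;
  voiceName: string | undefined;
  ssmlGender: SsmlGender | undefined;
  speakingRate: number;
  pitch: number;
}

const clamp = (value: number, min: number, max: number): number =>
  Math.min(max, Math.max(min, value));

function resolveTtsOptions(options: TtsOptions = {}): ResolvedTtsOptions {
  return {
    languageCode: options.languageCode ?? "es-ES",
    voiceName: options.voiceName,
    ssmlGender: options.ssmlGender,
    // Documented range 0.25 to 4.0, default 0.9.
    speakingRate: clamp(options.speakingRate ?? 0.9, 0.25, 4.0),
    // Documented range -20.0 to 20.0, default 0.0.
    pitch: clamp(options.pitch ?? 0.0, -20.0, 20.0),
  };
}
```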

💰 Pricing

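  • Free tier: 1 million characters per month for WaveNet and Neural2 voices; Standard voices such as es-ES-Standard-A get 4 million
  • Paid: $4.00 per 1 million characters for Standard voices after the free tier; WaveNet and Neural2 voices cost $16.00 per 1 million
  • Perfect for language learning apps!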
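
The billing arithmetic is simple enough to sketch. The function below is illustrative only: estimateMonthlyCostUsd is a hypothetical name, and the free allowance and rate are passed in as arguments because they depend on the voice type and can change.

```typescript
// Estimated monthly charge in USD: only characters beyond the free allowance are billed.
function estimateMonthlyCostUsd(
  charactersUsed: number,
  freeCharacters: number,
  usdPerMillion: number
): number {
  const billable = Math.max(0, charactersUsed - freeCharacters);
  return (billable / 1_000_000) * usdPerMillion;
}
```

For example, with a 1 million character allowance and a $4.00 rate, estimateMonthlyCostUsd(2_500_000, 1_000_000, 4.0) bills 1.5 million characters and returns 6.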

API Endpoints

POST /tts/synthesize

Converts text to speech and returns MP3 audio.

Request body:

{
  "text": "Hola, ¿cómo estás?",
  "languageCode": "es-ES",
  "voiceName": "es-ES-Standard-A",
  "ssmlGender": "FEMALE",
  "speakingRate": 0.9,
  "pitch": 0.0
}
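
A client might assemble this request as follows. This is a sketch under assumptions: the names SynthesizeRequest and buildSynthesizeInit are hypothetical, only text is treated as required (the other fields have documented defaults), the JSON content type is inferred from the body above, and rejecting blank text is a choice made here.

```typescript
interface SynthesizeRequest {
  text: string;
  languageCode?: string;
  voiceName?: string;
  ssmlGender?: "MALE" | "FEMALE" | "NEUTRAL";
  speakingRate?: number;
  pitch?: number;
}

// Shape accepted as the second argument of fetch().
interface JsonPostInit {
  method: "POST";
  headers: { "Content-Type": string };
  body: string;
}

function buildSynthesizeInit(request: SynthesizeRequest): JsonPostInit {
  if (request.text.trim() === "") {
    // Assumption: there is no point sending blank text to the backend.
    throw new Error("text must not be empty");
  }
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // Fields left undefined are omitted by JSON.stringify.
    body: JSON.stringify(request),
  };
}
```

The result can be passed to fetch together with the backend URL followed by /tts/synthesize. How the MP3 is returned (raw bytes or encoded inside JSON) is not specified here, so check the backend before decoding the response.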

GET /tts/voices

Returns available voices for a language.

Query parameters:

  • languageCode: Language code (default: 'es-ES')
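
Building the URL for this endpoint could look like the sketch below. buildVoicesUrl is a hypothetical helper, and the base URL is whatever address your backend listens on; the localhost value in the usage note is only a placeholder.

```typescript
// Builds the GET /tts/voices URL, defaulting to Spanish (Spain) as documented.
function buildVoicesUrl(baseUrl: string, languageCode: string = "es-ES"): string {
  // Drop trailing slashes so the path is not doubled up.
  const base = baseUrl.replace(/\/+$/, "");
  return `${base}/tts/voices?languageCode=${encodeURIComponent(languageCode)}`;
}
```

For instance, buildVoicesUrl("http://localhost:3000/") yields http://localhost:3000/tts/voices?languageCode=es-ES.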

Troubleshooting

Common Issues

  1. "TTS API error: 401" - Check your service account credentials
  2. "TTS API error: 403" - Make sure the Text-to-Speech API is enabled and billing is active for the project
  3. "TTS API error: 429" - You've exceeded a quota or rate limit; check usage in the Google Cloud Console
  4. Audio not playing - Check browser console for errors
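
If you want these hints surfaced in logs or in the UI, the status code can be pulled out of the error message. The sketch below assumes messages follow the "TTS API error: <status>" format shown above; explainTtsError is a hypothetical helper, not part of the app.

```typescript
// Maps a "TTS API error: <status>" message to a troubleshooting hint.
function explainTtsError(message: string): string {
  const match = /TTS API error: (\d{3})/.exec(message);
  const status = match ? Number(match[1]) : undefined;

  switch (status) {
    case 401:
      return "Check your service account credentials.";
    case 403:
      return "Make sure the Text-to-Speech API is enabled for the project.";
    case 429:
      return "Usage limit exceeded; check quotas and usage in the Google Cloud Console.";
    default:
      return "Unrecognized error; check the backend logs and the browser console.";
  }
}
```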

Fallback Behavior

If the Google Cloud TTS API fails for any reason, the app automatically falls back to the browser's built-in speech synthesis (SpeechSynthesisUtterance), so text-to-speech keeps working in any browser that supports it.
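
The fallback pattern can be sketched as below. This is an illustration of the idea, not the app's actual code: speakWithFallback and its parameters are hypothetical, and the two speak functions are injected so the logic does not depend on any particular browser or backend.

```typescript
type Speak = (text: string) => Promise<void>;

// Tries cloud TTS first; on any failure, reports the error and uses the browser voice.
async function speakWithFallback(
  text: string,
  cloudSpeak: Speak,
  browserSpeak: Speak,
  onFailure: (error: unknown) => void = () => {}
): Promise<"cloud" | "browser"> {
  try {
    await cloudSpeak(text);
    return "cloud";
  } catch (error) {
    // Hand the error to the caller's logger before falling back.
    onFailure(error);
    await browserSpeak(text);
    return "browser";
  }
}
```

In the browser, browserSpeak would typically wrap window.speechSynthesis.speak(new SpeechSynthesisUtterance(text)), and onFailure could be console.warn. Because the call is awaited inside an async function, playback does not block the UI.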

Next Steps

  • Test different Spanish voices by calling the /tts/voices endpoint
  • Consider adding voice selection UI for users
  • Monitor usage to stay within free tier limits
  • Add caching for frequently used phrases