SpeechBot

A real-time voice-enabled chatbot that uses speech recognition, Google's Gemini Pro AI, and text-to-speech synthesis for natural conversations.

Features

🎤 Real-time speech recognition
🤖 AI-powered responses using Google's Gemini Pro
🔊 Natural text-to-speech output
💬 Interactive chat interface
⚡ Real-time response handling

Tech Stack

Frontend

HTML5
CSS3
JavaScript (ES6+)
Web Speech API
- SpeechRecognition
- SpeechSynthesis

Backend

Python 3.x
Flask
Google Generative AI (Gemini Pro)
Flask-CORS

Prerequisites

Python 3.x
Node.js and npm
Google AI API key
Modern web browser (Chrome recommended)

Installation

Clone the repository

git clone <repository-url>
cd voice-chatbot

Backend Setup

# Create and activate virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Environment Configuration

# Create .env file in the root directory
echo "GOOGLE_AI_API_KEY=your_api_key_here" > .env
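
For reference, the backend has to read this key at startup. The sketch below shows one way to do that with only the standard library. `load_env_file` and `get_api_key` are hypothetical helper names, and the actual `app.py` may load the file differently (for example with a package such as python-dotenv).

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines from a .env file into a dict.

    Hypothetical helper: blank lines and lines starting with '#' are
    skipped, and surrounding quotes around the value are removed.
    """
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path: str = ".env") -> str:
    """Return GOOGLE_AI_API_KEY from the environment, falling back to .env."""
    key = os.environ.get("GOOGLE_AI_API_KEY") or load_env_file(path).get("GOOGLE_AI_API_KEY")
    if not key:
        raise RuntimeError("GOOGLE_AI_API_KEY is not set; check your .env file")
    return key
```

Failing fast with a clear message when the key is missing makes the "API Key Issues" case easier to diagnose than a late authentication error.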

Frontend Setup

# Install frontend dependencies (including Vite) from package.json
npm install

Running the Application

Start the Backend Server
```
python app.py
```
The server will start on http://localhost:5000

Start the Frontend Development Server
```
npm run dev
```
The application will be available at http://localhost:5173

Project Structure

SpeechBot/
├── frontend/
│   ├── index.html          # Main HTML file
│   ├── style.css           # Styles
│   └── main.js             # Frontend logic
├── backend/
│   ├── app.py              # Flask server
│   └── requirements.txt    # Python dependencies
├── .env                    # Environment variables
└── README.md               # Project documentation

Components

1. Speech Recognition

Uses Web Speech API's SpeechRecognition
Continuous listening mode
Error handling for unsupported browsers

2. AI Processing

Integrates with Google's Gemini Pro
Processes natural language input
Generates contextual responses
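
The steps above can be sketched as a small function that takes the transcribed text, sends it to the model, and returns a JSON-ready reply. This is a minimal sketch, not the project's actual `app.py`: `build_reply` and `gemini_generate` are hypothetical names, and the model call is passed in as a parameter so the logic can be exercised without network access. The Gemini call assumes the `google-generativeai` package is installed and that `GOOGLE_AI_API_KEY` is set in the environment.

```python
import os
from typing import Callable


def gemini_generate(prompt: str) -> str:
    """Send a prompt to Gemini Pro and return the text of the reply.

    Assumes the google-generativeai package is installed; it is imported
    lazily so the rest of this sketch runs without it.
    """
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GOOGLE_AI_API_KEY"])
    model = genai.GenerativeModel("gemini-pro")
    return model.generate_content(prompt).text


def build_reply(message: str, generate: Callable[[str], str] = gemini_generate) -> dict:
    """Turn recognized speech into a JSON-serializable response payload."""
    text = message.strip()
    if not text:
        # Nothing was recognized, so skip the model call entirely.
        return {"ok": False, "error": "Empty message"}
    return {"ok": True, "reply": generate(text).strip()}
```

In a Flask route, the returned dict could be passed to `jsonify` and sent back to the frontend, which then displays and speaks the `reply` field.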

3. Text-to-Speech

Natural voice synthesis
Configurable speech parameters
Queue management for responses

Usage

Click the "Start Listening" button
Speak your message
Wait for the AI response
The response will be displayed and spoken back

Error Handling

The application handles various error scenarios:

Speech recognition errors
Network connectivity issues
API errors
Browser compatibility issues
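
On the backend, one way to cover the network and API cases is to map exceptions to an HTTP status code and a message the frontend can display. The sketch below is illustrative only: `error_response`, `safe_call`, and the status-code choices are assumptions, not the project's confirmed behavior.

```python
from typing import Callable, Tuple


def error_response(exc: Exception) -> Tuple[dict, int]:
    """Map an exception to a (JSON body, HTTP status) pair."""
    if isinstance(exc, ValueError):
        return {"error": f"Invalid request: {exc}"}, 400
    if isinstance(exc, TimeoutError):
        return {"error": "The AI service timed out"}, 504
    if isinstance(exc, ConnectionError):
        return {"error": "Could not reach the AI service"}, 502
    # Anything unexpected: avoid leaking internal details to the client.
    return {"error": "Internal server error"}, 500


def safe_call(func: Callable[..., str], *args) -> Tuple[dict, int]:
    """Run func and return either its reply or a mapped error response."""
    try:
        return {"reply": func(*args)}, 200
    except Exception as exc:
        return error_response(exc)
```

Returning a consistent `{"error": ...}` shape lets the frontend show one kind of error message regardless of which layer failed.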

Security Considerations

API key protection using environment variables
CORS configuration for secure communication
Input validation on both frontend and backend
No persistent data storage
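
As an illustration of backend input validation, the sketch below checks a parsed JSON body before it reaches the model. The `message` field name, the `validate_message` helper, and the length limit are all assumptions made for this example.

```python
MAX_MESSAGE_LENGTH = 2000  # Assumed limit; adjust to your needs


def validate_message(payload: object) -> str:
    """Return the cleaned message text, or raise ValueError if it is invalid."""
    if not isinstance(payload, dict):
        raise ValueError("JSON object expected")
    message = payload.get("message")
    if not isinstance(message, str):
        raise ValueError("'message' must be a string")
    cleaned = message.strip()
    if not cleaned:
        raise ValueError("'message' must not be empty")
    if len(cleaned) > MAX_MESSAGE_LENGTH:
        raise ValueError("'message' is too long")
    return cleaned
```

For the CORS item, Flask-CORS accepts an `origins` argument, so a call such as `CORS(app, origins=["http://localhost:5173"])` would limit cross-origin access to the Vite dev server rather than allowing every origin.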

Browser Compatibility

Recommended browsers:

Google Chrome (preferred)
Microsoft Edge
Firefox (speech synthesis only; speech recognition is not enabled by default)
Safari (limited support)

Contributing

Fork the repository
Create a feature branch
Commit your changes
Push to the branch
Create a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Google AI for Gemini Pro API
Web Speech API contributors
Flask and its community

Troubleshooting

Common Issues

Speech Recognition Not Working
- Ensure you're using a supported browser
- Check microphone permissions
- Verify stable internet connection

API Key Issues
- Verify .env file configuration
- Check API key validity
- Ensure proper environment variable loading

Backend Connection Failed
- Confirm backend server is running
- Check CORS configuration
- Verify correct port settings
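
To confirm that the servers are actually listening, a quick TCP check can help. The sketch below is a hypothetical diagnostic helper, not part of the project. It assumes the default ports mentioned in this README (5000 for the backend, 5173 for the frontend).

```python
import socket


def is_port_open(host: str = "localhost", port: int = 5000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `is_port_open(port=5000)` checks the backend and `is_port_open(port=5173)` checks the frontend dev server. If the backend port is open but the browser still cannot reach it, the CORS configuration is the next thing to check.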

Support

For issues and feature requests, please create an issue in the repository.
