Echoes is a real-time emotion-aware mental health companion that analyzes both what you say and how you say it to provide empathetic, context-aware support. The system combines speech-to-text transcription, emotion detection from voice patterns, AI-powered conversational therapy, and natural voice responses to create a compassionate digital therapist that truly listens.
Unlike text-only chatbots, Echoes captures the emotional nuance in your voice, detecting stress, fear, or sadness even when your words seem neutral, and responds with understanding tailored to both your words and your feelings.
- 🎤 Real-time Voice Analysis - GPU-accelerated emotion detection from voice tone and patterns
- 🧠 7 Emotion Recognition - Detects happy, sad, angry, fearful, disgusted, surprised, and neutral states
- 💬 Context-Aware Responses - Prioritizes what you say with emotion as supplementary context
- 🎯 Three Response Modes:
  - Empathetic Listening - Validates feelings and explores concerns
  - Actionable Advice - Provides 2-4 specific steps tailored to your situation
  - Conversational Memory - Recalls past discussions and context
- 🔄 Mismatch Detection - Identifies when emotions and words don't align
- 🗣️ Text-to-Speech Responses - Natural voice replies via ElevenLabs
- 💾 Persistent History - Remembers conversations across sessions
- ⚡ Low Latency - CUDA-optimized processing for near-instant responses
- 1 in 5 adults experience mental illness annually
- Average wait time for therapy: 2-3 months
- Cost barrier: $100-250 per session without insurance
- Stigma prevents many from seeking help
- ✅ Immediate 24/7 support - No waiting lists or appointments
- ✅ Emotion-aware responses - Understands stress/fear in your voice even when you say "I'm fine"
- ✅ Judgment-free space - Complete privacy and confidentiality
- ✅ Actionable guidance - Practical advice tailored to your unique situation
- ✅ Crisis detection - Recognizes urgent situations and recommends professional help
Note: Echoes is a supportive companion, not a replacement for professional therapy. For serious mental health concerns, please consult a licensed professional.
- Windows with NVIDIA RTX GPU and drivers installed
- Python 3.10+ (3.11/3.12 recommended)
- Node.js v18+ and npm
- Git
- PowerShell (default Windows shell)
```powershell
# Shows GPU and driver
nvidia-smi
```
If `nvidia-smi` is not found, install NVIDIA drivers from the NVIDIA website. You do not need the full CUDA toolkit installed; only a matching driver is required for PyTorch.
```powershell
git clone https://github.com/Waamer/HackWestern12_Proj.git
cd HackWestern12_Proj
```
Create and activate a virtual environment (PowerShell):
```powershell
python -m venv .venv
Set-ExecutionPolicy -Scope Process -ExecutionPolicy RemoteSigned -Force
& ".\.venv\Scripts\Activate.ps1"
```
Upgrade pip:
```powershell
python -m pip install --upgrade pip
```
If you have an NVIDIA RTX GPU and want GPU acceleration, install the matching wheels. Example for CUDA 12.1 (the environment used when developing this repo):
```powershell
python -m pip install torch==2.5.1+cu121 torchaudio==2.5.1+cu121 --index-url https://download.pytorch.org/whl/cu121
```
If you do NOT have a GPU or prefer CPU-only, use the CPU wheels instead:
```powershell
python -m pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu
```
A pinned `requirements.txt` at the repo root contains the packages from the environment used to develop and test this project. To reproduce that environment exactly, run:
```powershell
pip install -r requirements.txt
```
If you prefer a smaller, flexible set for the backend only, install the backend minimal list (less reproducible):
```powershell
pip install -r backend/requirements.txt
```
Note: `requirements.txt` at the repo root pins GPU-specific torch and torchaudio wheels. If you switch CUDA versions, edit it or reinstall the appropriate torch wheel.
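To confirm that PyTorch can actually see your GPU after installing, run this quick check (standard PyTorch calls, nothing repo-specific):
```python
import torch

# True only if the installed wheel matches a working NVIDIA driver
print(torch.cuda.is_available())

# Name the detected GPU when one is available
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```
If this prints `False` on a machine with an RTX GPU, you most likely installed the CPU wheel or the driver is missing.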
Install Node dependencies from the frontend folder:
```powershell
cd frontend
npm install
```
Start the frontend dev server:
```powershell
npm run dev
# Open the Local URL shown by Vite (usually http://localhost:5173)
```
Create `backend/.env` (copy from `.env.example`) and add keys if you plan to enable the Gemini or ElevenLabs features. The project will run without these keys, but the LM/TTS features will be disabled.
Variables used (example):
```
GEMINI_API_KEY=
ELEVENLABS_API_KEY=
ELEVENLABS_VOICE_ID=
DEVICE=  # optional: "cuda" or "cpu"
```
Never commit real API keys to git.
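As a minimal sketch of how backend code can read these variables (assuming `python-dotenv` is installed; see `backend/app.py` for the actual loading logic):
```python
import os

from dotenv import load_dotenv

# Reads backend/.env into the process environment (no-op if the file is absent)
load_dotenv()

GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")          # None disables LM features
ELEVENLABS_API_KEY = os.getenv("ELEVENLABS_API_KEY")  # None disables TTS
DEVICE = os.getenv("DEVICE")  # "cuda" or "cpu"; auto-detection if unset is assumed
```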
From the repo root (venv must be activated):
& ".\.venv\Scripts\Activate.ps1"
cd "C:\Users\<you>\Documents\Western\Year 4\Hackathon\TEST\backend"
python app.pyThe backend will run on http://127.0.0.1:5000 (Flask dev server). Keep this running while you use the frontend.
- Open the Vite URL (http://localhost:5173) in your browser.
- Click `Start`, speak, then click `Stop`. The frontend encodes WAV client-side and sends it to `POST /api/analyze`.
- The backend returns emotion probabilities, which the UI displays (see the sketch below for calling the endpoint directly).
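You can also exercise the endpoint without the UI. A hedged sketch using Python's `requests` library; the multipart field name and the exact response shape are assumptions, so check `backend/app.py` for the real contract:
```python
import requests

# Post a WAV file to the analyze endpoint. The form field name "audio"
# is an assumption; backend/app.py defines the actual expected field.
with open("sample.wav", "rb") as f:
    resp = requests.post(
        "http://127.0.0.1:5000/api/analyze",
        files={"audio": ("sample.wav", f, "audio/wav")},
    )

resp.raise_for_status()
print(resp.json())  # assumed shape: per-emotion probabilities, e.g. {"sad": 0.71, ...}
```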
This repo also includes a console mic loop (real_time_emotion_server.py) that captures short chunks with sounddevice and runs local emotion detection. To run it (venv active):
```powershell
python real_time_emotion_server.py
```
This will print emotion probabilities to the console for each chunk.
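For reference, a minimal sketch of this style of chunked capture loop (illustrative only; the chunk length, sample rate, and `predict_emotions` helper are assumptions; see `real_time_emotion_server.py` for the real implementation):
```python
import numpy as np
import sounddevice as sd

SAMPLE_RATE = 16000  # assumed model sampling rate
CHUNK_SECONDS = 3    # assumed chunk length

def predict_emotions(audio: np.ndarray, sr: int) -> dict:
    """Placeholder for the repo's local model inference (hypothetical)."""
    raise NotImplementedError("see real_time_emotion_server.py")

while True:
    # Record one mono chunk; sd.wait() blocks until recording finishes,
    # which is why stopping with Ctrl-C raises KeyboardInterrupt mid-wait.
    chunk = sd.rec(int(CHUNK_SECONDS * SAMPLE_RATE),
                   samplerate=SAMPLE_RATE, channels=1, dtype="float32")
    sd.wait()
    probs = predict_emotions(np.squeeze(chunk), SAMPLE_RATE)
    print(probs)
```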
Notes:
- The script tries to use CUDA if available.
- If you see `KeyboardInterrupt` traces when stopping, that is because the script blocks on `sounddevice.wait()`; stop it with Ctrl-C, or add graceful shutdown handling.
- The emotion model (`firdhokk/speech-emotion-recognition-with-openai-whisper-large-v3`) is loaded from Hugging Face at runtime. The first load downloads the weights and may take time and disk space.
- For consistent accuracy:
  - Ensure the audio sample rate and channel count are correct (the backend resamples to the model's required sampling rate).
  - Use the pinned `torch`/`torchaudio` wheels that match your GPU's CUDA version.
  - For noisy or short audio, consider sliding-window averaging or test-time augmentation (see the sketch after this list); the repo's `emotion_model.py` resamples audio correctly, and you can enable TTA or windowing for improved stability.
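A minimal sketch of sliding-window averaging over per-chunk probabilities (illustrative; the window size and the probability-vector format are assumptions):
```python
from collections import deque

import numpy as np

WINDOW = 5  # number of recent chunks to average (assumed)
history: deque = deque(maxlen=WINDOW)

def smoothed_probs(chunk_probs: np.ndarray) -> np.ndarray:
    """Return the running mean of per-emotion probabilities.

    chunk_probs is a 1-D array of probabilities for one chunk; averaging
    over up to WINDOW recent chunks damps spurious single-chunk spikes
    on noisy or short audio.
    """
    history.append(chunk_probs)
    return np.mean(np.stack(history), axis=0)
```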
- `ModuleNotFoundError: No module named 'emotion_model'` when running `backend/app.py`:
  - Ensure you run `python backend/app.py` from the repository root and that `.venv` is activated. `backend/app.py` inserts the project root into `sys.path` automatically.
- `Import 'torchaudio' could not be resolved` or runtime import errors:
  - Make sure you installed the CUDA-matching `torchaudio` wheel shown in the PyTorch commands above.
- `sounddevice` errors for the microphone:
  - Install `sounddevice` (already in the pinned `requirements.txt`) and verify the microphone is available. Use Windows privacy settings to allow microphone access.
- Frontend Start/Stop does nothing in the browser:
  - Ensure you opened the Vite URL and that the backend is running at `http://localhost:5000`.
  - The browser may block `getUserMedia`; allow microphone permission when the browser asks.
If you plan to commit model artifacts (not recommended), enable Git LFS and track large files:
```powershell
git lfs install
git lfs track "*.safetensors"
git add .gitattributes
```
- Use `requirements.txt` at the repo root for exact reproducibility.
- Use `backend/requirements.txt` only for a small, flexible install (not pinned).
- For production deployment, replace the Flask dev server with a production WSGI server (Gunicorn/uvicorn) and consider a reverse proxy; a hedged example follows below.
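As one hedged sketch (not part of this repo): Gunicorn does not run natively on Windows, so on a Windows host a pure-Python WSGI server such as waitress is a common substitute. This assumes `backend/app.py` exposes a Flask instance named `app`:
```python
# serve_prod.py: hypothetical production entry point using waitress
# (waitress is a substitute here; the repo suggests Gunicorn/uvicorn on Linux).
from waitress import serve

from app import app  # assumes backend/app.py defines `app = Flask(__name__)`

if __name__ == "__main__":
    # Bind on all interfaces; put a reverse proxy (e.g. nginx) in front.
    serve(app, host="0.0.0.0", port=5000)
```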

