# Nocturne AI

A real-time audio chat interface powered by OpenAI's Realtime API and ElevenLabs text-to-speech. Nocturne AI provides low-latency conversational AI with audio processing, mixing, visual effects, and MIDI control capabilities.

## Features
- Real-time Voice Chat: Low-latency conversational AI using OpenAI's GPT-4o Realtime model
- Professional Audio Processing: Audio effects (reverb, delay, compression, EQ, distortion) powered by Tuna.js
- Audio Mixing: Multi-channel audio mixer with voice control and panoramic effects
- Text-to-Speech: High-quality synthesis via ElevenLabs with multiple voice options
- Speech-to-Text: Real-time transcription using ElevenLabs STT
- 3D Visualization: Interactive particle system and audio visualizers built with Three.js and React Three Fiber
- MIDI Support: Full MIDI controller integration for parameter control
- Transcript Display: Real-time conversation transcript with visual effects
- Export/Import: Save and load audio configurations and MIDI parameters
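The MIDI parameter control described above boils down to mapping 7-bit controller values onto effect parameter ranges. Here is a minimal sketch of that mapping; the names (`ccToParam`, `ParamRange`) are illustrative, not from the codebase:

```typescript
// Hypothetical sketch of MIDI CC -> parameter mapping; names are
// illustrative and not taken from the Nocturne AI source.

interface ParamRange {
  min: number;
  max: number;
}

// Map a 7-bit MIDI CC value (0-127) onto an arbitrary parameter range,
// clamping out-of-range input first.
function ccToParam(cc: number, range: ParamRange): number {
  const clamped = Math.max(0, Math.min(127, cc));
  return range.min + (clamped / 127) * (range.max - range.min);
}

// Example: CC value 64 mapped onto a reverb wet/dry range of 0..1.
const wet = ccToParam(64, { min: 0, max: 1 });
```

In the browser, a function like this would be called from a Web MIDI `midimessage` handler before forwarding the value to the effects chain.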
## Tech Stack

**Frontend:**
- Next.js 16 - React framework
- React 19 - UI library
- Three.js & React Three Fiber - 3D graphics
- TailwindCSS - Styling
- TypeScript - Type safety
- Tuna.js - Web Audio API effects
**Backend/APIs:**
- OpenAI Agents SDK - Realtime API
- ElevenLabs - TTS & STT
**Python Backend (Optional):**
- OpenAI Whisper - Speech recognition
- PyTorch - ML framework
- PySimpleGUI - Desktop UI
## Prerequisites

- Node.js 20+
- npm or yarn package manager
- OpenAI API key (for Realtime API access)
- ElevenLabs API key (for TTS/STT services)
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/NocturneAI.git
   cd NocturneAI
   ```

2. Install dependencies:

   ```bash
   npm install
   ```

3. Set up environment variables:

   ```bash
   cp env.example .env.local
   ```

4. Add your API keys to `.env.local`:

   ```
   OPENAI_API_KEY=your_openai_api_key
   ELEVENLABS_API_KEY=your_elevenlabs_api_key
   ```

## Running the App

Start the development server:

```bash
npm run dev
```

Open http://localhost:3000 in your browser to access the application.
The app will hot-reload as you make changes to the code.
To build and serve a production bundle:

```bash
npm run build
npm start
```

## Configuration

- Voices: Edit `src/app/voices.json` to customize available voice options
- Effects: Configure audio effects in the UI or via MIDI controller
- Visual Settings: Adjust particle brightness and text display speed in the transcript panel
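For the voice customization mentioned above, a plausible shape for entries in `src/app/voices.json` looks like the sketch below. The field names (`name`, `voiceId`) are assumptions; check the actual file for the real schema:

```typescript
// Hypothetical schema for voices.json entries; field names are assumed,
// not taken from the repository.

interface VoiceConfig {
  name: string;    // label shown in the UI
  voiceId: string; // ElevenLabs voice identifier (placeholder below)
}

const voices: VoiceConfig[] = [
  { name: "Narrator", voiceId: "your_elevenlabs_voice_id" },
];
```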
## API Endpoints

- `POST /api/tts` - Text-to-speech synthesis
- `POST /api/stt/elevenlabs-token` - Get an ElevenLabs STT token
- `POST /api/ephemeral` - Get an OpenAI Realtime ephemeral token
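As an example of calling the TTS endpoint from the client, the sketch below builds a `POST /api/tts` request. The request body shape (`{ text, voiceId }`) is an assumption; check the route handler under `src/app/api` for the actual contract:

```typescript
// Sketch of a client-side call to the TTS route. The payload shape is
// assumed, not confirmed against the actual route handler.

interface TTSRequest {
  text: string;
  voiceId: string;
}

// Build fetch options separately so the payload logic is easy to test.
function buildTTSOptions(req: TTSRequest) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  };
}

// Usage (inside a component or event handler):
// const res = await fetch("/api/tts", buildTTSOptions({ text: "Hello", voiceId: "..." }));
// const audio = await res.arrayBuffer(); // assuming the route returns audio bytes
```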
## Project Structure

```
src/
├── app/
│   ├── AudioChatClean.tsx   # Main chat component
│   ├── audiofx.ts           # Audio effects chain
│   ├── audioMixer.ts        # Multi-channel mixer
│   ├── midi.tsx             # MIDI controller support
│   ├── voices.json          # Voice configurations
│   ├── api/                 # Backend API routes
│   └── scribe/              # Real-time transcription
├── docs/                    # Documentation
└── python/                  # Optional Python backend
```
## Development

Lint the codebase:

```bash
npm run lint
```

Type-check without emitting files:

```bash
npx tsc --noEmit
```

## License

See the LICENSE file for details.

## Support
For issues and feature requests, please open an issue on the GitHub repository.