This project implements a real-time, multi-user conversational AI voice agent. The system is split into a web frontend and a Python-based AI backend that communicate over WebRTC.
- Tech Stack (frontend): Next.js, React, TypeScript.
- RTC Interface: Uses `@livekit/components-react` to build a responsive, real-time meeting room UI.
- Authentication: A Next.js API route uses the `livekit-server-sdk` to securely generate dynamic room access tokens based on user inputs.
- Real-time UI: Listens on WebRTC data channels to render AI-generated transcripts as they arrive.
- Notes: Summaries are rendered with `react-markdown` as a mind map.
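
To make the authentication step more concrete, here is a minimal sketch of what a room access token carries. In this project the token is produced by the `livekit-server-sdk` inside a Next.js API route; the Python below is not that route. It is a standard-library illustration that assumes the token is an HS256-signed JWT with the API key as issuer, the participant identity as subject, and a `video` grant naming the room. The function names `make_room_token` and `decode_claims`, and all sample values, are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_room_token(api_key: str, api_secret: str, room: str, identity: str,
                    ttl_seconds: int = 3600, now: Optional[int] = None) -> str:
    """Build a signed JWT granting `identity` permission to join `room`.

    Assumed claim layout: iss = API key, sub = participant identity,
    video = room grant. `now` is injectable so the output is reproducible.
    """
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,
        "sub": identity,
        "nbf": issued,
        "exp": issued + ttl_seconds,
        "video": {"room": room, "roomJoin": True},
    }
    # Header and payload are JSON-encoded, base64url-encoded, and dot-joined.
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    # The signature is HMAC-SHA256 over the signing input, keyed by the secret.
    signature = hmac.new(api_secret.encode("utf-8"),
                         signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def decode_claims(token: str) -> dict:
    """Read the payload of a token WITHOUT verifying it (inspection only)."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))
```

Because the signature depends on the API secret, the token has to be minted on the server (here, the API route) so the secret never reaches the browser; the client only receives the finished token.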

- Tech Stack (backend): Python, Pipecat AI Framework.
- AI Pipeline: Orchestrates Voice Activity Detection (VAD), Speech-to-Text (STT), LLM processing, and Text-to-Speech (TTS) into a continuous, low-latency stream.
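
The VAD, STT, LLM, and TTS chain can be pictured as a series of streaming stages, each consuming the previous stage's output as soon as it is available. The sketch below is a minimal illustration of that idea using plain `asyncio` generators. It does not use Pipecat's API, and every stage is a hypothetical stub: "audio" is just UTF-8 bytes, silence is an empty chunk, and the "LLM" echoes its input.

```python
import asyncio
from typing import AsyncIterator, Iterable, List


async def microphone(chunks: Iterable[bytes]) -> AsyncIterator[bytes]:
    """Stub audio source: yields pre-recorded chunks one at a time."""
    for chunk in chunks:
        yield chunk
        await asyncio.sleep(0)  # hand control back to the event loop


async def vad(frames: AsyncIterator[bytes]) -> AsyncIterator[bytes]:
    """Stub VAD: treat empty chunks as silence and drop them."""
    async for frame in frames:
        if frame:
            yield frame


async def stt(frames: AsyncIterator[bytes]) -> AsyncIterator[str]:
    """Stub STT: pretend the audio bytes are already UTF-8 text."""
    async for frame in frames:
        yield frame.decode("utf-8")


async def llm(texts: AsyncIterator[str]) -> AsyncIterator[str]:
    """Stub LLM: produce a canned reply for each transcript."""
    async for text in texts:
        yield f"You said: {text}"


async def tts(replies: AsyncIterator[str]) -> AsyncIterator[bytes]:
    """Stub TTS: encode the reply text back into 'audio' bytes."""
    async for reply in replies:
        yield reply.encode("utf-8")


async def run_pipeline(chunks: Iterable[bytes]) -> List[bytes]:
    """Chain the stages; each item flows through as soon as it is produced."""
    stream = tts(llm(stt(vad(microphone(chunks)))))
    return [audio async for audio in stream]


if __name__ == "__main__":
    print(asyncio.run(run_pipeline([b"hello", b"", b"bye"])))
```

Each stage pulls from the one before it, so a reply for the first utterance can be synthesized before later audio has been captured. This is the property that keeps latency low in a streaming design, as opposed to running each step over the whole recording in turn.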
- LiveKit
- Pipecat
