AI Voice Assistant & Automation System
Talk, command, and automate your system using natural voice interaction powered by AI.
Jarvis is an AI-powered voice assistant that allows users to interact with their system using natural language and voice commands.
It acts like a personal AI companion, similar to Siri or Alexa, but enhanced with modern LLM intelligence and automation capabilities.
Jarvis can:
- Answer questions using AI
- Understand voice commands
- Respond using natural AI voice
- Automate system tasks
- Open apps and websites
- Provide conversational assistance in real-time
- Real-time speech-to-text (Speech Recognition)
- Natural voice conversation
- Hands-free AI interaction
- Powered by OpenAI API
- Context-aware conversations
- Smart reasoning and explanations
- Text-to-speech using ElevenLabs
- Natural human-like voice output
- Personalized assistant voice
- Open applications (Chrome, apps, tools)
- Control system commands
- Volume up / down control
- Execute predefined system tasks
- Switch between typing and speaking
- Seamless interaction modes
- Persistent conversation flow
- User signup & login
- Secure session management
- Personalized assistant per user
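To illustrate the session management item above, here is a minimal sketch of an HMAC-signed session token built with the Python standard library only. It assumes sessions are carried as HS256-style tokens signed with `JWT_SECRET`; the README does not confirm this, and the function names `create_token` and `verify_token` are hypothetical. A production backend would more likely rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in compact tokens."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding and decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    """HMAC-SHA256 over the header and payload segments."""
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"),
                    hashlib.sha256).digest()


def create_token(user_id: str, secret: str, ttl_seconds: int = 3600,
                 now: Optional[float] = None) -> str:
    """Hypothetical helper: issue a signed token for a logged-in user."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "iat": issued, "exp": issued + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: str,
                 now: Optional[float] = None) -> Optional[str]:
    """Return the user id if the signature is valid and the token is unexpired."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = _sign(f"{header_b64}.{payload_b64}", secret)
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        # Malformed structure, bad base64, or bad JSON all count as invalid.
        return None
    current = time.time() if now is None else now
    if not isinstance(payload, dict) or payload.get("exp", 0) <= current:
        return None
    return payload.get("sub")
```

The `now` parameter exists only to make expiry easy to test; callers would normally omit it.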
graph TD
A[React Frontend] --> B[FastAPI Backend]
B --> C[OpenAI API]
B --> D[MySQL Database]
B --> E[Speech Recognition Engine]
B --> F[ElevenLabs TTS]
B --> G[System Automation Layer]
| Layer | Technologies |
|---|---|
| Backend | Python, FastAPI, MySQL |
| Frontend | React.js |
| AI Engine | OpenAI API |
| Voice Output | ElevenLabs API (TTS) |
| Speech Input | SpeechRecognition API |
| Automation | OS-level command execution |
| DevOps | Docker (optional) |
git clone https://github.com/your-username/jarvis.git
cd jarvis

Create a .env file in the project root:
DATABASE_URL=
OPENAI_API_KEY=
ELEVENLABS_API_KEY=
JWT_SECRET=

Then build and start the containers:

docker compose up --build

The app should now be running locally via Docker.
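The four variables listed for the `.env` file could be validated once at startup so that a missing key fails fast rather than surfacing later as an API error. The sketch below is one possible approach using only the standard library; the `Settings` class and the `parse_env_text` and `load_settings` helpers are hypothetical names, not part of the Jarvis codebase, and the parser handles only simple `KEY=value` lines.

```python
from dataclasses import dataclass
from typing import Dict, Mapping

# The variable names come from the .env template in this README.
REQUIRED_KEYS = ("DATABASE_URL", "OPENAI_API_KEY", "ELEVENLABS_API_KEY", "JWT_SECRET")


@dataclass(frozen=True)
class Settings:
    database_url: str
    openai_api_key: str
    elevenlabs_api_key: str
    jwt_secret: str


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values such as URLs may contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(env: Mapping[str, str]) -> Settings:
    """Build Settings, raising if any required value is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        openai_api_key=env["OPENAI_API_KEY"],
        elevenlabs_api_key=env["ELEVENLABS_API_KEY"],
        jwt_secret=env["JWT_SECRET"],
    )
```

When running under Docker Compose, the values would typically already be in the process environment, so `load_settings(os.environ)` would be enough and the parser would be needed only for local runs.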
- User logs in to the Jarvis system
- User speaks or types a command
- Speech is converted into text
- Backend sends the request to OpenAI
- AI generates an intelligent response
- Response is converted to speech using ElevenLabs
- Optional system command is executed
- Response is shown and spoken back to the user
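The steps above can be sketched as a single function that handles one conversational turn. This is an illustrative outline, not the project's actual backend code: `handle_turn` and `TurnResult` are hypothetical names, and the speech-to-text, OpenAI, ElevenLabs, and automation calls are passed in as plain callables so the flow can be read (and tested) without any network access.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass
class TurnResult:
    transcript: str                  # what the user said or typed
    reply_text: str                  # text shown in the UI
    reply_audio: bytes               # audio played back to the user
    executed_command: Optional[str]  # label of the system action, if any


def handle_turn(
    user_input: Union[str, bytes],
    transcribe: Callable[[bytes], str],
    generate_reply: Callable[[str], str],
    synthesize: Callable[[str], bytes],
    run_command: Callable[[str], Optional[str]],
) -> TurnResult:
    """Process one turn: speech-to-text, AI reply, text-to-speech, optional command."""
    # Typed input skips speech recognition; audio bytes are transcribed first.
    if isinstance(user_input, bytes):
        transcript = transcribe(user_input)
    else:
        transcript = user_input
    transcript = transcript.strip()

    reply_text = generate_reply(transcript)   # stands in for the OpenAI request
    reply_audio = synthesize(reply_text)      # stands in for ElevenLabs TTS
    executed = run_command(transcript)        # returns None when nothing matched

    return TurnResult(transcript, reply_text, reply_audio, executed)
```

Injecting the four services keeps the orchestration independent of any particular SDK; in the real backend each callable would wrap the corresponding client.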
AI Chat
- "What is artificial intelligence?"
- "Explain quantum computing simply"
- "Write a Python function for sorting"
System Automation
- "Open Chrome"
- "Increase volume"
- "Open YouTube"
- "Minimize window"
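One way to turn phrases like these into system actions is an allowlist: the backend recognizes a fixed set of commands and ignores everything else, so arbitrary transcribed text never reaches a shell. The sketch below only parses an utterance into an action tuple and does not execute anything. The `parse_command` function, the action names, and the `APPS` and `SITES` tables are hypothetical, chosen to match the four examples above.

```python
import re
from typing import Optional, Tuple

Action = Tuple[str, str]

# Hypothetical allowlists; a real deployment would define its own.
APPS = {"chrome"}
SITES = {"youtube": "https://www.youtube.com"}


def parse_command(utterance: str) -> Optional[Action]:
    """Map a spoken or typed phrase to an (action, argument) pair, or None."""
    # Normalize: lowercase, drop punctuation, collapse whitespace.
    text = re.sub(r"[^a-z0-9 ]+", " ", utterance.lower())
    text = " ".join(text.split())

    match = re.fullmatch(r"open (.+)", text)
    if match:
        target = match.group(1)
        if target in SITES:
            return ("open_url", SITES[target])
        if target in APPS:
            return ("open_app", target)
        return None  # unknown targets are rejected, never passed to the OS

    if re.fullmatch(r"(increase|raise|turn up)( the)? volume|volume up", text):
        return ("volume", "up")
    if re.fullmatch(r"(decrease|lower|turn down)( the)? volume|volume down", text):
        return ("volume", "down")
    if text == "minimize window":
        return ("window", "minimize")
    return None
```

A `None` result would mean the input is treated as an ordinary chat message and sent to the AI instead.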
- Voice-based AI assistant
- Speech-to-text integration
- AI chat system
- Text-to-speech responses
- Wake-word detection ("Hey Jarvis")
- Desktop app version
- Mobile integration
- Smart home integration
- Memory-based conversations
jarvis/
├── backend/      # FastAPI backend (AI + logic + automation)
├── frontend/     # React UI
├── services/     # Speech, AI, and voice services
├── automation/   # System command execution layer
├── docker/       # Docker setup
└── README.md
Jarvis helps you:
- Interact with AI using natural voice
- Automate repetitive system tasks
- Control your system hands-free
- Get human-like AI responses
- Improve productivity through voice-first interaction
It's a next-generation AI voice assistant system designed for productivity and automation.
This project is licensed under the MIT License.
- LinkedIn: Ibad Ur Rehman Rajput
- Email: ahmedibad0012@gmail.com
- Portfolio: ibadrajputportfolio.netlify.app
If you like Jarvis, consider giving this repository a star!