Dhruv Thakur dhruvthakur2000

I build real-world AI systems from scratch and in public, documenting decisions, failures, tradeoffs, and mental models along the way.

SaleTech: real-time AI voice sales agent. 40+ concurrent calls. Sub-600ms end-to-end latency. Zero vendor lock-in.

Component	Technology	Latency
Voice Activity Detection	Silero VAD v5 + WebRTC (dual-fusion)	6ms
Speech Recognition	Faster-Whisper Large-v3 (streaming)	150ms
Language Model	Qwen-2.5-7B-Instruct + KV-cache reuse	320ms
Text-to-Speech	Piper TTS (streaming, sentence-by-sentence)	100ms
End-to-End P95		~580ms
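
As a rough sanity check on the table above, the per-stage figures can be added up and compared with the sub-600ms target. This is only a sketch under a simplifying assumption: the stages run strictly one after another and their latencies are additive. Streaming overlap, network hops, and the fact that a P95 of a sum is not generally the sum of per-stage values are all ignored. The dictionary keys and the `BUDGET_MS` name are illustrative.

```python
# Per-stage latencies in milliseconds, copied from the table above.
STAGE_LATENCY_MS = {
    "vad": 6,    # Silero VAD v5 + WebRTC
    "asr": 150,  # Faster-Whisper Large-v3
    "llm": 320,  # Qwen-2.5-7B-Instruct
    "tts": 100,  # Piper TTS
}

BUDGET_MS = 600  # the "sub-600ms" end-to-end target

# Assumption: stages are sequential, so their latencies simply add.
total_ms = sum(STAGE_LATENCY_MS.values())
headroom_ms = BUDGET_MS - total_ms

print(f"total: {total_ms} ms, headroom: {headroom_ms} ms")
# prints "total: 576 ms, headroom: 24 ms"
```

The 576 ms sum is consistent with the ~580ms end-to-end figure, and it shows the language model taking more than half of the budget.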

What makes it different:

🔁 KV-cache reuse across turns — 40% LLM latency reduction by serializing attention tensors to Redis
🎤 Adaptive end-of-turn detection — learns speaking pace per session; 420ms–720ms dynamic silence threshold
⚡ Barge-in handling — multi-signal fusion detects interruptions in <200ms, stops TTS mid-sentence
🧱 Redis-backed session persistence — full state, KV-cache, metrics per session; horizontal scaling ready
☸️ Kubernetes-ready — StatefulSet, HPA auto-scaling, PSTN via Asterisk/FreeSWITCH
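
The adaptive end-of-turn idea from the list above can be sketched in a few lines. This is a minimal illustration, not SaleTech's actual code: it assumes the speaking pace is tracked as an exponential moving average of mid-utterance pauses, and that the silence threshold is a multiple of that average clamped to the 420 to 720 ms range. The class name, `alpha`, `multiplier`, and the starting estimate are all hypothetical.

```python
from dataclasses import dataclass

MIN_SILENCE_MS = 420.0  # lower bound quoted in the list above
MAX_SILENCE_MS = 720.0  # upper bound quoted in the list above


@dataclass
class AdaptiveEndOfTurn:
    """Per-session silence threshold that follows the caller's pause length.

    alpha, multiplier, and the starting estimate are illustrative
    assumptions, not values taken from SaleTech.
    """

    alpha: float = 0.2            # EMA smoothing factor (assumed)
    multiplier: float = 2.0       # threshold = multiplier * typical pause (assumed)
    avg_pause_ms: float = 285.0   # starting estimate before any data (assumed)

    def observe_pause(self, pause_ms: float) -> None:
        """Fold one mid-utterance pause into the running average."""
        self.avg_pause_ms = (1 - self.alpha) * self.avg_pause_ms + self.alpha * pause_ms

    @property
    def threshold_ms(self) -> float:
        """Silence required to declare end of turn, clamped to 420-720 ms."""
        raw = self.multiplier * self.avg_pause_ms
        return max(MIN_SILENCE_MS, min(MAX_SILENCE_MS, raw))

    def is_end_of_turn(self, silence_ms: float) -> bool:
        """True once the current silence has reached the session threshold."""
        return silence_ms >= self.threshold_ms


# Usage: a fast talker pulls the threshold down to the floor,
# a slow talker pushes it up to the ceiling.
fast, slow = AdaptiveEndOfTurn(), AdaptiveEndOfTurn()
for _ in range(50):
    fast.observe_pause(100.0)
    slow.observe_pause(600.0)
print(fast.threshold_ms, slow.threshold_ms)
```

With this shape, a 500 ms silence ends the turn for the fast talker but not for the slow one, which is the behaviour the per-session threshold is meant to give.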

Stack: Python · FastAPI · AsyncIO · WebSockets · Silero · Faster-Whisper · PyTorch · Redis · Docker · Kubernetes

Project	What it does	Stack
SaleTech	Production real-time AI voice sales agent — 40+ concurrent calls, <600ms latency, full open-source pipeline	Python · FastAPI · Silero · Faster-Whisper · Qwen · Piper · Redis · K8s
linux_driver_eval	CLI framework to benchmark how well LLMs write Linux kernel device driver code. Two pipelines: generation + evaluation. Weighted scoring across correctness, security, quality, performance	Python · GCC · Together API · Static Analysis
virtual-voicebot	Streamlit voice assistant with persona-aware responses — the project that started my obsession with real-time audio pipelines	Python · Streamlit · Groq · Whisper · LLaMA · TTS
🔜 HomeAssist (planned)	Smarter Alexa — always-on edge voice assistant using SaleTech's VAD + ASR + buffer layers. Wake-word detection, local LLM, zero cloud dependency	SaleTech core · Edge inference
🔜 SaleTech Analytics (planned)	Call intelligence layer — real-time sentiment, objection detection, sales stage classification per turn	SaleTech core · NLP · Classification
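
The weighted scoring mentioned for linux_driver_eval in the table above can be illustrated with a small sketch. The four category names come from the table; the weights, the 0 to 1 score scale, and the `weighted_score` function are hypothetical, since the real weighting is not stated here.

```python
# Hypothetical weights; the actual values used by linux_driver_eval are not given.
WEIGHTS = {
    "correctness": 0.40,
    "security": 0.25,
    "quality": 0.20,
    "performance": 0.15,
}


def weighted_score(scores: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Combine per-category scores (assumed to be in [0, 1]) into one number."""
    missing = weights.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    # Normalise by the weight total so the result stays in [0, 1]
    # even if the weights do not sum exactly to 1.
    total_weight = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total_weight


# Usage: a driver that compiles and works but has security findings.
example = {"correctness": 0.9, "security": 0.4, "quality": 0.7, "performance": 0.8}
print(round(weighted_score(example), 3))
```

Raising on a missing category, rather than treating it as zero, keeps an incomplete evaluation run from silently producing a low score.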

I write about real engineering decisions — not tutorials copied from docs.

	Post	Platform
📡	VAD: Voice Activity Detection — how it actually works	Hashnode
🔁	Understanding the Attention Mechanism: The Heart of the Transformer Revolution	Medium
🪵	Structured Logging in Python: A Practical Guide for Production Systems	Medium

Open to ML Engineer / AI Engineer / Backend Python roles — remote or on-site.
