GautamVhavle/BrowserLLM

BrowserLLM

Run 100+ AI models entirely in your browser: no servers, no API keys, 100% private.


Think ChatGPT, but running on your hardware with zero cloud dependency.
All inference runs locally via WebGPU. Your conversations never leave your device.



✨ Features

🧠 AI, Fully Local

  • 100+ models: Llama, Qwen, Phi, Gemma, Mistral, DeepSeek, SmolLM & more
  • WebGPU-accelerated: Near-native GPU inference right in the browser
  • Real-time streaming: Tokens stream as they're generated, rendered as Markdown

🔒 Privacy First

  • Zero servers: No backend, no API calls, no data uploaded anywhere
  • Works offline: PWA with service worker; fully functional without internet once a model is cached
  • Local storage only: Chats saved in your browser, nothing touches the cloud
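
As a rough illustration of browser-only chat persistence, here is a minimal sketch. The `Message` and `ChatSession` shapes, the storage key, and the function names are assumptions made for this example; the project's real types live in `src/types/index.ts` and its storage code in `src/lib/storage.ts`. The store is passed in as a parameter so the same code works with `window.localStorage` or an in-memory stand-in.

```typescript
// Hypothetical shapes for illustration only.
interface Message { role: "user" | "assistant"; content: string }
interface ChatSession { id: string; title: string; messages: Message[] }

// The subset of the Web Storage API this sketch needs.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STORAGE_KEY = "browserllm:chats"; // assumed key name

// Read all sessions; fall back to an empty list if nothing valid is stored.
function loadSessions(store: KeyValueStore): ChatSession[] {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as ChatSession[]) : [];
  } catch {
    return []; // corrupted JSON: start fresh rather than crash
  }
}

// Insert a session, or replace the stored one with the same id.
function saveSession(store: KeyValueStore, session: ChatSession): void {
  const others = loadSessions(store).filter((s) => s.id !== session.id);
  store.setItem(STORAGE_KEY, JSON.stringify([...others, session]));
}

// In-memory stand-in for localStorage, handy outside a browser.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => { data.set(key, value); },
  };
}
```

In a browser, `saveSession(window.localStorage, session)` would write to the same place the chats are restored from on reload.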

💬 Chat Experience

  • Multi-thread: Create, switch, and manage multiple conversations
  • Per-message stats: Tokens/sec, context usage, generation time for every response
  • Markdown rendering: Code blocks with syntax highlighting, tables, lists
  • Model name per message: See which model generated each response
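
The per-message stats can be derived from a few counters. The sketch below shows one plausible way to compute them; the field names and the idea of passing token counts and timestamps in explicitly are assumptions for illustration, not the project's actual code.

```typescript
// Hypothetical input: counters gathered while one response is generated.
interface GenerationSample {
  promptTokens: number;      // tokens already in the context before generation
  completionTokens: number;  // tokens produced for this response
  startMs: number;           // timestamp when generation started
  endMs: number;             // timestamp when the last token arrived
  contextWindow: number;     // model's maximum context length in tokens
}

interface GenerationStats {
  tokensPerSecond: number;
  generationSeconds: number;
  contextUsedPercent: number;
}

function computeStats(sample: GenerationSample): GenerationStats {
  const generationSeconds = Math.max(sample.endMs - sample.startMs, 0) / 1000;
  // Guard against division by zero for instantaneous or empty responses.
  const tokensPerSecond =
    generationSeconds > 0 ? sample.completionTokens / generationSeconds : 0;
  const used = sample.promptTokens + sample.completionTokens;
  const contextUsedPercent =
    sample.contextWindow > 0 ? Math.min((used / sample.contextWindow) * 100, 100) : 0;
  return { tokensPerSecond, generationSeconds, contextUsedPercent };
}
```

For example, 100 completion tokens generated in 2 seconds on top of a 412-token prompt, with a 4096-token window, gives 50 tokens/sec and 12.5% context usage.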

⚡ Smart Model Management

  • Background downloads: Download new models while chatting with the current one
  • Browser cache: Models cached after first download, load in seconds next time
  • Hardware detection: Auto-detects GPU, VRAM, WebGPU & shader-f16 support
  • Default model: Set your preferred model, auto-selected when you open chat
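
To make the hardware detection concrete, here is a minimal sketch of a WebGPU and shader-f16 check. It relies on the standard `navigator.gpu.requestAdapter()` call and the adapter's `features` set; the `detectHardware` name, the report shape, and the small structural interfaces are assumptions for this example, and it does not attempt VRAM detection.

```typescript
// Minimal structural types so the sketch compiles without WebGPU type packages.
interface AdapterLike {
  features: { has(name: string): boolean };
}
interface GpuLike {
  requestAdapter(): Promise<AdapterLike | null>;
}

// Hypothetical report shape.
interface HardwareReport {
  webgpu: boolean;
  shaderF16: boolean;
}

// Pass in `navigator` (or a stand-in) so the check is easy to exercise anywhere.
async function detectHardware(nav: { gpu?: GpuLike }): Promise<HardwareReport> {
  if (!nav.gpu) return { webgpu: false, shaderF16: false }; // API not exposed
  const adapter = await nav.gpu.requestAdapter();
  if (!adapter) return { webgpu: false, shaderF16: false }; // no usable GPU
  return { webgpu: true, shaderF16: adapter.features.has("shader-f16") };
}
```

In a browser this would be called with the global `navigator`, cast to the parameter type if the WebGPU typings are not installed.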

🔧 How It Works

┌──────────────────────────────────────────────────────────┐
│                        Browser                           │
│                                                          │
│   ┌──────────────┐   postMessage   ┌─────────────────┐   │
│   │   React UI   │ ◄─────────────► │   Web Worker    │   │
│   │  (main       │                 │  (MLC Engine)   │   │
│   │   thread)    │                 │                 │   │
│   └──────┬───────┘                 └────────┬────────┘   │
│          │                                  │            │
│          │ localStorage                     │ WebGPU     │
│          ▼                                  ▼            │
│   ┌──────────────┐                 ┌─────────────────┐   │
│   │  Chat        │                 │  Your GPU       │   │
│   │  History     │                 │  (VRAM)         │   │
│   └──────────────┘                 └─────────────────┘   │
│                                                          │
│   ┌──────────────────────────────────────────────────┐   │
│   │    Cache API: Model weights persisted locally    │   │
│   └──────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────┘
  1. Download once: Quantized weights are fetched from Hugging Face and stored in the browser Cache API
  2. Web Worker isolation: The MLC engine runs in a dedicated worker so the UI thread stays responsive
  3. GPU inference: All matrix ops run on your GPU via WebGPU at near-native speed
  4. Stream to UI: Tokens stream back in real time via postMessage and are rendered as rich Markdown
  5. Persist locally: Chats are saved to localStorage and restored on reload
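
The worker-to-UI flow in steps 2 and 4 can be pictured as a small message protocol. The sketch below is a hand-rolled illustration only: the message types, the state shape, and the `applyMessage` reducer are all hypothetical, and the actual project may rely on the worker helpers that `@mlc-ai/web-llm` provides instead.

```typescript
// Hypothetical messages a worker might post back to the main thread.
type WorkerMessage =
  | { type: "progress"; fraction: number } // model download/load progress, 0..1
  | { type: "token"; text: string }        // one streamed chunk of output
  | { type: "done" }                       // generation finished
  | { type: "error"; message: string };

interface UiState {
  loadProgress: number;
  reply: string;
  status: "loading" | "generating" | "idle" | "failed";
  error?: string;
}

const initialState: UiState = { loadProgress: 0, reply: "", status: "loading" };

// Pure reducer: each incoming message produces the next UI state.
function applyMessage(state: UiState, msg: WorkerMessage): UiState {
  switch (msg.type) {
    case "progress":
      return { ...state, loadProgress: msg.fraction, status: "loading" };
    case "token":
      return { ...state, reply: state.reply + msg.text, status: "generating" };
    case "done":
      return { ...state, status: "idle" };
    case "error":
      return { ...state, status: "failed", error: msg.message };
  }
}

// Simulate the sequence a worker could emit for one short reply.
const simulated: WorkerMessage[] = [
  { type: "progress", fraction: 0.5 },
  { type: "progress", fraction: 1 },
  { type: "token", text: "Hello" },
  { type: "token", text: ", world" },
  { type: "done" },
];
const finalState = simulated.reduce(applyMessage, initialState);
console.log(finalState.reply, finalState.status);
```

On the main thread, a handler along the lines of `worker.onmessage = (e) => { state = applyMessage(state, e.data); }` would feed real messages into the same reducer. Keeping the reducer pure means the heavy work stays in the worker while the UI thread only appends text.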

🛠 Tech Stack

| Layer | Technology |
|---|---|
| Framework | React 19 + TypeScript |
| Build | Vite |
| Styling | Tailwind CSS v4 |
| LLM Runtime | @mlc-ai/web-llm via WebGPU |
| Routing | React Router v7 |
| Animations | Framer Motion |
| Icons | Lucide React |
| Markdown | react-markdown + remark-gfm |
| PWA | Service Worker + Web App Manifest |
| Analytics | Vercel Analytics |

🌐 Browser Support

| Browser | Minimum Version | Status |
|---|---|---|
| Chrome | 113+ | ✅ Supported |
| Edge | 113+ | ✅ Supported |
| Safari | 18.2+ | ✅ Supported |
| Firefox | n/a | ❌ No WebGPU yet |

Hardware: Small models (~0.5B) work with 2 GB VRAM. Larger models (7B+) need a dedicated GPU with 6–8 GB+ VRAM.
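
As a back-of-the-envelope check on these numbers, the sketch below estimates VRAM needs from parameter count and quantization width. The 4-bit default, the 1.2 overhead factor, and the 1 GB runtime reserve are rough assumptions chosen for illustration, not measured values from this project; real usage also depends on context length and the specific model build.

```typescript
// Assumed fudge factors, not measurements.
const OVERHEAD_FACTOR = 1.2;   // extra room for KV cache and activations
const RUNTIME_RESERVE_GB = 1;  // flat reserve for the runtime and browser

// Weights take roughly (parameters x bits per weight / 8) bytes.
function estimateVramGb(paramsBillions: number, bitsPerWeight = 4): number {
  const weightsGb = (paramsBillions * bitsPerWeight) / 8;
  return weightsGb * OVERHEAD_FACTOR + RUNTIME_RESERVE_GB;
}

// True when the estimate fits within the available VRAM.
function fitsInVram(paramsBillions: number, vramGb: number, bitsPerWeight = 4): boolean {
  return estimateVramGb(paramsBillions, bitsPerWeight) <= vramGb;
}
```

Under these assumptions a 0.5B model fits comfortably in 2 GB, while a 7B model lands a little above 5 GB, which is in line with the 6–8 GB guidance above.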


🚀 Getting Started

Prerequisites

  • Node.js 18+
  • A WebGPU-compatible browser

Quick Start

# Clone the repo
git clone https://github.com/GautamVhavle/BrowserLLM.git
cd BrowserLLM

# Install dependencies
npm install

# Start dev server
npm run dev

Open http://localhost:5173 and you're running.

Production Build

npm run build     # TypeScript check + Vite production build
npm run preview   # Preview the built app locally

All Scripts

| Command | What it does |
|---|---|
| npm run dev | Dev server with hot reload |
| npm run build | Type-check → production build |
| npm run preview | Serve production build locally |
| npm run lint | Run ESLint |

📁 Project Structure

BrowserAI/
├── public/
│   ├── favicon.svg              # App icon
│   ├── manifest.json            # PWA manifest
│   ├── sw.js                    # Service worker
│   ├── robots.txt               # Search engine crawl rules
│   └── sitemap.xml              # XML sitemap
│
├── src/
│   ├── main.tsx                 # Entry: React root + SW registration
│   ├── App.tsx                  # Route definitions
│   ├── index.css                # Global styles + Tailwind
│   │
│   ├── types/index.ts           # Shared interfaces (Message, ChatSession, etc.)
│   │
│   ├── hooks/
│   │   ├── useWebLLM.ts         # Core: model loading, inference, stats
│   │   ├── useChatManager.ts    # Chat state: wraps useWebLLM + localStorage
│   │   ├── useHardwareDetect.ts # GPU/VRAM/WebGPU detection
│   │   ├── useModelCache.ts     # Cache API introspection
│   │   └── useOnlineStatus.ts   # Online/offline detection
│   │
│   ├── lib/
│   │   ├── modelCatalog.ts      # 100+ model definitions with metadata
│   │   ├── models.ts            # Public API re-exports
│   │   ├── storage.ts           # localStorage CRUD
│   │   ├── constants.ts         # Landing page content data
│   │   └── animations.ts        # Framer Motion variants
│   │
│   ├── workers/
│   │   └── engine.worker.ts     # Web Worker: MLC engine off main thread
│   │
│   ├── components/
│   │   ├── chat/                # Chat UI (layout, messages, input, stats, sidebar)
│   │   ├── landing/             # Landing page sections (hero, features, FAQ, etc.)
│   │   └── ui/                  # Shared UI (star field, loading bar, indicators)
│   │
│   └── pages/
│       └── ModelsPage.tsx       # Full model catalog with filters + hardware compat
│
├── index.html                   # HTML shell with SEO meta, structured data
├── vite.config.ts               # Vite + Tailwind plugin config
├── tsconfig.json                # TypeScript config
├── package.json
└── eslint.config.js

🤝 Contributing

BrowserLLM is open source and contributions are welcome!

Whether it's a bug fix, new feature, documentation improvement, or just a typo, every contribution helps.

How to contribute

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Commit your changes: git commit -m "feat: add amazing feature"
  4. Push to the branch: git push origin feature/amazing-feature
  5. Open a Pull Request

Code conventions

  • TypeScript strict mode
  • Functional React components with hooks
  • Tailwind CSS for all styling (no CSS modules)
  • lucide-react for icons
  • Barrel exports via index.ts files

Ideas for contributions

  • 🌍 Internationalization (i18n)
  • 📱 Mobile UX improvements
  • 🎨 Theme customization
  • 📊 Advanced model benchmarking
  • 🧪 Test coverage
  • 📝 Documentation

📄 License

This project is open source under the MIT License.

Free to use, modify, and distribute.


Built with ❤️ by Gautam Vhavle
