Streaming text live at the cursor using local Whisper.
Press a hotkey. Speak. Text appears at your cursor — live, as you talk — in any app on your machine. Runs entirely on the Apple Neural Engine using open-source Whisper. Zero cost per transcription, forever.
I type fast. I think faster. I constantly lose thoughts mid-sentence because my hands can't keep up with my brain. Every voice-to-text tool I tried either:
- Sent my audio to a cloud server I don't control
- Charged per minute or per character
- Only worked inside one specific app
- Required me to stop what I was doing and switch contexts
So I built MindScript. One hotkey. Fully local Whisper inference on the Apple Neural Engine. Live text streaming directly at your cursor, in any app, in any supported language. After the one-time model download (~75 MB), it costs nothing to run.
I open-sourced it because this should exist for everyone, for free, without a per-minute tax to a cloud provider.
No Xcode required — just the Xcode Command Line Tools and Swift.
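Before cloning, it can help to confirm that the prerequisites are actually on your PATH. The following is a minimal sketch, not part of MindScript; `have_tool` is a hypothetical helper name, and the tool list is an assumption based on the commands this README uses.

```shell
#!/bin/sh
# have_tool is a hypothetical helper (not part of MindScript):
# it succeeds if the named command is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report each tool the clone and install steps rely on.
for tool in git swift xcode-select; do
    if have_tool "$tool"; then
        echo "ok: $tool"
    else
        echo "missing: $tool"
    fi
done

# xcode-select -p prints the active developer directory and fails if none is set.
if have_tool xcode-select && ! xcode-select -p >/dev/null 2>&1; then
    echo "Command Line Tools not found; install them with: xcode-select --install"
fi
```

If any line reports `missing`, install that tool first and run the check again.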
Open Terminal and run these commands:

    git clone https://github.com/qasimtalkin/mindscript
    cd mindscript/MindScript

You're now in the mindscript/MindScript directory. All remaining commands run from here.

    bash install.sh

This script compiles MindScript, wraps it into a proper .app bundle with all required frameworks, signs it, and installs it to /Applications. This is necessary for macOS to grant Microphone and Accessibility permissions.
Once installed, launch it:

    open /Applications/MindScript.app

A mic icon appears in your menu bar (top-right of screen). Click it to open the control panel.
On first launch, MindScript automatically downloads the Whisper model (~75 MB) in the background. A progress bar appears in the menu bar popover. No manual step needed — just wait a moment and it's ready.
MindScript needs two one-time permissions:
- Microphone — to record your voice
- Accessibility — to type transcriptions into other apps
If permission dialogs don't appear automatically:
- Go to System Settings → Privacy & Security → Accessibility
- Find MindScript and toggle it on
- Restart the app
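If the prompts still do not appear, the same steps can be driven from Terminal. This is a sketch under assumptions: `accessibility_pane_url` and `reset_permissions` are hypothetical helper names, the bundle identifier is read from the installed app rather than guessed, and the deep link is the commonly used System Settings URL for the Accessibility pane. `tccutil reset` clears the stored decision for that service, so macOS should ask again the next time MindScript requests it.

```shell
#!/bin/sh
APP="/Applications/MindScript.app"   # install location used in this README

# Deep link to System Settings > Privacy & Security > Accessibility.
accessibility_pane_url() {
    echo "x-apple.systempreferences:com.apple.preference.security?Privacy_Accessibility"
}

# reset_permissions is a hypothetical helper: it clears MindScript's stored
# Accessibility and Microphone decisions. It is defined here but not called.
reset_permissions() {
    bundle_id=$(/usr/libexec/PlistBuddy -c 'Print :CFBundleIdentifier' "$APP/Contents/Info.plist") || return 1
    tccutil reset Accessibility "$bundle_id"
    tccutil reset Microphone "$bundle_id"
}

# Only act on a Mac that actually has the app installed.
if [ "$(uname)" = "Darwin" ] && [ -d "$APP" ]; then
    open "$(accessibility_pane_url)"
fi
```

Run `reset_permissions` only if toggling the switch in System Settings did not help, then restart the app.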
- Open any app (Slack, Notion, Mail, Google Docs, anything).
- Click where you want text to appear.
- Press `⌥ 0` (Option + 0) to start recording. A recording pill appears on screen.
- Speak naturally. Text streams live as you talk.
- Press `⌥ Space` to pause or resume at any time.
- Press `⌥ Esc` when done. Text is injected at your cursor.
MindScript uses WhisperKit to run OpenAI's Whisper models locally. By default, it is optimized for speed and low memory usage.
| Model | Size | Best For | Requirement |
|---|---|---|---|
| Tiny | ~75MB | Fast dictation, low RAM | 2GB+ RAM (Default) |
| Base | ~145MB | General purpose accuracy | 4GB+ RAM |
| Small | ~450MB | Professional transcription | 8GB+ RAM |
| Large-v3 | ~1.5GB | Near-perfect multilingual | 16GB+ RAM (M2+) |
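The table can be read as a simple lookup from installed RAM to model. The sketch below is illustrative only: `recommend_model` is a hypothetical helper, its thresholds are copied from the Requirement column, and it ignores the M2+ note on Large-v3.

```shell
#!/bin/sh
# recommend_model is a hypothetical helper: it maps installed RAM (in GB)
# to the largest model whose RAM requirement in the table is met.
recommend_model() {
    ram_gb=$1
    if   [ "$ram_gb" -ge 16 ]; then echo "Large-v3"
    elif [ "$ram_gb" -ge 8 ];  then echo "Small"
    elif [ "$ram_gb" -ge 4 ];  then echo "Base"
    else                            echo "Tiny"
    fi
}

# On macOS, hw.memsize reports physical memory in bytes.
# On other systems the lookup fails and nothing is printed.
if bytes=$(sysctl -n hw.memsize 2>/dev/null) && [ -n "$bytes" ]; then
    recommend_model $(( bytes / 1024 / 1024 / 1024 ))
fi
```

For example, `recommend_model 8` prints `Small`. Treat the result as an upper bound; a smaller model is always a valid choice when speed matters more than accuracy.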
Click the mic icon in the menu bar and use the Model picker — no rebuild required:
- Whisper Tiny (~75 MB) — fastest, lowest RAM, great for dictation
- Whisper Base (~145 MB) — better accuracy, still lightweight
The selected model is saved and reloaded automatically on next launch. Switching triggers an automatic download if the new model isn't cached yet.
Supports 16 languages out of the box, plus Auto-detect (the default), where Whisper identifies the language from your speech.
Auto · English · Spanish · French · German · Italian · Portuguese · Russian · Chinese · Japanese · Korean · Arabic · Hindi · Urdu · Turkish · Dutch · Polish
Pin a specific language in Settings for faster, more accurate results.
MindScript can automatically summarise your voice notes into concise bullet points immediately after transcription.
- Click the mic icon in your menu bar.
- Toggle Auto-summarize to ON.
- To change models or providers, go to Settings (via the menu bar icon):
  - Provider: Choose between Ollama (Local), OpenAI, or Anthropic (Claude).
  - Model: Default is `glm-4.7-flash:latest` for Ollama.
  - API Key: Required only for OpenAI and Anthropic.
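For the Ollama provider, a quick way to see whether the default model is already available locally is to scan the output of `ollama list`. This is a sketch, not part of MindScript: `model_listed` is a hypothetical helper, and it assumes the model name is the first column of each row after the header line.

```shell
#!/bin/sh
MODEL="glm-4.7-flash:latest"   # default Ollama model named in this README

# model_listed is a hypothetical helper: it reads `ollama list` output on stdin
# and succeeds only if some row after the header has the given name in column 1.
model_listed() {
    awk -v m="$1" 'NR > 1 && $1 == m { found = 1 } END { exit !found }'
}

# Skip the check entirely when the ollama CLI is not installed.
if command -v ollama >/dev/null 2>&1; then
    if ollama list | model_listed "$MODEL"; then
        echo "$MODEL is already pulled"
    else
        echo "$MODEL not found; run: ollama pull $MODEL"
    fi
fi
```

The exact string comparison is deliberate, so a different tag of the same model does not count as a match.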
Tip (Ollama users): Make sure you have the model pulled locally before use:

    ollama pull glm-4.7-flash:latest

| Item | Details |
|---|---|
| Whisper tiny model | ~75 MB, auto-downloaded on first launch |
| Whisper base model | ~145 MB, downloaded when selected in the menu |
| Everything else | Nothing: zero telemetry, zero analytics |
| Where it's stored | MindScript/Models/ (repo-local, gitignored) |
Your audio never leaves your machine.
| Component | Technology |
|---|---|
| Runtime | Swift + SwiftUI, macOS 13+ |
| Speech-to-text | WhisperKit (CoreML + Apple Neural Engine) |
| Global hotkey | HotKey |
| Audio capture | AVFoundation, 16 kHz mono |
| Text injection | CGEvent Cmd+V + AppleScript fallback |
| Build | Swift Package Manager (no Xcode project needed) |
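The text-injection row describes a paste-based approach: put the text on the clipboard, then send Cmd+V to the frontmost app. The shell sketch below illustrates the AppleScript variant of that idea only. It is not MindScript's code, `paste_at_cursor` is a hypothetical helper, and it relies on the macOS tools `pbcopy` and `osascript`.

```shell
#!/bin/sh
# paste_at_cursor is a hypothetical helper, not MindScript's implementation.
# It places the text on the clipboard, then asks System Events to press Cmd+V
# in whichever app is frontmost.
paste_at_cursor() {
    command -v pbcopy >/dev/null 2>&1 || {
        echo "pbcopy not found (macOS only)" >&2
        return 1
    }
    printf '%s' "$1" | pbcopy
    osascript -e 'tell application "System Events" to keystroke "v" using command down'
}
```

Calling `paste_at_cursor "hello"` from a terminal pastes into the terminal itself, since it is the frontmost app. The calling process needs Accessibility permission for the keystroke to be delivered, and the previous clipboard contents are overwritten.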
- macOS 14.0+
- Apple Silicon (Intel works but runs slower — Neural Engine is ARM-only)
- Microphone access
- Accessibility access (for text injection — one-time grant, survives rebuilds)
Contributions make the open-source community an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the Branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
MIT — Use it, fork it, ship it, sell it. Credit appreciated but not required.
- WhisperKit by Argmax — the CoreML/ANE wrapper that makes on-device Whisper fast on Apple Silicon
- OpenAI Whisper — the underlying model weights
- HotKey by Sam Soffes — the cleanest global hotkey library for macOS
- Sparkle — the standard for macOS auto-updates