WhisperShortcut is a macOS menu bar app for voice-first productivity: dictation, voice editing, AI chat, text-to-speech, and live meeting transcription.
The app is local-first and bring-your-own-key. There is no backend or subscription service in this repository. Cloud features use your Google Gemini API key and, optionally, your xAI API key for Grok chat models.
- Download the latest signed
.dmgfrom GitHub Releases. - The app is also available on the Mac App Store.
- Demo video: Watch on YouTube.
- Dictate: Record speech and copy the transcription to your clipboard. Use Gemini models online or local Whisper models offline.
- Dictate Prompt: Speak an instruction that edits the current clipboard text, for example "make this shorter" or "translate this to English".
- Read Aloud: Read clipboard text aloud with Gemini TTS voices.
- Prompt & Read: Edit text with a spoken instruction, then read the result aloud automatically.
- Chat: Use a persisted multi-session chat window with Gemini or Grok models, screenshots, image attachments, slash commands, and optional web grounding.
- Google integrations: Connect a Google account so chat can work with Calendar, Tasks, and Gmail through controlled local tools.
- Live Meeting: Record meetings in chunks, transcribe them as they complete, keep an in-app transcript, and save meeting files locally.
- Smart Improvement: Let the app improve system prompts and user context from usage logs or a spoken instruction.
- macOS 15.5 or later
- Xcode 16.0 or later for development
- Google Gemini API key for cloud transcription, prompt workflows, chat, TTS, Smart Improvement, and live meetings
- Optional xAI API key for Grok chat models
- Optional Google account connection for Calendar, Tasks, and Gmail tools
Offline Whisper dictation works without an API key after downloading a local model in Settings.
- Download the latest
.dmgfrom Releases. - Open the DMG and drag
WhisperShortcut.appinto/Applications. - Start WhisperShortcut from Applications.
- Open Settings from the menu bar app and add your Google Gemini API key if you want cloud features.
- Grant microphone and accessibility permissions when macOS asks.
- Configure a Dictate shortcut in Settings.
- Choose a Gemini or Whisper transcription model.
- Press the shortcut, speak, then stop recording.
- The transcription is copied to the clipboard, and can optionally be pasted automatically.
Long recordings are split into chunks and processed in parallel.
- Copy text you want to edit.
- Press the Dictate Prompt shortcut.
- Speak an instruction, such as "turn this into bullet points".
- The edited result is copied to the clipboard.
Open the chat window from the menu bar or its configured shortcut. Chat sessions are stored locally and can use slash commands such as /new, /model, /screenshot, /settings, /grok, /gemini, /connect-google, /disconnect-google, and /meeting.
Gemini models use your Google API key. Grok models use your xAI API key.
Use /meeting in chat or the Meeting shortcut to start and stop live meeting recording. Audio is rotated into chunks, transcribed, and appended to the meeting transcript. Saved transcripts live in the app's Application Support folder.
git clone https://github.com/mgsgde/whisper-shortcut.git
cd whisper-shortcut
bash install.shFor development, build and restart the app with:
bash scripts/rebuild-and-restart.shUseful scripts:
scripts/rebuild-and-restart.sh: Build Debug and restart the local app.scripts/logs.sh: Stream or filter app logs.scripts/create-release.sh: Create a tagged release.scripts/test-gemini-models.sh: Check configured Gemini model availability.
WhisperShortcut/: Swift source for the macOS app.WhisperShortcut.xcodeproj/: Xcode project and shared schemes.scripts/: Local development and release helper scripts..github/workflows/release.yml: GitHub Actions workflow for signed, notarized release builds.plans/: Shared implementation plans and specs.
Core files:
AppState.swift: Central app state machine.MenuBarController.swift: Main app orchestrator.SpeechService.swift: Dictation and prompt workflow logic.ChatView.swift: Chat UI and view model.ChatTools.swift: Local tool registry for chat integrations.TranscriptionModels.swift: Gemini and Whisper transcription model definitions.Settings/: Settings UI, defaults, and persistence.
WhisperShortcut stores settings, chat sessions, meeting transcripts, usage logs, and downloaded models on your Mac. API keys and OAuth refresh tokens are stored in Keychain.
WhisperShortcut uses one canonical app data location so sandboxed and non-sandboxed builds see the same files:
~/Library/Containers/com.magnusgoedde.whispershortcut/Data/Library/Application Support/WhisperShortcut/
See docs/data-directories.md for details.
MIT License. See LICENSE for details.