Hold a key, speak Hebrew, release - clean English lands wherever your cursor is. No UI interaction. No clipboard round-trip. No window switching.
Built for native Hebrew speakers who think in Hebrew and need to type in English all day. The OS voice-typing tools don't handle Hebrew well, and flipping keyboard layouts kills flow. This fixes that.
Important
macOS only, Apple Silicon (arm64) required. The HID injection layer uses CGEventTap, which is macOS-specific, and the Rust native module compiles for arm64 only.
hotkey down → audio capture (16 kHz mono, Web Audio API)
hotkey up → whisper-rs transcribes Hebrew locally (ivrit-ai model)
→ Ollama translates Hebrew → English (local, no cloud)
→ filler words stripped (אממ, כאילו, etc.)
→ dictionary rewrites applied
→ text injected via CGEvent HID tap into focused app
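The two text clean-up steps in the pipeline above can be sketched in a few lines. This is a minimal illustration, not the code in electron/services/ollama.ts: the function names, the English filler entries, and the example rule are all assumptions. Only אממ and כאילו come from the list above.

```typescript
// Hypothetical filler list: the two Hebrew entries are from the README,
// the English ones are assumptions added for illustration.
const FILLERS = ["אממ", "כאילו", "um", "uh"];

// Escape regex metacharacters so terms are matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Match a term only when it is not embedded in a longer word.
// \b is ASCII-only in JavaScript, so Unicode letter/number lookarounds
// are used instead; this is what makes Hebrew terms match correctly.
function wholeWord(term: string): RegExp {
  return new RegExp(
    `(?<![\\p{L}\\p{N}])${escapeRegExp(term)}(?![\\p{L}\\p{N}])`,
    "giu",
  );
}

function stripFillers(text: string, fillers: string[] = FILLERS): string {
  let out = text;
  for (const f of fillers) out = out.replace(wholeWord(f), "");
  return out
    .replace(/\s+([,.!?])/g, "$1") // no space before punctuation
    .replace(/\s{2,}/g, " ")       // collapse gaps left by removed words
    .replace(/^[\s,]+/, "")        // drop a leading comma left by a removed filler
    .trim();
}

// rules maps a misheard term to its replacement (case-insensitive match).
function applyDictionary(text: string, rules: Record<string, string>): string {
  let out = text;
  for (const [from, to] of Object.entries(rules)) {
    // A replacer function keeps "$" in the replacement literal.
    out = out.replace(wholeWord(from), () => to);
  }
  return out;
}
```

For example, `applyDictionary(stripFillers(text), rules)` would run both steps in the order the pipeline lists them.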
A 340×56 floating pill shows state - idle / recording / transcribing / translating / result / error - with a real-time equalizer during recording. Tray icon color mirrors the state.
Works in every app, including ones that block synthetic input, because injection goes through the HID layer.
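The pill states listed above form a small state machine. The sketch below shows one way to model it. The state names come from this README, but the event names and the exact transitions are assumptions, and the real logic lives in renderer/src/App.tsx.

```typescript
type PillState =
  | "idle" | "recording" | "transcribing" | "translating" | "result" | "error";

// Hypothetical event names, chosen for illustration only.
type PillEvent =
  | "hotkeyDown" | "hotkeyUp" | "transcribed" | "translated" | "failed" | "dismissed";

// Allowed transitions per state; anything not listed is ignored.
const TRANSITIONS: Record<PillState, Partial<Record<PillEvent, PillState>>> = {
  idle: { hotkeyDown: "recording" },
  recording: { hotkeyUp: "transcribing", failed: "error" },
  transcribing: { transcribed: "translating", failed: "error" },
  translating: { translated: "result", failed: "error" },
  result: { dismissed: "idle", hotkeyDown: "recording" },
  error: { dismissed: "idle", hotkeyDown: "recording" },
};

function next(state: PillState, event: PillEvent): PillState {
  // An event that is not valid in the current state leaves it unchanged.
  return TRANSITIONS[state][event] ?? state;
}
```

Keeping the transitions in one table means the pill and the tray icon can both derive their appearance from the same `PillState` value.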
| Layer | Technology |
|---|---|
| Shell | Electron 41, TypeScript |
| UI | React 19 + Vite |
| Hotkey | napi-rs (Rust), CGEventTap at session level |
| Paste | napi-rs + CGEvent.set_string() + post(HID) |
| STT | whisper-rs + ivrit-ai/whisper-large-v3-turbo-ggml |
| Translation | Ollama - local LLM, no cloud |
| Packaging | electron-builder, arm64 DMG |
- macOS 11+, Apple Silicon (arm64)
- Ollama running on localhost:11434
- A translation model pulled - default: gemma4:26b-a4b-it-q8_0 (configurable in settings)
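To confirm the Ollama requirements before launching, a check along these lines can help. It is a sketch that queries Ollama's GET /api/tags endpoint, which lists locally pulled models. The function names and the three-way result are assumptions and not part of this project.

```typescript
// Relevant part of the GET /api/tags response.
interface TagsResponse {
  models: { name: string }[];
}

type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

function hasModel(tags: TagsResponse, model: string): boolean {
  return tags.models.some((m) => m.name === model);
}

// fetchFn is passed in so the check can be exercised without a running server.
async function checkOllama(
  model: string,
  fetchFn: FetchLike,
  baseUrl = "http://localhost:11434",
): Promise<"ok" | "unreachable" | "model-missing"> {
  try {
    const res = await fetchFn(`${baseUrl}/api/tags`);
    if (!res.ok) return "unreachable";
    const tags = (await res.json()) as TagsResponse;
    return hasModel(tags, model) ? "ok" : "model-missing";
  } catch {
    // Connection refused, DNS failure, invalid JSON, and so on.
    return "unreachable";
  }
}
```

In Node 18+ or Electron, passing the global `fetch` as `fetchFn` should work: `await checkOllama("gemma4:26b-a4b-it-q8_0", fetch)`.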
Note
On first run, the Whisper model (~3 GB) is downloaded automatically via huggingface_hub. Set WHISPER_MODEL_PATH to point to a local copy and skip the download.
Permissions needed on first run:
- Microphone
- Accessibility (for the hotkey tap and HID paste)
# Install dependencies
bun install
# Build the Rust native module (first time only - takes ~1 min)
bun run build:native
# Start in dev mode (hot reload)
bun run dev

bun run build # native + renderer + main
bun run dist # → release/orellius-stt-0.1.0-arm64.dmg

bun run typecheck

All config lives in ~/.config/orellius/:
| File | Contents |
|---|---|
| settings.json | hotkey key, autoPaste toggle, languageMode (he-en / en-he) |
| history.json | past transcriptions + translations |
| dictionary.json | custom term rewrites (managed from Settings UI) |
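As an illustration of what settings.json might hold, the sketch below types the three fields named in the table and falls back to defaults for missing or invalid values. The field names come from the table. The hotkey value spellings, the hotkey and autoPaste defaults, and the function name are assumptions; only the he-en default is documented.

```typescript
type Hotkey = "right-cmd" | "ctrl" | "F5" | "F6"; // spellings are assumed
type LanguageMode = "he-en" | "en-he";

interface Settings {
  hotkey: Hotkey;
  autoPaste: boolean;
  languageMode: LanguageMode;
}

const HOTKEYS: readonly Hotkey[] = ["right-cmd", "ctrl", "F5", "F6"];
const MODES: readonly LanguageMode[] = ["he-en", "en-he"];

// languageMode "he-en" is the documented default; the other two are guesses.
const DEFAULTS: Settings = { hotkey: "right-cmd", autoPaste: true, languageMode: "he-en" };

// Return value if it is one of the allowed strings, otherwise the fallback.
function pick<T extends string>(value: unknown, allowed: readonly T[], fallback: T): T {
  return typeof value === "string" && (allowed as readonly string[]).includes(value)
    ? (value as T)
    : fallback;
}

function parseSettings(raw: string): Settings {
  let data: Record<string, unknown> = {};
  try {
    const parsed: unknown = JSON.parse(raw);
    if (parsed !== null && typeof parsed === "object") {
      data = parsed as Record<string, unknown>;
    }
  } catch {
    // Malformed JSON: keep the empty object so every field uses its default.
  }
  const autoPaste = data["autoPaste"];
  return {
    hotkey: pick(data["hotkey"], HOTKEYS, DEFAULTS.hotkey),
    autoPaste: typeof autoPaste === "boolean" ? autoPaste : DEFAULTS.autoPaste,
    languageMode: pick(data["languageMode"], MODES, DEFAULTS.languageMode),
  };
}
```

Validating on load means a hand-edited or truncated file degrades to defaults rather than breaking startup.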
Tip
To use a lighter model for faster translation at the cost of quality, set OLLAMA_MODEL=qwen2.5:3b before launching. Translation latency drops to ~0.5 s on a 3B model vs ~2–3 s on the default 26B model.
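The model override described in the tip above could be wired into the translation request roughly as follows. This is a sketch under assumptions: it uses Ollama's POST /api/generate endpoint with streaming disabled, the helper names are hypothetical, and it ignores any model chosen in the Settings UI because the precedence between the two is not documented here.

```typescript
const DEFAULT_MODEL = "gemma4:26b-a4b-it-q8_0";

// Prefer OLLAMA_MODEL when it is set and non-empty, else the default.
function resolveModel(env: Record<string, string | undefined>): string {
  const override = env["OLLAMA_MODEL"]?.trim();
  return override ? override : DEFAULT_MODEL;
}

// Build the URL and fetch options for a single non-streaming completion.
function buildGenerateRequest(
  model: string,
  prompt: string,
  baseUrl = "http://localhost:11434",
) {
  return {
    url: `${baseUrl}/api/generate`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // stream: false makes Ollama return one JSON object with a "response" field.
      body: JSON.stringify({ model, prompt, stream: false }),
    },
  };
}
```

A caller would then do something like `const { url, init } = buildGenerateRequest(resolveModel(process.env), prompt)` followed by `fetch(url, init)`, and read the `response` field of the returned JSON.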
The default model path follows the standard huggingface_hub cache layout:
~/.cache/huggingface/hub/models--ivrit-ai--whisper-large-v3-turbo-ggml/
To use a different model, update the path in Settings or set WHISPER_MODEL_PATH before launching.
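The lookup order for the Whisper model location might look like the sketch below. The order shown (environment variable, then Settings, then the default cache directory) is an assumption, as are the function and parameter names. Only the WHISPER_MODEL_PATH variable and the cache directory come from this README.

```typescript
// Default location relative to the home directory, as documented above.
const DEFAULT_CACHE_DIR =
  ".cache/huggingface/hub/models--ivrit-ai--whisper-large-v3-turbo-ggml/";

function resolveWhisperModelPath(
  env: Record<string, string | undefined>,
  homeDir: string,
  settingsPath?: string, // hypothetical value saved from the Settings UI
): string {
  const fromEnv = env["WHISPER_MODEL_PATH"]?.trim();
  if (fromEnv) return fromEnv;            // assumed to win over Settings
  const fromSettings = settingsPath?.trim();
  if (fromSettings) return fromSettings;
  // Strip trailing slashes from homeDir so the joined path has exactly one.
  return `${homeDir.replace(/\/+$/, "")}/${DEFAULT_CACHE_DIR}`;
}
```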
| Mode | Direction |
|---|---|
| he-en (default) | Hebrew speech → English text |
| en-he | English speech → Hebrew text |
Toggle in the Settings page inside the app.
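One way to see how the mode could drive both pipeline stages is a lookup table like the one below. This is a sketch only: the prompt wording and all names are hypothetical, and the actual prompt used by the app is not documented here. The `he` and `en` codes are the standard language codes Whisper accepts.

```typescript
type LanguageMode = "he-en" | "en-he";

interface Direction {
  sttLanguage: "he" | "en"; // language hint for speech recognition
  source: string;
  target: string;
}

const DIRECTIONS: Record<LanguageMode, Direction> = {
  "he-en": { sttLanguage: "he", source: "Hebrew", target: "English" },
  "en-he": { sttLanguage: "en", source: "English", target: "Hebrew" },
};

// Hypothetical prompt template for the translation model.
function buildTranslationPrompt(mode: LanguageMode, text: string): string {
  const { source, target } = DIRECTIONS[mode];
  return `Translate the following ${source} text to ${target}. ` +
    `Reply with the translation only.\n\n${text}`;
}
```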
orellius-stt/
├── electron/
│ ├── main.ts Tray, windows, IPC, native bindings
│ └── services/
│ ├── ollama.ts Translation HTTP + filler stripping + dictionary
│ ├── settings.ts Settings loader/saver
│ ├── history.ts Transcription history persistence
│ ├── dictionary.ts Custom rewrite rules
│ └── logs.ts In-app log capture
├── native-core/ Rust napi-rs module
│ └── src/
│ ├── lib.rs whisper-rs wrapper
│ ├── hotkey.rs CGEventTap (right-cmd / ctrl / F5 / F6)
│ └── paste.rs HID injection via CGEvent
└── renderer/
└── src/
├── App.tsx State machine (idle→recording→…→result)
├── audio.ts 16 kHz mono resample
└── pages/ Settings, History, Dictionary, Overview
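The 16 kHz mono step handled by audio.ts in the tree above comes down to two operations: averaging channels and changing the sample rate. The sketch below shows the arithmetic with plain typed arrays and linear interpolation. It is not the project's implementation, which uses the Web Audio API, and it applies no low-pass filter, so some aliasing is expected when downsampling.

```typescript
const TARGET_RATE = 16000;

// Average all channels into a single mono buffer.
function downmixToMono(channels: Float32Array[]): Float32Array {
  const length = channels[0]?.length ?? 0;
  const mono = new Float32Array(length);
  for (const ch of channels) {
    for (let i = 0; i < length; i++) mono[i] += ch[i] / channels.length;
  }
  return mono;
}

// Resample by linear interpolation between neighbouring input samples.
function resampleLinear(
  input: Float32Array,
  fromRate: number,
  toRate: number = TARGET_RATE,
): Float32Array {
  if (fromRate === toRate) return input.slice();
  const outLength = Math.floor((input.length * toRate) / fromRate);
  const out = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples advanced per output sample
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const i0 = Math.min(Math.floor(pos), input.length - 1);
    const i1 = Math.min(i0 + 1, input.length - 1);
    const frac = pos - i0;
    out[i] = input[i0] * (1 - frac) + input[i1] * frac;
  }
  return out;
}
```

One second of 48 kHz stereo passed through `resampleLinear(downmixToMono([left, right]), 48000)` yields 16,000 samples, the rate the pipeline records at.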
Tested on Mac Studio M4 Max 64GB:
| Step | Latency |
|---|---|
| End-to-end (short phrase) | ~2–5 s |
| Model cold start | ~1–2 s (cached after first use) |
| Translation (3B–7B model) | ~0.5–1 s |
| Translation (26B model) | ~2–3 s |
Issues and PRs welcome. The Rust native module (native-core/) is the most interesting part: CGEventTap + HID injection is not well documented, and any improvements there are valuable.
MIT