orellius-stt

Hold a key, speak Hebrew, release - clean English lands wherever your cursor is. No UI interaction. No clipboard round-trip. No window switching.

Built for native Hebrew speakers who think in Hebrew and need to type in English all day. The OS voice-typing tools don't handle Hebrew well, and flipping keyboard layouts kills flow. This fixes that.

Important

macOS only, Apple Silicon (arm64) required. The hotkey listener uses CGEventTap and text injection posts CGEvents at the HID level, both of which are macOS-specific; the Rust native module compiles for arm64 only.

How it works

hotkey down  →  audio capture (16 kHz mono, Web Audio API)
hotkey up    →  whisper-rs transcribes Hebrew locally (ivrit-ai model)
             →  Ollama translates Hebrew → English (local, no cloud)
             →  filler words stripped (אממ, כאילו, etc.)
             →  dictionary rewrites applied
             →  text injected via CGEvent HID tap into focused app
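
The two text clean-up steps in the pipeline above (filler stripping and dictionary rewrites) can be sketched as pure functions. This is a minimal illustration, not the app's actual code: the `FILLERS` list, the `RewriteRule` shape, and the function names are all assumptions. Tokens are split on whitespace instead of using `\b`, because JavaScript's word boundary is ASCII-based and does not match around Hebrew letters.

```typescript
// Hypothetical filler list; the app's real list may differ.
const FILLERS = new Set(["אממ", "כאילו", "um", "uh"]);

// Hypothetical rule shape for a dictionary entry.
interface RewriteRule {
  from: string; // term as it appears in the text
  to: string; // preferred replacement
}

// Drop tokens that are fillers once trailing punctuation is ignored.
function stripFillers(text: string, fillers: Set<string> = FILLERS): string {
  return text
    .split(/\s+/)
    .filter((tok) => tok.length > 0)
    .filter((tok) => !fillers.has(tok.replace(/[.,!?;:]/g, "").toLowerCase()))
    .join(" ");
}

// Apply each rule as a case-insensitive, literal (non-regex) replacement.
function applyDictionary(text: string, rules: RewriteRule[]): string {
  return rules.reduce((acc, rule) => {
    const escaped = rule.from.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    return acc.replace(new RegExp(escaped, "gi"), rule.to);
  }, text);
}

// Run both steps in the order listed in the pipeline.
function postProcess(text: string, rules: RewriteRule[]): string {
  return applyDictionary(stripFillers(text), rules);
}
```

Keeping both steps free of I/O makes them easy to unit-test separately from the translation call.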

A 340×56 floating pill shows state - idle / recording / transcribing / translating / result / error - with a real-time equalizer during recording. Tray icon color mirrors the state.
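
The pill states listed above lend themselves to a small transition table. The sketch below is one plausible way to model them; the state names come from this README, but the event names and the exact transitions are assumptions rather than the contents of `App.tsx`.

```typescript
type PillState =
  | "idle"
  | "recording"
  | "transcribing"
  | "translating"
  | "result"
  | "error";

// Hypothetical event names; the app's real events may differ.
type PillEvent =
  | "hotkeyDown"
  | "hotkeyUp"
  | "transcribed"
  | "translated"
  | "failed"
  | "dismissed";

// Allowed transitions per state; anything not listed is ignored.
const TRANSITIONS: Record<PillState, Partial<Record<PillEvent, PillState>>> = {
  idle: { hotkeyDown: "recording" },
  recording: { hotkeyUp: "transcribing", failed: "error" },
  transcribing: { transcribed: "translating", failed: "error" },
  translating: { translated: "result", failed: "error" },
  result: { dismissed: "idle", hotkeyDown: "recording" },
  error: { dismissed: "idle", hotkeyDown: "recording" },
};

// Return the next state, or stay in the current one when the event does not apply.
function nextState(state: PillState, event: PillEvent): PillState {
  return TRANSITIONS[state][event] ?? state;
}
```

A table like this keeps the pill and the tray icon in sync, since both can render from the same `PillState` value.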

Works in every app, including ones that block synthetic input, because injection goes through the HID layer.

Stack

| Layer | Technology |
| --- | --- |
| Shell | Electron 41, TypeScript |
| UI | React 19 + Vite |
| Hotkey | napi-rs (Rust), `CGEventTap` at session level |
| Paste | napi-rs + `CGEvent.set_string()` + `post(HID)` |
| STT | whisper-rs + ivrit-ai/whisper-large-v3-turbo-ggml |
| Translation | Ollama - local LLM, no cloud |
| Packaging | electron-builder, arm64 DMG |

Requirements

macOS 11+, Apple Silicon (arm64)
Ollama running on localhost:11434
A translation model pulled - default: gemma4:26b-a4b-it-q8_0 (configurable in settings)
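
For orientation, a translation call against the local Ollama server might look like the sketch below. It uses Ollama's `/api/generate` endpoint with `stream: false`, which returns one JSON object whose `response` field holds the generated text. The prompt wording, the `FetchLike` type, and the function names are assumptions for illustration; the app's real request in `electron/services/ollama.ts` may differ.

```typescript
// Minimal fetch-like signature so the sketch can be exercised with a stub.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const OLLAMA_URL = "http://localhost:11434/api/generate";

// The prompt wording is an assumption, not the app's actual prompt.
function buildRequest(model: string, hebrew: string) {
  return {
    model,
    prompt: `Translate the following Hebrew text to English. Reply with the translation only.\n\n${hebrew}`,
    stream: false, // ask for a single JSON object instead of a token stream
  };
}

async function translate(
  fetchFn: FetchLike,
  model: string,
  hebrew: string,
): Promise<string> {
  const res = await fetchFn(OLLAMA_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRequest(model, hebrew)),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const data = (await res.json()) as { response?: string };
  if (typeof data.response !== "string") {
    throw new Error("Unexpected Ollama response shape");
  }
  return data.response.trim();
}
```

In Electron or Node 18+, the global `fetch` can be passed as `fetchFn`; injecting it keeps the function testable without a running server.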

Note

On first run, the Whisper model (~3 GB) is downloaded automatically via huggingface_hub. Set WHISPER_MODEL_PATH to point to a local copy and skip the download.

Permissions needed on first run:

Microphone
Accessibility (for the hotkey tap and HID paste)

Quick Start

# Install dependencies
bun install

# Build the Rust native module (first time only - takes ~1 min)
bun run build:native

# Start in dev mode (hot reload)
bun run dev

Build a release DMG

bun run build    # native + renderer + main
bun run dist     # → release/orellius-stt-0.1.0-arm64.dmg

Typecheck

bun run typecheck

Configuration

All config lives in ~/.config/orellius/:

| File | Contents |
| --- | --- |
| `settings.json` | hotkey key, autoPaste toggle, languageMode (`he-en` / `en-he`) |
| `history.json` | past transcriptions + translations |
| `dictionary.json` | custom term rewrites (managed from Settings UI) |
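
As a rough picture of what `settings.json` could hold, here is a hedged sketch of a typed loader. The field names `hotkey`, `autoPaste`, and `languageMode` are inferred from the table above, and the default values (other than `he-en`, which this README names as the default mode) are assumptions; the real schema in `electron/services/settings.ts` may differ.

```typescript
type LanguageMode = "he-en" | "en-he";

// Field names follow the configuration table; the exact schema is an assumption.
interface Settings {
  hotkey: string;
  autoPaste: boolean;
  languageMode: LanguageMode;
}

// Hypothetical defaults, apart from languageMode.
const DEFAULTS: Settings = {
  hotkey: "right-cmd",
  autoPaste: true,
  languageMode: "he-en",
};

// Parse settings.json text, falling back to defaults for missing or invalid fields.
function parseSettings(raw: string): Settings {
  let parsed: Record<string, unknown> = {};
  try {
    const value: unknown = JSON.parse(raw);
    if (typeof value === "object" && value !== null && !Array.isArray(value)) {
      parsed = value as Record<string, unknown>;
    }
  } catch {
    // Malformed JSON: keep the defaults rather than fail at startup.
  }
  const hotkey = parsed["hotkey"];
  const autoPaste = parsed["autoPaste"];
  const mode = parsed["languageMode"];
  return {
    hotkey: typeof hotkey === "string" ? hotkey : DEFAULTS.hotkey,
    autoPaste: typeof autoPaste === "boolean" ? autoPaste : DEFAULTS.autoPaste,
    languageMode: mode === "he-en" || mode === "en-he" ? mode : DEFAULTS.languageMode,
  };
}
```

Validating field by field means a hand-edited file with one bad value still loads with the remaining values intact.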

Tip

To trade translation quality for speed, set OLLAMA_MODEL=qwen2.5:3b before launching. Translation latency drops to roughly 0.5 s on a 3B model, versus 2–3 s on the 26B default.

Changing the Whisper model

The default model path follows the standard huggingface_hub cache layout:

~/.cache/huggingface/hub/models--ivrit-ai--whisper-large-v3-turbo-ggml/

To use a different model, update the path in Settings or set WHISPER_MODEL_PATH before launching.
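
The override described above can be pictured as a small resolver. This is a sketch under stated assumptions: the precedence (environment variable first, then the Settings value, then the default cache directory) is a guess, and `resolveWhisperModelPath` and its `settingsPath` parameter are hypothetical names. It returns the model directory only and does not locate a specific file inside the cache.

```typescript
// Default directory from this README, relative to the user's home directory.
const DEFAULT_MODEL_DIR =
  ".cache/huggingface/hub/models--ivrit-ai--whisper-large-v3-turbo-ggml";

// Assumed precedence: WHISPER_MODEL_PATH, then the Settings value, then the default.
function resolveWhisperModelPath(
  env: Record<string, string | undefined>,
  homeDir: string,
  settingsPath?: string,
): string {
  const fromEnv = env["WHISPER_MODEL_PATH"]?.trim();
  if (fromEnv) return fromEnv;
  const fromSettings = settingsPath?.trim();
  if (fromSettings) return fromSettings;
  // Strip trailing slashes so the joined path has exactly one separator.
  return `${homeDir.replace(/\/+$/, "")}/${DEFAULT_MODEL_DIR}`;
}
```

In the Electron main process this could be called with `process.env` and `os.homedir()`.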

Supported modes

| Mode | Direction |
| --- | --- |
| `he-en` (default) | Hebrew speech → English text |
| `en-he` | English speech → Hebrew text |

Toggle in the Settings page inside the app.

Architecture

orellius-stt/
├── electron/
│   ├── main.ts                   Tray, windows, IPC, native bindings
│   └── services/
│       ├── ollama.ts             Translation HTTP + filler stripping + dictionary
│       ├── settings.ts           Settings loader/saver
│       ├── history.ts            Transcription history persistence
│       ├── dictionary.ts         Custom rewrite rules
│       └── logs.ts               In-app log capture
├── native-core/                  Rust napi-rs module
│   └── src/
│       ├── lib.rs                whisper-rs wrapper
│       ├── hotkey.rs             CGEventTap (right-cmd / ctrl / F5 / F6)
│       └── paste.rs              HID injection via CGEvent
└── renderer/
    └── src/
        ├── App.tsx               State machine (idle→recording→…→result)
        ├── audio.ts              16 kHz mono resample
        └── pages/                Settings, History, Dictionary, Overview
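
To illustrate the kind of work `audio.ts` is described as doing, here is a minimal sketch of downmixing to mono and resampling to 16 kHz. It uses plain linear interpolation, which is an assumption and not necessarily what the app does; a production path that downsamples would normally low-pass filter first to limit aliasing. The function names are hypothetical.

```typescript
// Average all channels into one mono buffer (channels are assumed equal length).
function downmixToMono(channels: Float32Array[]): Float32Array {
  if (channels.length === 0) return new Float32Array(0);
  const length = channels[0].length;
  const mono = new Float32Array(length);
  for (const channel of channels) {
    for (let i = 0; i < length; i++) mono[i] += channel[i] / channels.length;
  }
  return mono;
}

// Linear-interpolation resampler with a 16 kHz default target rate.
function resampleLinear(
  input: Float32Array,
  fromRate: number,
  toRate = 16000,
): Float32Array {
  if (fromRate === toRate || input.length === 0) return input.slice();
  const outLength = Math.floor((input.length * toRate) / fromRate);
  const output = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples advanced per output sample
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    output[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return output;
}
```

A capture at 48 kHz stereo would go through `downmixToMono` first and then `resampleLinear(mono, 48000)` before being handed to the transcriber.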

Performance

Tested on Mac Studio M4 Max 64GB:

| Step | Latency |
| --- | --- |
| End-to-end (short phrase) | ~2–5 s |
| Model cold start | ~1–2 s (cached after first use) |
| Translation (3B–7B model) | ~0.5–1 s |
| Translation (26B model) | ~2–3 s |

Contributing

Issues and PRs welcome. The Rust native module (native-core/) is the most interesting part: CGEventTap and HID injection are poorly documented, so improvements there are especially valuable.

License

MIT
