Ancroo Voice — STT Client for Ancroo Stack

License: MIT Python 3.11 CustomTkinter Status: Beta

Push-to-talk speech-to-text client for the Ancroo Stack. Hold a hotkey, speak, release, and the text appears at your cursor.

Transcription is managed centrally by the Ancroo Backend — the client just sends audio. Lightweight binary for Linux and Windows, no local GPU required.

Phase 0 (Beta) — Ancroo Voice is functional for local use, but the backend it connects to runs without encryption or authentication by default and is still under active development. Intended for local/trusted networks only. See the Ancroo Roadmap for the security path forward.

[Screenshot: Ancroo Voice GUI]

How It Works

%%{init: {'theme': 'neutral'}}%%
graph LR
    mic["Microphone"]
    voice["Ancroo Voice"]
    backend["Ancroo<br/>Backend"]
    stt["Speech to<br/>Text"]

    mic --> voice
    voice <--> backend
    backend <--> stt

    style mic fill:transparent,stroke:transparent,color:#1e3a5f
    style voice fill:#fef08a,stroke:#eab308,color:#713f12
    style backend fill:#d1fae5,stroke:#10b981,color:#064e3b
    style stt fill:#fed7aa,stroke:#f97316,color:#7c2d12

Features

  • Push-to-Talk: Hold hotkey to record, release to transcribe
  • Backend-Managed STT: Ancroo Backend handles model and server selection centrally
  • Lightweight Binary: Small download, no GPU dependencies
  • Linux + Windows: Pre-built binaries for both platforms
  • Configurable Hotkeys: Any key combination (Ctrl+Space, Alt+R, etc.) with visual hotkey recorder
  • Multi-Language: 10 languages + Auto-Detection
  • Dark/Light Mode: Switch between dark and light themes
  • UI Scaling: Adjustable font size (A-/A+ buttons)
  • Auto-Copy: Optionally copy transcriptions to clipboard automatically
  • GUI Record Button: On-screen record button as alternative to hotkeys (required on Wayland)

Tip: Pair Ancroo Voice with a Stream Deck or foot pedals for one-button dictation and workflow triggers. An example setup is described in the article "Supercharge Your AI Workflow: Speech-to-Text with Stream Deck".

Download

Download the latest release for your platform:

| Platform | Download |
| --- | --- |
| Windows | AncrooVoice-Windows.zip |
| Linux | AncrooVoice-Linux.tar.gz |

Quick Start

1. Extract & Run

Windows:

1. Extract ZIP
2. Run AncrooVoice.exe

Note: Windows SmartScreen may show an "Unknown publisher" warning — this is normal for unsigned open-source software. To proceed:

  • Click "More info" → "Run anyway", or
  • Right-click the .exe → Properties → check "Unblock" → Apply (removes the warning permanently)

Linux:

tar -xzf AncrooVoice-Linux.tar.gz
./AncrooVoice-Linux.sh

Wayland: Global hotkeys are not supported on Wayland due to security restrictions. Use the on-screen record button instead.
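
A launcher script or wrapper can check for a Wayland session before relying on a hotkey. The following is a minimal sketch, not the check Ancroo Voice itself performs: the function name `is_wayland_session` is hypothetical, and it only inspects the `XDG_SESSION_TYPE` and `WAYLAND_DISPLAY` environment variables that Wayland sessions normally set.

```python
import os
from typing import Mapping, Optional


def is_wayland_session(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if the environment looks like a Wayland session."""
    env = os.environ if env is None else env
    # Most login managers set XDG_SESSION_TYPE to "wayland", "x11" or "tty".
    if env.get("XDG_SESSION_TYPE", "").strip().lower() == "wayland":
        return True
    # Wayland compositors export WAYLAND_DISPLAY for their clients.
    return bool(env.get("WAYLAND_DISPLAY"))


# Suggest an input mode for the current session.
mode = "on-screen record button" if is_wayland_session() else "global hotkey"
print(f"Suggested input mode: {mode}")
```

Passing a plain dictionary instead of the real environment makes the check easy to try out without switching sessions.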

2. Configure Backend Connection

Edit the config file to point to your Ancroo Backend (.env on Linux, ancroo-voice.ini on Windows):

ANCROO_BACKEND_ENDPOINT=http://your-server:8900/api/v1/transcribe

Important: The Ancroo Backend must have at least one active STT provider configured. Use the Admin UI at http://your-server:8900/admin/stt-providers to register your STT server (e.g. Whisper-ROCm, Speaches).
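
To sanity-check the config file from this step, a few lines of Python can read it back. This is a rough sketch under the assumption that the file contains plain `KEY=VALUE` lines like the one shown above; `parse_env_text` is a hypothetical helper, not part of Ancroo Voice, and the exact layout of `ancroo-voice.ini` on Windows may differ.

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes from the value.
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


# Example content; in practice, read the text from your .env file.
example = (
    "# Ancroo Voice backend connection\n"
    "ANCROO_BACKEND_ENDPOINT=http://your-server:8900/api/v1/transcribe\n"
)
config = parse_env_text(example)
print(config["ANCROO_BACKEND_ENDPOINT"])
```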

3. Use

  1. Select microphone
  2. Click "Start"
  3. Hold your hotkey (default: Ctrl+Space) and speak
  4. Release the hotkey; the text appears at your cursor.
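
The hold-and-release behaviour in steps 3 and 4 can be pictured as a small state machine. The sketch below is illustrative only and is not the application's code: the class `PushToTalk`, its callbacks, and the key names as plain strings are all assumptions. In a real client, the `press` and `release` methods would be driven by a keyboard listener such as the one pynput provides.

```python
from typing import Callable, FrozenSet, Iterable, Set


class PushToTalk:
    """Call on_start when the full hotkey is held and on_stop when it is released."""

    def __init__(self, hotkey: Iterable[str],
                 on_start: Callable[[], None],
                 on_stop: Callable[[], None]) -> None:
        self.hotkey: FrozenSet[str] = frozenset(hotkey)
        self.on_start = on_start
        self.on_stop = on_stop
        self.held: Set[str] = set()
        self.recording = False

    def press(self, key: str) -> None:
        self.held.add(key)
        # Start only once, when every key of the combination is down.
        if not self.recording and self.hotkey <= self.held:
            self.recording = True
            self.on_start()

    def release(self, key: str) -> None:
        self.held.discard(key)
        # Stop as soon as any key of the combination is released.
        if self.recording and not self.hotkey <= self.held:
            self.recording = False
            self.on_stop()


events = []
ptt = PushToTalk({"ctrl", "space"},
                 on_start=lambda: events.append("start"),
                 on_stop=lambda: events.append("stop"))
for action, key in [("press", "ctrl"), ("press", "space"),
                    ("press", "space"),  # key auto-repeat must not restart
                    ("release", "space"), ("release", "ctrl")]:
    getattr(ptt, action)(key)
print(events)  # ['start', 'stop']
```

Tracking the set of held keys, rather than reacting to single key events, is what keeps operating-system key auto-repeat from triggering a second recording.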

Provider

Ancroo Voice connects to the Ancroo Backend, which manages STT model and server selection centrally. The client sends audio and receives transcribed text — no local configuration of STT models needed.
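
To make "sends audio" concrete, here is a hedged sketch of how recorded samples could be packaged before upload. It assumes mono 16-bit PCM WAV at 16 kHz, which is a common input for Whisper-style servers but is not confirmed as the format Ancroo Voice uses; `encode_wav` is a hypothetical helper built on the Python standard library only.

```python
import io
import math
import struct
import wave
from typing import Sequence


def encode_wav(samples: Sequence[float], sample_rate: int = 16000) -> bytes:
    """Encode mono float samples in [-1.0, 1.0] as 16-bit PCM WAV bytes."""
    # Clip each sample, then scale it to the signed 16-bit range.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16 bit
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return buffer.getvalue()


# One second of a 440 Hz tone stands in for microphone input.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
wav_bytes = encode_wav(tone)
print(f"{len(wav_bytes)} bytes of WAV data")
```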

Configuration

| Variable | Required | Description |
| --- | --- | --- |
| ANCROO_BACKEND_ENDPOINT | Yes | Ancroo Backend transcribe URL, e.g. http://your-server:8900/api/v1/transcribe |
| ANCROO_BACKEND_API_KEY | No | Bearer token for authenticated backends |
| ANCROO_BACKEND_VERIFY_SSL | No | Set to false for self-signed certificates |

Note: The endpoint points to the Ancroo Backend (default port 8900), not directly to a Whisper/STT server. The backend handles model and server routing internally.
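
The three settings map naturally onto an HTTP request: the endpoint becomes the URL, the API key becomes a Bearer `Authorization` header, and the SSL flag controls certificate verification. The sketch below shows that mapping with the standard library and builds the request without sending it. The helper name `build_transcribe_request`, the raw `audio/wav` body, and the default of verifying certificates are assumptions; the actual client uses Requests, and the backend's real payload format is not documented here.

```python
import urllib.request
from typing import Mapping, Tuple


def build_transcribe_request(
    settings: Mapping[str, str], audio: bytes
) -> Tuple[urllib.request.Request, bool]:
    """Return an unsent POST request and the TLS-verification flag."""
    headers = {"Content-Type": "audio/wav"}  # assumed payload type
    api_key = settings.get("ANCROO_BACKEND_API_KEY", "").strip()
    if api_key:
        # Optional Bearer token for authenticated backends.
        headers["Authorization"] = f"Bearer {api_key}"
    # Verification stays on unless the setting is explicitly "false".
    flag = settings.get("ANCROO_BACKEND_VERIFY_SSL", "true")
    verify_ssl = flag.strip().lower() != "false"
    request = urllib.request.Request(
        settings["ANCROO_BACKEND_ENDPOINT"],
        data=audio,
        headers=headers,
        method="POST",
    )
    return request, verify_ssl


settings = {
    "ANCROO_BACKEND_ENDPOINT": "http://your-server:8900/api/v1/transcribe",
    "ANCROO_BACKEND_API_KEY": "example-token",  # placeholder value
    "ANCROO_BACKEND_VERIFY_SSL": "false",
}
request, verify_ssl = build_transcribe_request(settings, b"placeholder audio")
print(request.get_method(), request.full_url, "verify SSL:", verify_ssl)
```

Keeping the API key out of the headers when it is unset matches the table's "No" in the Required column: an unauthenticated local backend receives no `Authorization` header at all.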

File Locations

| File | Purpose | Notes |
| --- | --- | --- |
| .env / ancroo-voice.ini | Backend connection | .env on Linux, .ini on Windows |
| ancroo-voice_config.json | GUI settings | Auto-saved by the application |

Acknowledgments

This project is built with the following open-source software:

| Project | Purpose | License |
| --- | --- | --- |
| CustomTkinter | GUI framework | MIT |
| pynput | Global hotkey listener | LGPL-3.0 |
| sounddevice | Audio recording | MIT |
| NumPy | Audio processing | BSD-3-Clause |
| Pillow | Image handling | HPND |
| Requests | HTTP client | Apache-2.0 |

Speech-to-text is provided by OpenAI Whisper (MIT) models running on your server via the Ancroo Stack.

Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request.

Security

To report a security vulnerability, please use GitHub's private vulnerability reporting instead of opening a public issue.

License

MIT — see LICENSE. The Ancroo name is not covered by this license and remains the property of the author.

Author

Stefan Schmidbauer · GitHub · stefan@ancroo.com


Built with the help of AI (Claude by Anthropic).
