TalentBoys/VoiceBridgeApp

VoiceBridge

Real-time voice translation desktop app powered by SAP AI Core Proxy.

Chinese documentation

Features

  • Real-time Translation — Stream audio to GPT Realtime API via WebSocket, get instant transcription and translation
  • Dual Audio Capture — Physical microphone (Web Audio API) or virtual audio devices like BlackHole (native cpal)
  • Text-to-Speech — Optionally hear translations spoken aloud with configurable voices
  • Meeting Recording — Automatically save meeting audio as WAV files with playback in archive
  • AI Summary — Generate meeting summaries with key points and action items
  • Multi-language — Supports English, Chinese, Japanese, Korean, German, French, Spanish
  • i18n Interface — App UI available in English, 简体中文, and 日本語
  • Speaker Identification — Auto-detect and label different speakers, with rename support
  • Learning Mode — Upload audio files for transcription and translation practice
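The real-time translation feature streams audio over a WebSocket. As a rough illustration of what that involves, the sketch below converts Web Audio float samples to 16-bit PCM, base64-encodes them, and wraps them in an `input_audio_buffer.append` event as used by the OpenAI Realtime API. This is an assumption-laden sketch, not VoiceBridge's code: the app does its WebSocket work in Rust, the proxy may expect a different event shape, and the function names `floatTo16BitPcm`, `toBase64`, and `makeAppendEvent` are hypothetical.

```typescript
// Sketch only: VoiceBridge streams from Rust; this mirrors the idea in TypeScript.

// Convert float samples in [-1, 1] to 16-bit little-endian PCM bytes.
function floatTo16BitPcm(samples: Float32Array): Uint8Array {
  const out = new Uint8Array(samples.length * 2);
  const view = new DataView(out.buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale by 32768 and positive by 32767 to span the int16 range.
    view.setInt16(i * 2, Math.round(s < 0 ? s * 0x8000 : s * 0x7fff), true);
  }
  return out;
}

const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Dependency-free base64 encoder (works the same in a webview and in Node).
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    out += B64[b0 >> 2];
    out += B64[((b0 & 3) << 4) | (b1 >> 4)];
    out += i + 1 < bytes.length ? B64[((b1 & 15) << 2) | (b2 >> 6)] : "=";
    out += i + 2 < bytes.length ? B64[b2 & 63] : "=";
  }
  return out;
}

// Build the JSON text frame for one audio chunk.
// Assumption: the endpoint accepts the OpenAI Realtime API event shape.
function makeAppendEvent(samples: Float32Array): string {
  return JSON.stringify({
    type: "input_audio_buffer.append",
    audio: toBase64(floatTo16BitPcm(samples)),
  });
}
```

Each string returned by `makeAppendEvent` would be sent as one text frame on the open WebSocket; chunk size and sample rate are left to the caller.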

Tech Stack

| Layer | Technology |
| --- | --- |
| Desktop Framework | Tauri v2 + Rust |
| Frontend | React 19 + TypeScript + Tailwind CSS + shadcn/ui |
| State Management | Zustand |
| WebSocket | tokio-tungstenite (Rust) for GPT Realtime API |
| Audio Capture | cpal (Rust) for native devices, Web Audio API for physical mics |
| Audio Recording | hound (Rust) for WAV file output |
| API Proxy | SAP AI Core Proxy (http://localhost:3000) |
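To make the WAV output step more concrete, here is a minimal sketch of what a WAV writer produces for 16-bit PCM: a 44-byte RIFF header followed by the raw samples. VoiceBridge uses the hound crate in Rust for this; the TypeScript function `encodeWav` below is a hypothetical stand-in, and the sample rate and channel count are caller-supplied because the README does not state them.

```typescript
// Sketch only: encodes 16-bit PCM samples as a canonical 44-byte-header WAV file.
function encodeWav(pcm: Int16Array, sampleRate: number, channels: number): Uint8Array {
  const bytesPerSample = 2;
  const dataLen = pcm.length * bytesPerSample;
  const buf = new ArrayBuffer(44 + dataLen);
  const view = new DataView(buf);

  // Write a four-character ASCII chunk tag at the given offset.
  const writeTag = (offset: number, tag: string): void => {
    for (let i = 0; i < tag.length; i++) view.setUint8(offset + i, tag.charCodeAt(i));
  };

  writeTag(0, "RIFF");
  view.setUint32(4, 36 + dataLen, true);              // file size minus the first 8 bytes
  writeTag(8, "WAVE");
  writeTag(12, "fmt ");
  view.setUint32(16, 16, true);                        // fmt chunk size for PCM
  view.setUint16(20, 1, true);                         // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * channels * bytesPerSample, true); // byte rate
  view.setUint16(32, channels * bytesPerSample, true); // block align (bytes per frame)
  view.setUint16(34, 16, true);                        // bits per sample
  writeTag(36, "data");
  view.setUint32(40, dataLen, true);

  // Samples follow the header as little-endian int16, interleaved by channel.
  pcm.forEach((s, i) => view.setInt16(44 + i * bytesPerSample, s, true));
  return new Uint8Array(buf);
}
```

A real recorder would usually append samples incrementally and patch the two length fields when the meeting ends, rather than holding the whole recording in memory.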

Architecture

src-tauri/                  # Rust backend
├── src/
│   ├── commands/           # Tauri command handlers (meeting, realtime, tts, etc.)
│   ├── services/           # Core services (realtime WS, audio I/O, recorder, API client)
│   └── state/              # Application state (settings, meeting store)
└── Cargo.toml

src/renderer/               # React frontend
├── src/
│   ├── components/         # UI components (meeting, archive, settings, learning)
│   ├── hooks/              # Custom hooks (audio capture, devices, playback)
│   ├── stores/             # Zustand stores
│   ├── i18n/               # Internationalization (zh-CN, en, ja)
│   ├── lib/                # Utilities, constants, Tauri bridge
│   └── types/              # TypeScript type definitions
└── index.html

Prerequisites

  • Node.js >= 18
  • Rust (stable toolchain)
  • Tauri CLI (npm install -g @tauri-apps/cli)
  • SAP AI Core Proxy running on http://localhost:3000

Getting Started

# Install dependencies
npm install

# Start development
npm run tauri:dev

# Build for production
npm run tauri:build
# Output: src-tauri/target/release/bundle/dmg/*.dmg (macOS)

# Type check
npm run typecheck

Audio Setup

VoiceBridge supports two audio capture modes:

  1. Physical Microphone: Works out of the box. The app's webview captures audio via getUserMedia.
  2. Virtual Audio Device: For capturing system or application audio (e.g., meeting apps). Install BlackHole 2ch and select it in Settings > Audio > Input Device.
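For the physical microphone path, the constraints passed to `getUserMedia` decide which device is opened and how the audio is processed. The sketch below builds such a constraints object. It is an illustration under assumptions: `buildMicConstraints` is a hypothetical helper, the chosen flag values are placeholders rather than VoiceBridge's actual settings, and local structural types are used so the snippet does not depend on DOM typings.

```typescript
// Minimal structural types so the sketch compiles without DOM typings.
interface AudioConstraintSketch {
  deviceId?: { exact: string };
  channelCount: number;
  echoCancellation: boolean;
  noiseSuppression: boolean;
}

interface StreamConstraintSketch {
  audio: AudioConstraintSketch;
  video: false;
}

// Build getUserMedia constraints for a microphone.
// Assumption: mono capture with echo cancellation and noise suppression enabled.
function buildMicConstraints(deviceId?: string): StreamConstraintSketch {
  const audio: AudioConstraintSketch = {
    channelCount: 1,
    echoCancellation: true,
    noiseSuppression: true,
  };
  // Pin a specific device only when the caller supplies one;
  // otherwise the system default microphone is used.
  if (deviceId) audio.deviceId = { exact: deviceId };
  return { audio, video: false };
}
```

In the webview, the result would be passed to `navigator.mediaDevices.getUserMedia(...)`, with device IDs taken from `navigator.mediaDevices.enumerateDevices()`.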

Recording Both Sides in a Meeting

When using a virtual audio device (e.g., BlackHole) to capture a Teams/Zoom meeting, only the remote participants' audio is recorded — your own voice goes directly from your microphone to the meeting app and is not routed through the virtual device.

To capture both sides of the conversation, create an Aggregate Device on macOS:

  1. Open Audio MIDI Setup (search in Spotlight)
  2. Click + at the bottom left → Create Aggregate Device
  3. Check both BlackHole 2ch and your microphone (built-in mic, USB mic, or Bluetooth headset)
  4. In VoiceBridge Settings > Audio > Input Device, select this Aggregate Device

Note: If you switch between microphones (e.g., built-in mic ↔ Bluetooth headset), you need to update the Aggregate Device to include the currently active microphone.
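An Aggregate Device exposes the channels of all its sub-devices side by side (for example, three channels if BlackHole 2ch is combined with a mono microphone). Whether and how VoiceBridge mixes these channels is not documented here; the sketch below shows one common approach, averaging interleaved frames down to mono. The function name `downmixToMono` and the interleaved layout are assumptions.

```typescript
// Sketch only: average every frame of interleaved multi-channel audio into one mono sample.
function downmixToMono(interleaved: Float32Array, channels: number): Float32Array {
  if (!Number.isInteger(channels) || channels < 1) {
    throw new RangeError("channels must be a positive integer");
  }
  // Ignore a trailing partial frame, if any.
  const frames = Math.floor(interleaved.length / channels);
  const mono = new Float32Array(frames);
  for (let f = 0; f < frames; f++) {
    let sum = 0;
    for (let c = 0; c < channels; c++) {
      sum += interleaved[f * channels + c];
    }
    // Averaging keeps the result within [-1, 1] but lowers the level of
    // any source that is present on only some of the channels.
    mono[f] = sum / channels;
  }
  return mono;
}
```

Averaging is the simplest choice; a weighted mix that boosts the microphone channel relative to the meeting audio is an equally valid design if your own voice ends up too quiet.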

Configuration

All settings are configured through the in-app Settings page:

| Setting | Description |
| --- | --- |
| API URL | SAP AI Core Proxy endpoint (default: http://localhost:3000) |
| API Key | Optional Bearer token for remote endpoints |
| Source / Target Language | Translation language pair |
| Audio Input Device | Physical mic or virtual device |
| TTS | Enable/disable, voice selection |
| Recording | Enable/disable, custom save directory |
| Theme | Light / Dark / System |
| Display Language | Auto / 中文 / English / 日本語 |

Settings are stored at ~/Library/Application Support/com.voicebridge.app/settings.json (macOS), completely outside the project directory.
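As an illustration of how these settings might be consumed, the sketch below merges a stored settings file with defaults and derives request headers, attaching the Bearer token only when an API key is set. The schema of settings.json is not documented here, so every field name in `SettingsSketch` is hypothetical, and all defaults other than the API URL are placeholders.

```typescript
// Hypothetical shape: the real settings.json schema is not documented in this README.
interface SettingsSketch {
  apiUrl: string;
  apiKey?: string;
  theme: "light" | "dark" | "system";
  ttsEnabled: boolean;
  recordingEnabled: boolean;
}

const DEFAULT_SETTINGS: SettingsSketch = {
  apiUrl: "http://localhost:3000", // documented default for the proxy endpoint
  theme: "system",                 // placeholder default (assumption)
  ttsEnabled: false,               // placeholder default (assumption)
  recordingEnabled: false,         // placeholder default (assumption)
};

// Parse the stored JSON and fill in any missing fields from the defaults.
function parseSettings(json: string): SettingsSketch {
  const parsed = JSON.parse(json) as Partial<SettingsSketch>;
  return { ...DEFAULT_SETTINGS, ...parsed };
}

// Build HTTP headers; the API key is optional and sent as a Bearer token when present.
function authHeaders(settings: SettingsSketch): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (settings.apiKey) {
    headers["Authorization"] = `Bearer ${settings.apiKey}`;
  }
  return headers;
}
```

With a local proxy and no key configured, `authHeaders` returns only the content type, which matches the table's description of the key as optional.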

Data Storage

| Data | Location |
| --- | --- |
| Settings | ~/Library/Application Support/com.voicebridge.app/settings.json |
| Meeting Records | ~/Library/Application Support/com.voicebridge.app/meetings/*.json |
| Audio Recordings | Same directory as meetings (or custom directory from Settings) |
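For scripts that back up or inspect this data, the locations above can be resolved from the user's home directory. The helper below is a small sketch for macOS only; `dataPaths` is a hypothetical name, and the optional `customRecordingDir` parameter stands in for the custom directory configurable in Settings.

```typescript
// Sketch only: resolve the macOS data locations listed in the table above.
function dataPaths(homeDir: string, customRecordingDir?: string) {
  const base = `${homeDir}/Library/Application Support/com.voicebridge.app`;
  const meetingsDir = `${base}/meetings`;
  return {
    settings: `${base}/settings.json`,
    meetingsDir,
    // Recordings live next to the meeting records unless a custom directory is set.
    recordingsDir: customRecordingDir ?? meetingsDir,
  };
}
```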

License

MIT

Author

Darry (darry.chen@sap.com)
