feat: simplified/traditional chinese preference for transcription #92

Closed
mvanhorn wants to merge 236 commits into matthartman:main from mvanhorn:feat/68-chinese-script-preference

@mvanhorn (Contributor)

Summary

  • add a Simplified/Traditional Chinese preference picker in Settings → Cleanup
  • post-process every transcription via CFStringTransform (ICU, no new dep) before cleanup runs
  • default auto is a no-op, so existing installs are unchanged

Why this fixes #68

Whisper's multilingual model emits Traditional Chinese characters even when the user is dictating Simplified. The cleanup-prompt instructions you tried did not help because the LLM cleanup runs on top of an already-Traditional transcript and won't reliably rewrite every character. The reproducer in the issue was:

What I said: 你到底有沒有在聽我講
What appeared: 你到底有沒有在聽我講 (Traditional)
Expected: 你到底有没有在听我讲 (Simplified)

The fix runs the converter as a post-processing pass after Whisper returns and BEFORE the LLM cleanup pipeline starts, so cleanup sees the user's preferred script. Apple's CFStringTransform("Traditional-Simplified") ships the full ICU mapping table, so this is one line of conversion code rather than a hand-built character map.
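As a rough illustration of the conversion step described above, here is a minimal sketch of a converter built on CFStringTransform. The type and method names follow the PR description, but the body is an assumption rather than the PR's actual code. In particular, using the `reverse` flag for the Traditional direction is a guess; the PR may pass a separate transform identifier instead. It requires an Apple platform where Foundation exposes CoreFoundation.

```swift
import Foundation

// The three preference values named in the PR description.
enum ChineseScriptPreference: String, CaseIterable {
    case auto, simplified, traditional
}

enum ChineseScriptConverter {
    // The ICU transform identifier quoted in the PR description.
    private static let transformID = "Traditional-Simplified"

    /// Converts `text` to the preferred script.
    /// `.auto` is a no-op, and the input is returned unchanged if the transform fails.
    static func convert(_ text: String, to preference: ChineseScriptPreference) -> String {
        let reverse: Bool
        switch preference {
        case .auto:
            return text
        case .simplified:
            reverse = false
        case .traditional:
            // Assumption: run the same transform in reverse for the opposite direction.
            reverse = true
        }

        // CFStringTransform mutates in place, so work on a mutable copy.
        // NSMutableString is toll-free bridged to CFMutableString.
        let buffer = NSMutableString(string: text)
        // A nil range means "transform the whole string".
        let succeeded = CFStringTransform(buffer, nil, transformID as CFString, reverse)
        return succeeded ? (buffer as String) : text
    }
}
```

With this sketch, `ChineseScriptConverter.convert("你到底有沒有在聽我講", to: .simplified)` should yield the Simplified string from the issue, which is the result the PR reports for its own converter.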

Changes

  • New GhostPepper/Cleanup/ChineseScriptConverter.swift:
    • enum ChineseScriptPreference { case auto, simplified, traditional } — default is auto, which is a no-op.
    • ChineseScriptConverter.convert(_:to:) wraps CFStringTransform with the right identifier per direction. Returns the input unchanged on transform failure.
  • GhostPepper/AppState.swift:
    • @AppStorage("chineseScriptPreference") var chineseScriptPreferenceRaw plus a typed accessor chineseScriptPreference: ChineseScriptPreference.
    • In cleanedTranscriptionResult(_:windowContext:), the line text = ChineseScriptConverter.convert(text, to: chineseScriptPreference) runs after the override guard and before the cleanupEnabled early return, so conversion happens whether or not LLM cleanup runs.
  • GhostPepper/UI/SettingsWindow.swift: a new SettingsCard("Chinese script") Picker inside cleanupSection, binding appState.chineseScriptPreference to the three enum cases.
  • GhostPepper.xcodeproj/project.pbxproj: registers the two new files (project uses XcodeGen, but I made the pbxproj edits directly so consumers without xcodegen can build immediately).
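The ordering in the AppState change can be sketched as follows. This is a simplified stand-in, not the project's code: `TranscriptionPipelineSketch`, the `override` parameter, and the injected `convert` and `cleanup` closures are all hypothetical. The PR description does not say what the override guard checks, so it is modeled here as an optional replacement result. A plain stored string stands in for the `@AppStorage` property.

```swift
import Foundation

enum ChineseScriptPreference: String {
    case auto, simplified, traditional
}

struct TranscriptionPipelineSketch {
    // Stand-in for the @AppStorage("chineseScriptPreference") raw value.
    var chineseScriptPreferenceRaw = ChineseScriptPreference.auto.rawValue
    var cleanupEnabled = true

    // Hypothetical injection points so the sketch runs without Whisper or an LLM.
    var convert: (String, ChineseScriptPreference) -> String
    var cleanup: (String) -> String

    // Typed accessor over the raw string; unknown values fall back to .auto (a no-op).
    var chineseScriptPreference: ChineseScriptPreference {
        get { ChineseScriptPreference(rawValue: chineseScriptPreferenceRaw) ?? .auto }
        set { chineseScriptPreferenceRaw = newValue.rawValue }
    }

    func cleanedTranscriptionResult(_ raw: String, override: String? = nil) -> String {
        // 1. Override guard (assumed shape): an override short-circuits everything.
        if let override = override { return override }

        // 2. Script conversion always runs, before the cleanup decision.
        let text = convert(raw, chineseScriptPreference)

        // 3. Early return when LLM cleanup is disabled; the text is already converted.
        guard cleanupEnabled else { return text }

        // 4. Cleanup sees text in the user's preferred script.
        return cleanup(text)
    }
}

// Usage with toy closures that make the ordering visible.
var pipeline = TranscriptionPipelineSketch(
    convert: { text, preference in
        preference == .auto ? text : "[\(preference.rawValue)]" + text
    },
    cleanup: { $0.uppercased() }
)
pipeline.chineseScriptPreference = .simplified
pipeline.cleanupEnabled = false
let withoutCleanup = pipeline.cleanedTranscriptionResult("abc")  // converted, not cleaned
pipeline.cleanupEnabled = true
let withCleanup = pipeline.cleanedTranscriptionResult("abc")     // converted, then cleaned
```

Placing the conversion above the `cleanupEnabled` guard is what makes the preference apply even for users who have LLM cleanup turned off.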

Test plan

  • New GhostPepperTests/ChineseScriptConverterTests.swift with five cases:

    • auto is a no-op
    • the issue's exact Traditional input converts to the issue's expected Simplified output
    • reverse direction round-trips
    • non-Chinese text is unchanged in both directions
    • empty string round-trips
  • xcodebuild -project GhostPepper.xcodeproj -scheme GhostPepper -derivedDataPath build/test-derived -skipMacroValidation CODE_SIGNING_ALLOWED=NO -only-testing:GhostPepperTests/ChineseScriptConverterTests test → ** TEST SUCCEEDED **

  • xcodebuild ... build → ** BUILD SUCCEEDED **

(There are 5 pre-existing failing tests on main related to testAppStateArchives* and testFluidAudioRecordingUsesSpeakerFilteringSession — unrelated to this change; they reproduce on upstream/main without this diff.)

Closes #68

matthartman and others added 30 commits March 23, 2026 14:53
Models section shows each model with loaded/not loaded status.
Taller settings window (580px) to fit all sections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Shows when any model isn't loaded. Downloads WhisperKit and/or
cleanup models directly from Settings — no need to re-run onboarding.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LLM.swift's bundled llama.cpp doesn't support Qwen3/3.5 architecture.
Reverted to Qwen 2.5 1.5B + 3B which work reliably.
Will upgrade when LLM.swift updates its llama.cpp.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Info.plist was still at v1.1 so Sparkle thought the "update"
was the same version. Now properly at v1.3 build 4.
Fixes #2.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Improve post-paste learning via Accessibility
Picker in Settings under Input section. Switching models triggers
re-download and reload. Default remains small.en for accuracy.
tiny.en is ~75 MB and much faster for shorter recordings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…oggle)

Matches the original Ghost Pepper behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Input Monitoring prompt doesn't reliably show the system dialog
for debug-signed apps. Now attempts to start the hotkey monitor
even without it — Accessibility alone is sufficient for Control key.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixed codesign verification failure reported in #4.
Now verifies signature after extracting from DMG before release.

Also includes: speech model picker, default Control shortcuts,
non-blocking Input Monitoring check.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
macOS kills the app after Screen Recording is granted but doesn't
relaunch it. Now spawns a background process that reopens the app
after 3 seconds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace no-audio modal with status pill
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses #5 — users should know about local transcript log
and auto-launch behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
File-based logging was replaced by in-memory DebugLogStore.
Nothing is written to disk. Updated disclosure accordingly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
No longer blocks Continue button. Shown as "(optional)" with
a bordered (not prominent) Enable button. Users can skip it
and enable later in Settings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Matches WhisperFlow and other dictation tools.
Toggle is Right Command + Right Option + Space.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Supports Spanish and 90+ other languages. Same size as small.en.
Users can tweak the cleanup prompt for their language's filler words.
Addresses #6.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
obra and others added 27 commits April 17, 2026 17:26
Reduce transcription latency and improve speaker tagging
…idebar, detect meeting name & attendees

- Inline summary prompt editor with model picker on Summary tab (click "Customize")
- "Imported from Granola" badge on imported meeting files
- Granola files show green import icon in sidebar, no speaker badges on transcript
- Resizable sidebar (drag divider, 160-400px range)
- Sidebar open by default when Meetings window opens
- "Detect" button grabs meeting name from Zoom/Teams window title + OCR attendees
- Auto-detect scans known meeting apps even for manual recordings
- Attendee OCR retries at 3s, 15s, 30s, 60s to catch late joiners
- Attendees accumulate across retries (no duplicates)
- Title auto-update only marks as done on success (retries on failure)
- "Zoom Meeting" cleaned from window titles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rowser meetings

- Test dictation now uses selected mic (was always using system default)
- Settings mic picker initializes from saved selection, not system default
- Detect button activates meeting app (including browsers) before OCR
- Added browser bundle IDs (Brave, Chrome, Arc, Safari, Firefox) for Google Meet detection
- Falls back to frontmost non-Ghost Pepper app if no known meeting app found
- Attendee chips UI instead of plain text
- Debug logging for attendee OCR capture
- Cleaned "Zoom Meeting" from window titles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- One-time migration: enables meetingTranscriptEnabled for existing users
  (detects update by checking if selectedCleanupModelKind exists but
  meetingTranscriptEnabled was never set)
- Shows "What's New" NSAlert on first launch after update with
  "Open Meetings" and "Got It" buttons
- Persists hasSeenMeetingTranscriptAnnouncement to only show once
- Fix: Settings mic picker initializes from saved selection, not system default
- Fix: test dictation targets selected mic device
- Fix: prioritize native Zoom over browsers for Detect button
- Fix: filter out ZM_ internal Zoom windows from title detection
- Fix: strip pronouns from attendee names (she/her, he/him, they/them)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- UpdaterController checks appcast XML 15s after launch to compare versions
- Menu bar dropdown shows 'Update Available — Install Now' in orange when update found
- Sparkle still handles the actual update dialog and installation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ection

Fix Whisper auto-detect defaulting to English for non-English speech
- OAuth 2.0 PKCE flow for desktop apps (no client secret needed)
- Reads calendar.events.readonly scope — just event titles, attendees, times
- When recording starts, checks calendar for current/upcoming event and
  auto-populates meeting title and attendee list
- Calendar data takes priority over OCR detection (more reliable)
- "Connect Google Calendar" button in Settings > Meeting Transcript
- Token stored locally in UserDefaults, refreshes automatically
- URL scheme handler for OAuth callback (com.github.matthartman.ghostpepper://)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lendar

Security improvement: removed the local HTTP server (127.0.0.1:8089) that
briefly opened a port during OAuth sign-in. Now uses Google's OOB redirect
flow — user copies the auth code from Google and pastes it into Settings.

No open ports, no network exposure, same result.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Whisper's multilingual model emits Traditional characters even when
the user is dictating in Simplified Chinese, and adding instructions
to the LLM cleanup prompt didn't fix it (#68 trace).

Adds a per-user preference with three options:

- auto         no conversion, default, existing behavior
- simplified   converts every transcription via CFStringTransform
               "Traditional-Simplified" before cleanup runs
- traditional  same converter, opposite direction

Apple's CFStringTransform is ICU-backed so the full mapping table
ships in the OS — no third-party Swift package needed. The user's
exact reproducer converts correctly:

  Input:  "你到底有沒有在聽我講"   (Whisper output)
  Output: "你到底有没有在听我讲"   (after .simplified conversion)

The preference is exposed in Settings → Cleanup as a Picker, stored
in AppStorage as "chineseScriptPreference", and applied in
AppState.cleanedTranscriptionResult before the optional LLM cleanup
step. Default is `auto` so existing installs see no change until a
user opts in.

Tests cover all three options including the issue's verbatim string,
non-Chinese text passthrough, and empty-string edge case. All five
tests pass via:

  xcodebuild ... -only-testing:GhostPepperTests/ChineseScriptConverterTests test

Closes #68

Development

Successfully merging this pull request may close these issues.

Chinese speech outputs Traditional characters instead of Simplified

6 participants