Add text input mode and bring-your-own-key API support by uziiuzair · Pull Request #80 · farzaa/clicky

uziiuzair · 2026-05-03T13:18:04Z

Summary

Text input mode — a floating green chat bubble (ChatInputBubbleManager / ChatInputBubbleView) summoned near the cursor when "text" is selected in the menu bar panel. The same ctrl + option hotkey toggles between voice dictation and the text bubble. Responses render in a dedicated top-right StreamingResponsePanel (its own NSPanel, one per screen) rather than on the click-through cursor overlay, so the close button and follow-up input field actually receive clicks. Both voice and text paths terminate at the existing sendTranscriptToClaudeWithScreenshot pipeline — screenshot capture, Claude streaming, TTS, and [POINT:] pointing all behave identically.
Processing shimmer — full-screen Apple-Intelligence-style edge glow rendered during AI processing (ProcessingShimmerManager / ProcessingShimmerView). Click-through (ignoresMouseEvents = true), sits at .screenSaver level alongside the cursor overlay, one panel per connected display. Visible only while voice state is .processing, hidden the rest of the time.
Bring-your-own-key (BYOK) API support — provider picker in the menu bar panel: Clicky proxy (default, no setup), user-supplied Anthropic key, or user-supplied OpenAI key. TTS and voice transcription still route through the Worker in all modes. User keys are persisted to the macOS Keychain (new KeychainHelper.swift) — never UserDefaults.
Cursor summon-on-demand — the blue cursor overlay is now hidden by default and only revealed during voice interactions, onboarding, and when Claude returns a [POINT:x,y:label:screenN] tag. Overlay panels stay alive across visibility transitions to avoid races with multi-monitor coordinate mapping.
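The provider picker described above can be read as a small routing decision. The following is a minimal sketch of that decision, not the code in this PR: `ChatProvider`, `ChatRoute`, `chatRoute(for:storedKey:)`, and the `clicky-proxy.example` host are all hypothetical names chosen for illustration. Only `api.anthropic.com` comes from the test plan, and `api.openai.com` is OpenAI's public API host.

```swift
import Foundation

/// Hypothetical mirror of the three choices in the provider picker.
/// String raw values make the selection easy to persist in UserDefaults.
enum ChatProvider: String, CaseIterable {
    case clickyProxy
    case anthropic
    case openAI
}

/// Where a chat request should go and which user credential it carries.
struct ChatRoute: Equatable {
    let host: String
    let userKey: String?   // nil when the proxy supplies credentials
}

/// Placeholder host: the real Worker URL is not given in this PR.
let clickyProxyHost = "clicky-proxy.example"

/// Returns the chat route for `provider`, or nil when a BYOK provider
/// is selected but no key has been stored yet.
func chatRoute(for provider: ChatProvider, storedKey: String?) -> ChatRoute? {
    switch provider {
    case .clickyProxy:
        return ChatRoute(host: clickyProxyHost, userKey: nil)
    case .anthropic, .openAI:
        guard let key = storedKey, !key.isEmpty else { return nil }
        let host = provider == .anthropic ? "api.anthropic.com" : "api.openai.com"
        return ChatRoute(host: host, userKey: key)
    }
}

/// TTS and transcription stay on the proxy regardless of the chat provider.
func speechHost(for provider: ChatProvider) -> String {
    return clickyProxyHost
}
```

Returning nil for a missing key is one possible design; falling back to the proxy would be another, and this PR does not say which behavior it implements.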

AGENTS.md is updated to document the input-mode picker, cursor visibility model, and the new files.

Test plan

Voice mode: hold ctrl + option, speak, release — waveform appears, transcript is sent, Claude responds with TTS playback
Text mode: tap ctrl + option — green bubble appears near the cursor, type a prompt, press return — response streams into the top-right StreamingResponsePanel; close button and the follow-up input row both work
Processing shimmer: edge glow appears on every connected display while the AI is processing, disappears the moment streaming text starts, and clicks pass through it to underlying windows
Cursor visibility: cursor stays hidden during a normal chat response and only flies out for [POINT:...] responses
BYOK Anthropic: paste a real Anthropic key in the panel, send a prompt — request goes direct to api.anthropic.com
BYOK OpenAI: paste a real OpenAI key, send a prompt — response uses a GPT model
Persistence: quit and relaunch — input mode, provider choice, and stored keys all survive (mode/provider via UserDefaults; keys via Keychain)
Multi-monitor: trigger a [POINT:] response on a non-primary screen — cursor flies to the correct monitor
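For anyone running the pointing checks above, here is a minimal sketch of how a `[POINT:x,y:label:screenN]` tag might be pulled out of a response. `PointTag` and `parsePointTag(in:)` are hypothetical names, not taken from this PR. The sketch assumes numeric coordinates without surrounding whitespace and leaves open whether `N` is zero-based or one-based.

```swift
import Foundation

/// Hypothetical parsed form of a `[POINT:x,y:label:screenN]` tag.
struct PointTag: Equatable {
    let x: Double
    let y: Double
    let label: String
    let screenIndex: Int
}

/// Returns the first well-formed POINT tag in `text`, or nil if there is none.
func parsePointTag(in text: String) -> PointTag? {
    guard let open = text.range(of: "[POINT:"),
          let close = text.range(of: "]", range: open.upperBound..<text.endIndex)
    else { return nil }

    // Body looks like "x,y:label:screenN".
    let body = text[open.upperBound..<close.lowerBound]
    let fields = body.split(separator: ":", omittingEmptySubsequences: false)
    guard fields.count >= 3 else { return nil }

    // First field holds the coordinates.
    let coordinates = fields[0].split(separator: ",")
    guard coordinates.count == 2,
          let x = Double(coordinates[0]),
          let y = Double(coordinates[1])
    else { return nil }

    // Last field holds the screen; anything in between is the label,
    // so a label that itself contains ":" is preserved.
    let screenField = fields[fields.count - 1]
    guard screenField.hasPrefix("screen"),
          let screenIndex = Int(screenField.dropFirst("screen".count))
    else { return nil }

    let label = fields[1..<(fields.count - 1)].joined(separator: ":")
    return PointTag(x: x, y: y, label: label, screenIndex: screenIndex)
}
```

A response with no tag returns nil, which matches the "cursor stays hidden during a normal chat response" check.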

🤖 Generated with Claude Code

- Floating chat bubble (ChatInputBubbleManager / ChatInputBubbleView) shown near the cursor when input mode is .text. The same ctrl+option hotkey toggles dictation in voice mode or summons the bubble in text mode, and both paths terminate at the existing screenshot → Claude → TTS → pointing pipeline.
- Hide the blue cursor overlay by default; reveal it only during voice interactions, onboarding, and when Claude returns a [POINT:...] tag.
- Provider picker: route chat through the Clicky Worker proxy (default), the user's own Anthropic key, or the user's own OpenAI key. TTS and voice transcription still go through the proxy in all modes.
- Store user-supplied API keys in the macOS Keychain via the new KeychainHelper instead of UserDefaults.
- Document the new input modes, cursor visibility model, and added files in AGENTS.md.
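The hotkey and cursor-visibility rules in this commit can be expressed as two pure functions. This is a rough sketch under stated assumptions: every type and function name here (`InputMode`, `HotkeyEvent`, `HotkeyAction`, `action(for:in:)`, `shouldShowCursorOverlay`) is hypothetical, and it assumes the text bubble is summoned on key press rather than key release.

```swift
/// Hypothetical input modes matching the "voice" / "text" picker.
enum InputMode { case voice, text }

/// The ctrl+option hotkey going down or coming back up.
enum HotkeyEvent { case pressed, released }

/// What the single hotkey should trigger in each mode.
enum HotkeyAction: Equatable {
    case startDictation
    case stopDictationAndSend
    case summonTextBubble
    case none
}

/// Maps one hotkey event to an action, depending on the selected input mode.
func action(for event: HotkeyEvent, in mode: InputMode) -> HotkeyAction {
    switch (mode, event) {
    case (.voice, .pressed):  return .startDictation        // hold to talk
    case (.voice, .released): return .stopDictationAndSend  // release to send
    case (.text, .pressed):   return .summonTextBubble      // tap to type
    case (.text, .released):  return .none
    }
}

/// The overlay is hidden by default and shown only in the three listed cases.
func shouldShowCursorOverlay(isVoiceInteractionActive: Bool,
                             isOnboarding: Bool,
                             responseHasPointTag: Bool) -> Bool {
    return isVoiceInteractionActive || isOnboarding || responseHasPointTag
}
```

Keeping these as pure functions makes the mode and visibility rules testable without any window or event-tap setup.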

- Move the top-right text-mode chat box out of the click-through cursor overlay (OverlayWindow) and into its own NSPanel (StreamingResponsePanelManager / StreamingResponsePanelView). The cursor overlay is ignoresMouseEvents=true, which was swallowing clicks on the close button and follow-up input field.
- Add ProcessingShimmerManager / ProcessingShimmerView, a full-screen click-through edge-glow shimmer rendered during AI processing for Apple-Intelligence-style visual feedback. One panel per connected display, sits at .screenSaver level, visible only while the AI is processing.
- Adjust ChatInputBubble, CompanionManager, and screen capture wiring so voice/text-state transitions drive the new panels and shimmer.
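To illustrate why the chat box had to leave the overlay, here is a hedged AppKit sketch contrasting a click-through panel with an interactive one. It is not the code from this PR: `InteractivePanel`, `makeClickThroughPanel(on:)`, and `makeResponsePanel(frame:)` are hypothetical names, and the `.floating` level for the response panel is an assumption, since the PR only states the level for the shimmer and cursor overlay.

```swift
import AppKit

/// Windows without a title bar refuse key status by default, so a
/// borderless panel hosting a text field needs this override to accept typing.
final class InteractivePanel: NSPanel {
    override var canBecomeKey: Bool { true }
}

/// Click-through, full-screen panel of the kind described for the shimmer.
func makeClickThroughPanel(on screen: NSScreen) -> NSPanel {
    let panel = NSPanel(contentRect: screen.frame,
                        styleMask: [.borderless, .nonactivatingPanel],
                        backing: .buffered,
                        defer: false)
    panel.isOpaque = false
    panel.backgroundColor = .clear
    panel.hasShadow = false
    panel.level = .screenSaver
    panel.ignoresMouseEvents = true   // clicks fall through to windows below
    return panel
}

/// Separate panel that does receive clicks, for the close button and input row.
func makeResponsePanel(frame: NSRect) -> NSPanel {
    let panel = InteractivePanel(contentRect: frame,
                                 styleMask: [.borderless, .nonactivatingPanel],
                                 backing: .buffered,
                                 defer: false)
    panel.level = .floating           // assumption: level not stated in the PR
    panel.ignoresMouseEvents = false
    return panel
}
```

One click-through panel per display could then be built with `NSScreen.screens.map(makeClickThroughPanel(on:))`. Because `ignoresMouseEvents` applies to the whole window, any control hosted inside a click-through panel is unreachable, which is the bug the first item in this commit describes.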

uziiuzair added 2 commits May 3, 2026 18:14

uziiuzair marked this pull request as ready for review May 3, 2026 21:03

uziiuzair wants to merge 2 commits into farzaa:main from uziiuzair:feature/text-input-and-byok-keys
