Add text input mode and bring-your-own-key API support #80
Open
uziiuzair wants to merge 2 commits into farzaa:main from
- Floating chat bubble (ChatInputBubbleManager / ChatInputBubbleView) shown near the cursor when input mode is .text. The same ctrl+option hotkey toggles dictation in voice mode or summons the bubble in text mode, and both paths terminate at the existing screenshot → Claude → TTS → pointing pipeline.
- Hide the blue cursor overlay by default; reveal it only during voice interactions, onboarding, and when Claude returns a [POINT:...] tag.
- Provider picker: route chat through the Clicky Worker proxy (default), the user's own Anthropic key, or the user's own OpenAI key. TTS and voice transcription still go through the proxy in all modes.
- Store user-supplied API keys in the macOS Keychain via the new KeychainHelper instead of UserDefaults.
- Document the new input modes, cursor visibility model, and added files in AGENTS.md.
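The provider rule in this commit (only chat follows the picker; TTS and transcription always use the proxy) can be sketched as a small pure function. This is an illustration only: the type and case names below (ChatProvider, RequestKind, Backend, backend(for:provider:)) are hypothetical and are not taken from the PR's code.

```swift
// Hypothetical names throughout; the PR's actual types are not shown on this page.

/// The three choices offered by the provider picker.
enum ChatProvider {
    case clickyProxy      // default: Clicky Worker proxy
    case ownAnthropicKey  // user's own Anthropic key
    case ownOpenAIKey     // user's own OpenAI key
}

/// The kinds of network request the app makes.
enum RequestKind {
    case chat
    case textToSpeech
    case transcription
}

/// Where a request ends up being sent.
enum Backend: Equatable {
    case proxy
    case anthropicDirect
    case openAIDirect
}

/// Only chat honours the provider choice.
/// TTS and transcription go through the proxy in all modes.
func backend(for kind: RequestKind, provider: ChatProvider) -> Backend {
    guard kind == .chat else { return .proxy }
    switch provider {
    case .clickyProxy:     return .proxy
    case .ownAnthropicKey: return .anthropicDirect
    case .ownOpenAIKey:    return .openAIDirect
    }
}
```

Keeping the decision in one function like this would make the "TTS and transcription never leave the proxy" rule easy to unit-test, independent of any networking or Keychain code.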
- Move the top-right text-mode chat box out of the click-through cursor overlay (OverlayWindow) and into its own NSPanel (StreamingResponsePanelManager / StreamingResponsePanelView). The cursor overlay is ignoresMouseEvents=true, which was swallowing clicks on the close button and follow-up input field.
- Add ProcessingShimmerManager / ProcessingShimmerView: a full-screen click-through edge-glow shimmer rendered during AI processing for Apple-Intelligence-style visual feedback. One panel per connected display, sits at .screenSaver level, visible only while the AI is processing.
- Adjust ChatInputBubble, CompanionManager, and screen capture wiring so voice/text-state transitions drive the new panels and shimmer.
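The visibility rules described in the two commit messages above (shimmer only while processing; cursor overlay only during voice interactions, onboarding, or a [POINT:...] response) could be modelled as a pure function of app state. The sketch below is an assumption-laden illustration: VoiceState cases other than .processing, and the OverlayContext / OverlayVisibility types, are hypothetical and not taken from the PR.

```swift
// Hypothetical model; only the .processing state is named in the PR text.
enum VoiceState {
    case idle
    case listening
    case processing
    case responding
}

/// Inputs that the visibility rules depend on (assumed shape).
struct OverlayContext {
    var voiceState: VoiceState
    var isVoiceInteraction: Bool  // a voice interaction is in progress
    var isOnboarding: Bool        // onboarding flow is showing
    var hasPointTag: Bool         // Claude's response contained a [POINT:...] tag
}

/// Which click-through layers should currently be on screen.
struct OverlayVisibility: Equatable {
    var shimmer: Bool
    var cursorOverlay: Bool
}

/// Shimmer: visible only while processing.
/// Cursor overlay: hidden by default, shown for voice, onboarding, or pointing.
func visibility(for context: OverlayContext) -> OverlayVisibility {
    OverlayVisibility(
        shimmer: context.voiceState == .processing,
        cursorOverlay: context.isVoiceInteraction
            || context.isOnboarding
            || context.hasPointTag
    )
}
```

A manager object could call a function like this on every state transition and show or hide its panels accordingly, which keeps the panels themselves alive while only their visibility changes.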
Summary
- Floating chat bubble (ChatInputBubbleManager / ChatInputBubbleView) summoned near the cursor when "text" is selected in the menu bar panel. The same ctrl + option hotkey toggles dictation in voice mode and summons the bubble in text mode. Responses render in a dedicated top-right StreamingResponsePanel (its own NSPanel, one per screen) rather than on the click-through cursor overlay, so the close button and follow-up input field actually receive clicks. Both voice and text paths terminate at the existing sendTranscriptToClaudeWithScreenshot pipeline: screenshot capture, Claude streaming, TTS, and [POINT:] pointing all behave identically.
- Full-screen edge-glow shimmer (ProcessingShimmerManager / ProcessingShimmerView). Click-through (ignoresMouseEvents = true), sits at .screenSaver level alongside the cursor overlay, one panel per connected display. Visible only while voice state is .processing, hidden the rest of the time.
- User-supplied API keys are stored in the macOS Keychain (KeychainHelper.swift), never UserDefaults.
- The blue cursor overlay is hidden by default and revealed only during voice interactions, onboarding, and when Claude returns a [POINT:x,y:label:screenN] tag. Overlay panels stay alive across visibility transitions to avoid races with multi-monitor coordinate mapping.
- AGENTS.md is updated to document the input-mode picker, cursor visibility model, and the new files.

Test plan
- ctrl + option, speak, release: waveform appears, transcript is sent, Claude responds with TTS playback
- ctrl + option: green bubble appears near the cursor; type a prompt and press return; the response streams into the top-right StreamingResponsePanel; close button and the follow-up input row both work
- … [POINT:...] responses
- … api.anthropic.com
- [POINT:] response on a non-primary screen: cursor flies to the correct monitor

🤖 Generated with Claude Code