feat: integrate speech recognition functionality in InterviewChat component#4

Merged
CUinspace233 merged 2 commits into main from speech_to_text on Mar 6, 2026
Conversation

@CUinspace233 (Owner)

  • Added a custom hook useSpeechRecognition to manage speech recognition features, including start/stop recording and handling transcripts.
  • Implemented voice input support in the InterviewChat component with a microphone button, enhancing user interaction.
  • Updated the answer input field to reflect recording status and adjusted styles for better visual feedback.
  • Ensured recording stops automatically when the answer area is hidden, improving usability.


Copilot AI left a comment


Pull request overview

This PR adds speech-to-text voice input support to the InterviewChat component. It introduces a custom useSpeechRecognition hook that wraps the Web Speech API, and integrates it with a microphone toggle button in the answer input area. The feature allows users to dictate their interview answers instead of typing them.

Changes:

  • New useSpeechRecognition hook managing the Web Speech API lifecycle (start/stop, interim results, auto-restart on Chrome silence, error handling).
  • Microphone button added to the InterviewChat answer area with visual feedback (pulse animation, red border on textarea while recording, dynamic placeholder).
  • Recording is auto-stopped when the answer area hides or when an answer is sent, ensuring clean state transitions.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| frontend/src/hooks/useSpeechRecognition.ts | New custom hook encapsulating the Web Speech API for start/stop recording, transcript handling, and error management |
| frontend/src/components/InterviewChat.tsx | Integrates the speech recognition hook with a mic button, visual recording indicators, and auto-stop behavior |


Comment on lines +50 to +115
```typescript
  new (): SpeechRecogInstance;
}

interface WindowWithSpeech {
  SpeechRecognition?: SpeechRecogConstructor;
  webkitSpeechRecognition?: SpeechRecogConstructor;
}

const getWindow = () =>
  typeof window !== "undefined" ? (window as unknown as WindowWithSpeech) : undefined;

const isSupported =
  !!getWindow()?.SpeechRecognition || !!getWindow()?.webkitSpeechRecognition;

export function useSpeechRecognition({
  language,
  onTranscript,
  onError,
  getCurrentText,
}: SpeechRecognitionOptions): SpeechRecognitionHook {
  const [isRecording, setIsRecording] = useState(false);
  const recognitionRef = useRef<SpeechRecogInstance | null>(null);
  const isRecordingRef = useRef(false);
  const baseTextRef = useRef("");

  const stopRecording = useCallback(() => {
    isRecordingRef.current = false;
    setIsRecording(false);
    if (recognitionRef.current) {
      recognitionRef.current.onend = null;
      recognitionRef.current.onerror = null;
      recognitionRef.current.onresult = null;
      try {
        recognitionRef.current.stop();
      } catch {
        // already stopped
      }
      recognitionRef.current = null;
    }
  }, []);

  const startRecording = useCallback(() => {
    if (!isSupported) return;

    const w = getWindow();
    const Ctor = w?.SpeechRecognition ?? w?.webkitSpeechRecognition;
    if (!Ctor) return;

    const recognition = new Ctor();
    recognition.continuous = true;
    recognition.interimResults = true;
    recognition.lang = language;

    baseTextRef.current = getCurrentText();

    recognition.onresult = (event) => {
      let transcript = "";
      for (let i = 0; i < event.results.length; i++) {
        const result = event.results[i];
        const alt = result?.[0];
        if (alt) {
          transcript += alt.transcript;
        }
      }
      const separator = baseTextRef.current ? " " : "";
      onTranscript(baseTextRef.current + separator + transcript);
    };
```


Copilot AI Mar 6, 2026


Stale closure bug: The callbacks onTranscript, onError, and getCurrentText are captured in startRecording's useCallback closure, but the recognition.onresult and recognition.onend event handlers are only set once when startRecording runs. If these callbacks change identity on re-render (which they do, since they're inline arrow functions at the call site), the event handlers will use stale versions.

Most critically, getCurrentText on line 81 (inside onend) will return a stale currentAnswer value — the value at the time recording started, not the current value after transcription updates. This means when Chrome auto-restarts recognition after silence, baseTextRef.current will be set to the wrong text, causing transcript duplication or data loss.

The existing useDebounce hook in this codebase (see frontend/src/hooks/useDebounce.ts:5-6) solves this exact problem by using the ref pattern: store callbacks in refs and update them on every render, so that event handlers always invoke the latest callback. The same pattern should be applied here — store onTranscript, onError, and getCurrentText in refs (e.g., const onTranscriptRef = useRef(onTranscript); onTranscriptRef.current = onTranscript;) and call through the ref inside event handlers.
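The ref pattern the reviewer describes can be illustrated outside React with a plain mutable box. Everything here (`Ref`, `makeRecognizer`, the callbacks) is a hypothetical illustration of the principle, not code from this PR: the handler is registered once but dereferences the box at call time, so it never goes stale.

```typescript
// Hypothetical non-React illustration of the "latest ref" pattern.
type Ref<T> = { current: T };

function makeRecognizer(onTranscriptRef: Ref<(text: string) => void>) {
  return {
    // Registered once, like recognition.onresult; reads the ref on each call
    // instead of capturing the callback that existed at registration time.
    emit(text: string) {
      onTranscriptRef.current(text);
    },
  };
}

let received = "";
const ref: Ref<(text: string) => void> = {
  current: (t) => { received = "stale:" + t; },
};
const recognizer = makeRecognizer(ref);

// A later "render" produces a new callback identity; updating the ref in
// place lets the already-registered handler pick it up.
ref.current = (t) => { received = "fresh:" + t; };

recognizer.emit("hello"); // received becomes "fresh:hello", not "stale:hello"
```

In the hook itself, updating `onTranscriptRef.current = onTranscript` on every render (as the review suggests) plays the role of the in-place update above.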

- Updated the `useSpeechRecognition` hook to store callbacks in refs, ensuring the latest versions are used in event handlers.
- Improved performance by reducing dependencies in the `useEffect` hook, enhancing the overall efficiency of the speech recognition functionality.
@CUinspace233 merged commit d018cc9 into main on Mar 6, 2026
2 checks passed
@CUinspace233 deleted the speech_to_text branch on March 6, 2026 at 03:17