SpeechToText AutostopSilenceTimeout #3111

Open
VladislavAntonyuk wants to merge 7 commits into main from speech-to-text-autostop-silence-timeout
Collaborator

@VladislavAntonyuk VladislavAntonyuk commented Feb 23, 2026

Description of Change

Added AutoStopSilenceTimeout and refactored the iOS/macOS (macios) implementations to remove duplicated code.

Linked Issues

PR Checklist

Additional information

@VladislavAntonyuk VladislavAntonyuk self-assigned this Feb 23, 2026
Copilot AI review requested due to automatic review settings February 23, 2026 09:33
Contributor

Copilot AI left a comment

Pull request overview

This pull request adds an AutoStopSilenceTimeout feature to the SpeechToText functionality, allowing speech recognition to automatically stop after detecting a specified duration of silence. The PR implements this feature across all supported platforms (Windows, Android, iOS/macOS) for both online and offline speech recognition modes.

Changes:

  • Added AutoStopSilenceTimeout property to SpeechToTextOptions with default value of TimeSpan.MaxValue
  • Implemented platform-specific silence timeout handling (native API support on Windows/Android, timer-based on iOS/macOS)
  • Added guard logic to prevent multiple simultaneous listening sessions
  • Refactored iOS/macOS partial results handling from segment-based to full transcript reporting

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/SpeechToTextOptions.cs | Added AutoStopSilenceTimeout property with TimeSpan.MaxValue as default |
| src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/SpeechToTextImplementation.windows.cs | Applied AutoStopSilenceTimeout to Windows ContinuousRecognitionSession |
| src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/SpeechToTextImplementation.android.cs | Added silence timeout extras to Android RecognizerIntent |
| src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/SpeechToTextImplementation.shared.cs | Added guard check to prevent re-entrant calls |
| src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/SpeechToTextImplementation.macos.cs | Removed duplicate code, delegated to shared CreateSpeechRecognizerTask |
| src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/SpeechToTextImplementation.ios.cs | Removed duplicate code, delegated to shared CreateSpeechRecognizerTask |
| src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/SharedSpeechToTextImplementation.macios.cs | Added timer-based silence detection, refactored recognition task creation, changed audioEngine to readonly non-nullable |
| src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/OfflineSpeechToTextImplementation.windows.cs | Applied AutoStopSilenceTimeout to InitialSilenceTimeout and BabbleTimeout |
| src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/OfflineSpeechToTextImplementation.shared.cs | Added guard check to prevent re-entrant calls |
| src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/OfflineSpeechToTextImplementation.macos.cs | Removed duplicate code, delegated to shared CreateSpeechRecognizerTask |
| src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/OfflineSpeechToTextImplementation.ios.cs | Removed duplicate code, delegated to shared CreateSpeechRecognizerTask |
| src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/OfflineSpeechToTextImplementation.android.cs | Added silence timeout extras to Android RecognizerIntent |
| src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/OfflineSharedSpeechToTextImplementation.macios.cs | Added timer-based silence detection, refactored recognition task creation, changed audioEngine to readonly non-nullable |

Comments suppressed due to low confidence (2)

src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/SharedSpeechToTextImplementation.macios.cs:33

  • The silenceTimer is not being disposed in the DisposeAsync method. This is a resource leak because IDispatcherTimer likely implements IDisposable. The timer should be stopped and disposed when the object is disposed.

Add disposal logic for silenceTimer in DisposeAsync, similar to how other resources are disposed.

	public ValueTask DisposeAsync()
	{
		audioEngine.Dispose();
		speechRecognizer?.Dispose();
		liveSpeechRequest?.Dispose();
		recognitionTask?.Dispose();

		speechRecognizer = null;
		liveSpeechRequest = null;
		recognitionTask = null;
		return ValueTask.CompletedTask;
	}

src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/OfflineSharedSpeechToTextImplementation.macios.cs:32

  • The silenceTimer is not being disposed in the DisposeAsync method. This is a resource leak because IDispatcherTimer likely implements IDisposable. The timer should be stopped and disposed when the object is disposed.

Add disposal logic for silenceTimer in DisposeAsync, similar to how other resources are disposed.

	public ValueTask DisposeAsync()
	{
		audioEngine.Dispose();
		speechRecognizer?.Dispose();
		liveSpeechRequest?.Dispose();
		recognitionTask?.Dispose();

		speechRecognizer = null;
		liveSpeechRequest = null;
		recognitionTask = null;
		return ValueTask.CompletedTask;
	}

Comment on lines +38 to +41
if (CurrentState != SpeechToTextState.Stopped)
{
	return;
}
Copilot AI Feb 23, 2026

The guard check prevents multiple simultaneous listening sessions by returning early if CurrentState is not Stopped. However, this check is not thread-safe - there's a race condition between checking CurrentState and starting the listening session in InternalStartListening.

If StartListenAsync is called from multiple threads simultaneously, both calls could pass the CurrentState check before either one changes the state, leading to multiple concurrent listening sessions. Consider using a lock or other synchronization mechanism to make this check atomic with the state transition.

Collaborator

Good catch! Yes, we should add a SemaphoreSlim.

VladislavAntonyuk and others added 2 commits February 23, 2026 11:54
…oTextOptions.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 23, 2026 15:54
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (2)

src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/OfflineSharedSpeechToTextImplementation.macios.cs:33

  • silenceTimer is a long-lived field but is never disposed in DisposeAsync(). Since IDispatcherTimer is disposable, this can leak timer resources. Consider stopping, detaching Tick, and disposing the timer during DisposeAsync() (and avoiding re-attaching Tick multiple times if InitSilenceTimer is called repeatedly).
	/// <inheritdoc />
	public ValueTask DisposeAsync()
	{
		audioEngine.Dispose();
		speechRecognizer?.Dispose();
		liveSpeechRequest?.Dispose();
		recognitionTask?.Dispose();

		speechRecognizer = null;
		liveSpeechRequest = null;
		recognitionTask = null;
		return ValueTask.CompletedTask;
	}

src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/SharedSpeechToTextImplementation.macios.cs:33

  • silenceTimer is a long-lived field but is never disposed in DisposeAsync(). Since IDispatcherTimer is disposable, this can leak timer resources and keep callbacks alive longer than intended. Consider stopping, detaching Tick, and disposing the timer during DisposeAsync() (and avoiding re-attaching Tick multiple times if InitSilenceTimer is called repeatedly).
	/// <inheritdoc />
	public ValueTask DisposeAsync()
	{
		audioEngine.Dispose();
		speechRecognizer?.Dispose();
		liveSpeechRequest?.Dispose();
		recognitionTask?.Dispose();

		speechRecognizer = null;
		liveSpeechRequest = null;
		recognitionTask = null;
		return ValueTask.CompletedTask;
	}

Comment on lines 35 to 44
public async Task StartListenAsync(SpeechToTextOptions options, CancellationToken cancellationToken = default)
{
	cancellationToken.ThrowIfCancellationRequested();
	if (CurrentState != SpeechToTextState.Stopped)
	{
		return;
	}

	await InternalStartListening(options, cancellationToken);
}
Copilot AI Feb 23, 2026

This early-return re-entrancy guard is not thread-safe: two concurrent callers can both observe CurrentState == Stopped and proceed into InternalStartListening, potentially starting recognition twice. Consider protecting start/stop with a SemaphoreSlim/AsyncLock or an Interlocked state flag so the guard is atomic.
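
As a rough illustration of the Interlocked option mentioned in this comment, here is a minimal sketch. `GuardedListener`, its integer state constants, and the simulated `InternalStartListeningAsync` are hypothetical stand-ins, not the toolkit's code. The idea is that `Interlocked.CompareExchange` performs the check and the state transition as a single atomic step, so only one concurrent caller can proceed.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the speech-to-text implementation; only the guard matters here.
public sealed class GuardedListener
{
	const int Stopped = 0;
	const int Listening = 1;

	int state = Stopped;
	int startCount;

	// Number of times the (simulated) recognition session was actually started.
	public int StartCount => Volatile.Read(ref startCount);

	// Returns true if this call started listening, false if a session was already active.
	public async Task<bool> StartListenAsync(CancellationToken cancellationToken = default)
	{
		cancellationToken.ThrowIfCancellationRequested();

		// Atomically move Stopped -> Listening; exactly one concurrent caller can succeed.
		if (Interlocked.CompareExchange(ref state, Listening, Stopped) != Stopped)
		{
			return false;
		}

		try
		{
			await InternalStartListeningAsync(cancellationToken).ConfigureAwait(false);
			return true;
		}
		catch
		{
			// Roll the state back so a later call can retry after a failed start.
			Volatile.Write(ref state, Stopped);
			throw;
		}
	}

	public void StopListening() => Volatile.Write(ref state, Stopped);

	// Simulated platform start-up work (assumption: the real method starts the recognizer).
	Task InternalStartListeningAsync(CancellationToken cancellationToken)
	{
		Interlocked.Increment(ref startCount);
		return Task.Delay(10, cancellationToken);
	}
}
```

Compared with a SemaphoreSlim, this variant never makes a second caller wait; it simply returns. Whether that is preferable depends on whether callers should queue behind an in-flight start.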

Comment on lines 35 to 43
public async Task StartListenAsync(SpeechToTextOptions options, CancellationToken cancellationToken = default)
{
	cancellationToken.ThrowIfCancellationRequested();

	if (CurrentState != SpeechToTextState.Stopped)
	{
		return;
	}

	await InternalStartListeningAsync(options, cancellationToken).ConfigureAwait(false);
Copilot AI Feb 23, 2026

This early-return re-entrancy guard is not thread-safe: two concurrent callers can both observe CurrentState == Stopped and proceed into InternalStartListeningAsync, potentially double-subscribing events / starting recognition twice. Consider protecting start/stop with a SemaphoreSlim/AsyncLock or an Interlocked state flag so the guard is atomic.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 9 comments.

Comments suppressed due to low confidence (2)

src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/SharedSpeechToTextImplementation.macios.cs:33

  • silenceTimer is stopped/unsubscribed but never disposed (and isn’t cleared). Since this timer is a field and can be recreated on subsequent starts, it should be disposed when stopping and also in DisposeAsync to prevent leaks and stray ticks.
	public ValueTask DisposeAsync()
	{
		audioEngine.Dispose();
		speechRecognizer?.Dispose();
		liveSpeechRequest?.Dispose();
		recognitionTask?.Dispose();

		speechRecognizer = null;
		liveSpeechRequest = null;
		recognitionTask = null;
		return ValueTask.CompletedTask;
	}

src/CommunityToolkit.Maui.Core/Essentials/SpeechToText/OfflineSharedSpeechToTextImplementation.macios.cs:33

  • silenceTimer is stopped/unsubscribed but never disposed (and isn’t cleared). Since this timer is held as a field and can be recreated on subsequent starts, it should be disposed when stopping and also in DisposeAsync to prevent leaks and potential stray ticks.
	public ValueTask DisposeAsync()
	{
		audioEngine.Dispose();
		speechRecognizer?.Dispose();
		liveSpeechRequest?.Dispose();
		recognitionTask?.Dispose();

		speechRecognizer = null;
		liveSpeechRequest = null;
		recognitionTask = null;
		return ValueTask.CompletedTask;
	}

Comment on lines +86 to +89
void OnSilenceTimerTick(object? sender, EventArgs e)
{
	StopRecording();
}
Copilot AI Feb 23, 2026

StopRecording() can be invoked both from the speech recognizer callback (in CreateSpeechRecognizerTask) and from OnSilenceTimerTick. Without an idempotency/thread-safety guard, it can run concurrently or multiple times (disposing the same objects / removing taps twice). Consider adding a re-entrancy guard (e.g., Interlocked.Exchange on a "stopping" flag) and/or ensuring the stop logic runs on a single thread/dispatcher.
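
A minimal sketch of the "stopping" flag idea, assuming a hypothetical `Recorder` class whose teardown is reduced to a counter (the real StopRecording() would stop the audio engine, remove the tap, and dispose the request and task). `Interlocked.Exchange` returns the previous value, so only the first caller after a start performs the teardown.

```csharp
using System.Threading;

// Hypothetical recorder; the teardown body is a placeholder for the real cleanup.
public sealed class Recorder
{
	int isStopped = 1; // 1 = nothing to tear down, 0 = recording in progress
	int teardownCount;

	// How many times the teardown body has actually run.
	public int TeardownCount => Volatile.Read(ref teardownCount);

	// Re-arms the guard for a new session.
	public void StartRecording() => Volatile.Write(ref isStopped, 0);

	// Safe to call from the recognizer callback and the silence timer at the same time.
	public void StopRecording()
	{
		// Only the caller that observes 0 (recording) goes on to perform the teardown.
		if (Interlocked.Exchange(ref isStopped, 1) == 1)
		{
			return;
		}

		// Placeholder for: stop engine, remove tap, dispose request and recognition task.
		Interlocked.Increment(ref teardownCount);
	}
}
```

One caveat of this lock-free approach: a caller that loses the exchange returns immediately, possibly before the winning caller has finished tearing down. If callers must observe a fully stopped state on return, a lock or dispatching onto a single thread is the safer choice.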

Comment on lines +78 to +81
void OnSilenceTimerTick(object? sender, EventArgs e)
{
	InternalStopListening();
}
Copilot AI Feb 23, 2026

InternalStopListening() can be invoked both from the recognition task callback and from OnSilenceTimerTick. Without an idempotency/thread-safety guard, it can run multiple times or concurrently (disposing the same objects / removing taps twice). Consider adding a re-entrancy guard (e.g., Interlocked.Exchange on a "stopping" flag) and/or serializing the stop logic onto a single dispatcher thread.

Comment on lines +38 to +41
if (CurrentState != SpeechToTextState.Stopped)
{
	return;
}
Copilot AI Feb 23, 2026

StartListenAsync uses a non-atomic CurrentState check to prevent re-entry. If multiple callers invoke this concurrently, both can observe Stopped and start listening in parallel. Consider using a thread-safe gate (e.g., SemaphoreSlim/Interlocked) around the start path to make this re-entrancy guard reliable.

Comment on lines +20 to +24
/// <summary>
/// The duration of continuous silence after which speech recognition will automatically stop.
/// Use <see cref="TimeSpan.MaxValue"/> (the default) to indicate that auto-stop based on silence is disabled.
/// </summary>
public TimeSpan AutoStopSilenceTimeout { get; init; } = TimeSpan.MaxValue;
Copilot AI Feb 23, 2026

AutoStopSilenceTimeout accepts any TimeSpan, but platform implementations handle non-positive values inconsistently (Android/iOS/macOS ignore <= 0, while Windows applies the value directly). Consider validating this option (e.g., throw ArgumentOutOfRangeException unless > TimeSpan.Zero or TimeSpan.MaxValue) or normalizing it to a consistent cross-platform behavior.
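
One possible shape for the validation option, sketched with an explicit backing field so it does not depend on newer language features. `SilenceTimeoutOptions` and `IsAutoStopEnabled` are hypothetical names used only for illustration; the sketch assumes that zero and negative values should be rejected and that TimeSpan.MaxValue remains the "disabled" sentinel.

```csharp
using System;

// Hypothetical options type mirroring the property discussed in this comment.
public sealed class SilenceTimeoutOptions
{
	readonly TimeSpan autoStopSilenceTimeout = TimeSpan.MaxValue;

	public TimeSpan AutoStopSilenceTimeout
	{
		get => autoStopSilenceTimeout;
		init
		{
			// Reject zero and negative durations up front so every platform sees the same contract.
			if (value <= TimeSpan.Zero)
			{
				throw new ArgumentOutOfRangeException(nameof(value), value, "AutoStopSilenceTimeout must be greater than TimeSpan.Zero.");
			}

			autoStopSilenceTimeout = value;
		}
	}

	// TimeSpan.MaxValue is the sentinel for "auto-stop disabled".
	public bool IsAutoStopEnabled => AutoStopSilenceTimeout < TimeSpan.MaxValue;
}
```

With the check in the setter, platform code only needs to distinguish "disabled" from "enabled" and no longer has to defend against non-positive values individually.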

@@ -1,11 +1,14 @@
using AVFoundation;
using CoreFoundation;
Copilot AI Feb 23, 2026

CoreFoundation is imported but not used in this file. Consider removing it to avoid unnecessary dependencies and keep the file warning-free.

Suggested change
using CoreFoundation;

Copilot AI review requested due to automatic review settings February 24, 2026 05:57
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

Comment on lines +93 to +119
return sfSpeechRecognizer.GetRecognitionTask(sfSpeechAudioBufferRecognitionRequest, (result, err) =>
{
	if (err is not null)
	{
		currentIndex = 0;
		StopRecording();
		OnRecognitionResultCompleted(SpeechToTextResult.Failed(new Exception(err.LocalizedDescription)));
	}
	else
	{
		if (result.Final)
		{
			currentIndex = 0;
			StopRecording();
			OnRecognitionResultCompleted(SpeechToTextResult.Success(result.BestTranscription.FormattedString));
		}
		else
		{
			RestartTimer();
			if (currentIndex <= 0)
			{
				OnSpeechToTextStateChanged(CurrentState);
			}

			currentIndex++;
			OnRecognitionResultUpdated(result.BestTranscription.FormattedString);
		}
Copilot AI Feb 24, 2026

The currentIndex variable is incremented on each partial result but is never used after incrementing. Previously, currentIndex tracked position in the segments array to report only new segments. Now it only serves to detect the first partial result (currentIndex <= 0). This means the variable name is misleading and the increment serves no purpose. Consider renaming to 'isFirstPartialResult' as a boolean or removing the variable entirely if only the first-update detection is needed.
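
To make the suggested rename concrete, here is a small sketch with the first-update detection pulled into a hypothetical `PartialResultReporter`. The string events are stand-ins for the real OnSpeechToTextStateChanged and OnRecognitionResultUpdated calls, and `Reset` corresponds to the places where the original code sets currentIndex back to 0.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper isolating the "first partial result" logic from the recognizer callback.
public sealed class PartialResultReporter
{
	bool isFirstPartialResult = true;

	// Recorded calls, in order, so the behavior can be inspected.
	public List<string> Events { get; } = new();

	public void OnPartialResult(string fullTranscript)
	{
		if (isFirstPartialResult)
		{
			// Stand-in for OnSpeechToTextStateChanged(CurrentState).
			Events.Add("state-changed");
			isFirstPartialResult = false;
		}

		// Stand-in for OnRecognitionResultUpdated(result.BestTranscription.FormattedString).
		Events.Add("updated:" + fullTranscript);
	}

	// Called on error or final result, where the original code resets currentIndex to 0.
	public void Reset() => isFirstPartialResult = true;
}
```

The boolean states the intent directly and removes the counter that is incremented but never read.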

Comment on lines +85 to +111
int currentIndex = 0;
return sfSpeechRecognizer.GetRecognitionTask(sfSpeechAudioBufferRecognitionRequest, (result, err) =>
{
	if (err is not null)
	{
		currentIndex = 0;
		InternalStopListening();
		OnRecognitionResultCompleted(SpeechToTextResult.Failed(new Exception(err.LocalizedDescription)));
	}
	else
	{
		if (result.Final)
		{
			currentIndex = 0;
			InternalStopListening();
			OnRecognitionResultCompleted(SpeechToTextResult.Success(result.BestTranscription.FormattedString));
		}
		else
		{
			RestartTimer();
			if (currentIndex <= 0)
			{
				OnSpeechToTextStateChanged(CurrentState);
			}

			currentIndex++;
			OnRecognitionResultUpdated(result.BestTranscription.FormattedString);
Copilot AI Feb 24, 2026

The currentIndex variable is incremented on each partial result but is never used after incrementing. Previously, currentIndex tracked position in the segments array to report only new segments. Now it only serves to detect the first partial result (currentIndex <= 0). This means the variable name is misleading and the increment serves no purpose. Consider renaming to 'isFirstPartialResult' as a boolean or removing the variable entirely if only the first-update detection is needed.

Collaborator

@TheCodeTraveler TheCodeTraveler left a comment

Thanks Vlad!

Let's fix a couple of race conditions and add a bounds-check to AutoStopSilenceTimeout (noted in comments below).

I also found a few problems with the sample app that made this difficult to test. It'd be great if you could fix these in the PR as well! And afterwards, could you double-check the docs to see if we should provide more guidance to developers based on your lessons learned updating the sample?

Sample App Problems

iOS Simulator

When I tap StartListenAsync on the OfflineSpeechToText page in CommunityToolkit.Maui.Sample on my iOS simulator the app crashes:

ObjCRuntime.ObjCException: Objective-C exception thrown. Name: com.apple.coreaudio.avfaudio Reason: required condition is false: nullptr == Tap()

I'm not sure exactly what's causing this.

iOS Device

When on a physical iOS device, when I tap StartListenAsync on the OfflineSpeechToText page in CommunityToolkit.Maui.Sample, the app freezes and I can no longer tap StopListenAsync. My guess is that we're over-using the UI thread somewhere.

Android Emulator

When on an Android Emulator, when I tap StartListenAsync on the OfflineSpeechToText page in CommunityToolkit.Maui.Sample, I get the system notification that the microphone is being used, however the State label text never changes and the Language Output label text never displays my text.

Android Device

On a physical Android device, when I tap StartListenAsync on the OfflineSpeechToText page in CommunityToolkit.Maui.Sample, the State label text does change and the Language Output label text does display my text.

However, I noticed that the State label text doesn't change to Listening until I actually start speaking. We should be updating the CurrentState property to Listening as soon as we activate the microphone.

Windows

On Windows, when I tap StartListenAsync on the OfflineSpeechToText page in CommunityToolkit.Maui.Sample the app crashes:

System.IO.FileNotFoundException: 'Could not find file 'C:\GitHub\CommunityToolkit.Maui\samples\CommunityToolkit.Maui.Sample\bin\Debug\net10.0-windows10.0.19041.0\win-arm64\AppxManifest.xml'.'

recognitionTask?.Cancel();
recognitionTask?.Finish();
audioEngine.Stop();
audioEngine.InputNode.RemoveTapOnBus(0);
Collaborator

Two concerns here:

  1. Race Condition between the developer calling StopListening() and the timer calling OnSilenceTimerTick
  2. Calling audioEngine.InputNode.RemoveTapOnBus(0) after it has already been called

For the race condition, we can add a new field, readonly Lock stopListeningLock = new(), and wrap the entire method body in a lock:

void InternalStopListening()
{
    lock(stopListeningLock)
    {

    }
}

For RemoveTapOnBus(0), I'm not an expert here. Do bad things happen when we call this after it has previously been called?

We could always check first to see if the AudioEngine is Running before executing this code:

if (audioEngine.Running)
{
	audioEngine.Stop();
	audioEngine.InputNode.RemoveTapOnBus(0);
}
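
Putting the two suggestions together, a sketch of how the lock and the Running check could combine. `FakeAudioEngine` and `Listener` are hypothetical stand-ins so the example runs without Apple frameworks, and a plain `object` is used as the gate in place of the `Lock` type proposed above, on the assumption that it behaves the same way with the `lock` statement.

```csharp
using System;

// Hypothetical stand-in for AVAudioEngine; it only tracks state and call counts.
public sealed class FakeAudioEngine
{
	public bool Running { get; private set; }
	public int RemoveTapCalls { get; private set; }

	public void Start() => Running = true;
	public void Stop() => Running = false;
	public void RemoveTapOnBus(int bus) => RemoveTapCalls++;
}

public sealed class Listener
{
	// Plain object gate; the review proposes the dedicated Lock type instead.
	readonly object stopListeningLock = new();
	readonly FakeAudioEngine audioEngine = new();

	public FakeAudioEngine Engine => audioEngine;

	public void StartListening()
	{
		lock (stopListeningLock)
		{
			audioEngine.Start();
		}
	}

	// May be called by the developer, the recognizer callback, and the silence timer.
	public void InternalStopListening()
	{
		lock (stopListeningLock)
		{
			// Second and later callers see Running == false and skip the teardown.
			if (!audioEngine.Running)
			{
				return;
			}

			audioEngine.Stop();
			audioEngine.InputNode_RemoveTap();
		}
	}
}

// Small extension so the call site above reads close to the original code.
public static class FakeAudioEngineExtensions
{
	public static void InputNode_RemoveTap(this FakeAudioEngine engine) => engine.RemoveTapOnBus(0);
}
```

Because the Running check happens inside the lock, the tap is removed at most once per session even when several callers stop at the same time. Whether a repeated RemoveTapOnBus(0) is harmful on the real engine is still the open question raised above.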

/// The duration of continuous silence after which speech recognition will automatically stop.
/// Use <see cref="TimeSpan.MaxValue"/> (the default) to indicate that auto-stop based on silence is disabled.
/// </summary>
public TimeSpan AutoStopSilenceTimeout { get; init; } = TimeSpan.MaxValue;
Collaborator

@TheCodeTraveler TheCodeTraveler Feb 24, 2026

Let's add a bounds-check here and throw an exception when a developer passes in a negative or zero value:

Suggested change
public TimeSpan AutoStopSilenceTimeout { get; init; } = TimeSpan.MaxValue;
public TimeSpan AutoStopSilenceTimeout
{
	get;
	init
	{
		ArgumentOutOfRangeException.ThrowIfNegativeOrZero(value.TotalMilliseconds);
		field = value;
	}
} = TimeSpan.MaxValue;

intent.PutExtra(RecognizerIntent.ExtraLanguagePreference, javaLocale);
intent.PutExtra(RecognizerIntent.ExtraOnlyReturnLanguagePreference, javaLocale);

if (options.AutoStopSilenceTimeout < TimeSpan.MaxValue && options.AutoStopSilenceTimeout > TimeSpan.Zero)
Collaborator

After we add the bounds-check to SpeechToTextOptions.AutoStopSilenceTimeout, we can update this if statement:

Suggested change
if (options.AutoStopSilenceTimeout < TimeSpan.MaxValue && options.AutoStopSilenceTimeout > TimeSpan.Zero)
if (options.AutoStopSilenceTimeout < TimeSpan.MaxValue)


audioEngine = new AVAudioEngine
{
	AutoShutdownEnabled = false
Collaborator

Do we no longer need AutoShutdownEnabled = false on MacCatalyst now that we're adding SpeechToTextOptions.AutoStopSilenceTimeout?

@@ -35,6 +35,11 @@ public event EventHandler<SpeechToTextStateChangedEventArgs> StateChanged
public async Task StartListenAsync(SpeechToTextOptions options, CancellationToken cancellationToken = default)
Collaborator

As Copilot pointed out, we have a potential race condition here.

Let's add a SemaphoreSlim to ensure that only one thread is executing this method at a time:

public sealed partial class OfflineSpeechToTextImplementation : ISpeechToText
{
	readonly SemaphoreSlim startListeningSemaphoreSlim = new(1, 1);

	public async Task StartListenAsync(SpeechToTextOptions options, CancellationToken cancellationToken = default)
	{
		cancellationToken.ThrowIfCancellationRequested();

		await startListeningSemaphoreSlim.WaitAsync(cancellationToken);

		try
		{
			if (CurrentState is not SpeechToTextState.Stopped)
			{
				return;
			}

			await InternalStartListening(options, cancellationToken);
		}
		finally
		{
			startListeningSemaphoreSlim.Release();
		}
	}
}

recognitionTask?.Cancel();
recognitionTask?.Finish();
audioEngine.Stop();
audioEngine.InputNode.RemoveTapOnBus(0);
Collaborator

Two concerns here:

  1. Race Condition between the developer calling StopListening() and the timer calling OnSilenceTimerTick
  2. Calling audioEngine.InputNode.RemoveTapOnBus(0) after it has already been called

For the race condition, we can add a new field, readonly Lock stopListeningLock = new(), and wrap the entire method body in a lock:

void InternalStopListening()
{
    lock(stopListeningLock)
    {

    }
}

For RemoveTapOnBus(0), I'm not an expert here. Do bad things happen when we call this after it has previously been called?

We could always check first to see if the AudioEngine is Running before executing this code:

if (audioEngine.Running)
{
	audioEngine.Stop();
	audioEngine.InputNode.RemoveTapOnBus(0);
}

intent.PutExtra(RecognizerIntent.ExtraLanguagePreference, javaLocale);
intent.PutExtra(RecognizerIntent.ExtraOnlyReturnLanguagePreference, javaLocale);

if (options.AutoStopSilenceTimeout < TimeSpan.MaxValue && options.AutoStopSilenceTimeout > TimeSpan.Zero)
Collaborator

After we add the bounds-check to SpeechToTextOptions.AutoStopSilenceTimeout, we can update this if statement:

Suggested change
if (options.AutoStopSilenceTimeout < TimeSpan.MaxValue && options.AutoStopSilenceTimeout > TimeSpan.Zero)
if (options.AutoStopSilenceTimeout < TimeSpan.MaxValue)

@@ -35,7 +35,11 @@ public event EventHandler<SpeechToTextStateChangedEventArgs> StateChanged
public async Task StartListenAsync(SpeechToTextOptions options, CancellationToken cancellationToken = default)
Collaborator

@TheCodeTraveler TheCodeTraveler Feb 24, 2026

As Copilot pointed out, we have a potential race condition here.

Let's add a SemaphoreSlim to ensure that only one thread is executing this method at a time:

public sealed partial class SpeechToTextImplementation : ISpeechToText
{
	readonly SemaphoreSlim startListeningSemaphoreSlim = new(1, 1);

	public async Task StartListenAsync(SpeechToTextOptions options, CancellationToken cancellationToken = default)
	{
		cancellationToken.ThrowIfCancellationRequested();

		await startListeningSemaphoreSlim.WaitAsync(cancellationToken);

		try
		{
			if (CurrentState is not SpeechToTextState.Stopped)
			{
				return;
			}

			await InternalStartListeningAsync(options, cancellationToken).ConfigureAwait(false);
		}
		finally
		{
			startListeningSemaphoreSlim.Release();
		}
	}
}

3 participants