Return flat mono audio arrays by dewana-sl · Pull Request #138 · KittenML/KittenTTS

dewana-sl · 2026-05-21T12:01:24Z

Summary

Returns generated mono audio as a flat samples array instead of preserving a singleton channel dimension from the ONNX output.

Why

Direct audio pipelines commonly expect mono audio with shape (samples,). With the previous (1, samples) shape, user code such as np.stack([audio, audio], axis=1) produces a 3D array of shape (1, 2, samples) rather than the intended (samples, channels) layout, which can lead to incorrect playback in downstream audio tools.
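The shape difference can be reproduced with NumPy alone. The following is a minimal sketch using zero-filled placeholder arrays in place of real model output; the variable names are illustrative and not taken from the KittenTTS code.

```python
import numpy as np

samples = 4  # tiny placeholder length, for illustration only

# Previous behavior: singleton channel dimension kept, shape (1, samples).
old_audio = np.zeros((1, samples), dtype=np.float32)

# Behavior described in this PR: flat mono array, shape (samples,).
new_audio = old_audio.reshape(-1)

# A common way to build a stereo buffer from a mono signal.
stacked_old = np.stack([old_audio, old_audio], axis=1)
stacked_new = np.stack([new_audio, new_audio], axis=1)

print(stacked_old.shape)  # (1, 2, 4): 3D, not a (samples, channels) buffer
print(stacked_new.shape)  # (4, 2): (samples, channels) as expected
```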

Addresses #112.

Validation

python3 -m unittest -q
Editable install in a fresh virtualenv with the declared package dependencies.
Import smoke test for kittentts, normalize_text, and mono_audio_array.
Real inference smoke test with KittenML/kitten-tts-nano-0.8-int8, asserting that the returned audio is a one-dimensional float32 array.
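The kind of check described in the last item might look like the sketch below. It does not load the model or call the KittenTTS API: `flatten_mono` and `check_mono_float32` are hypothetical helpers written for this illustration, and whether the PR's own mono_audio_array behaves the same way is an assumption.

```python
import numpy as np


def flatten_mono(audio):
    """Hypothetical helper: collapse mono audio with a singleton axis to (samples,)."""
    arr = np.asarray(audio, dtype=np.float32)
    if arr.ndim == 1:
        return arr  # already flat
    if arr.ndim == 2 and 1 in arr.shape:
        return arr.reshape(-1)  # drop the singleton channel axis
    raise ValueError(f"expected mono audio, got shape {arr.shape}")


def check_mono_float32(audio):
    """Hypothetical assertion mirroring the smoke test: 1-D float32 samples."""
    assert audio.ndim == 1, f"expected 1-D audio, got shape {audio.shape}"
    assert audio.dtype == np.float32, f"expected float32, got {audio.dtype}"


# Stand-in for a (1, samples) ONNX output; real inference is not run here.
fake_onnx_output = np.zeros((1, 8), dtype=np.float32)
flat = flatten_mono(fake_onnx_output)
check_mono_float32(flat)
```

Raising on genuinely multi-channel input, rather than silently flattening it, keeps a shape mistake from being interleaved into a single mono stream.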

Return flat mono audio arrays

1512775

dewana-sl wants to merge 1 commit into KittenML:main from dewana-sl:return-mono-audio-arrays

2 participants
