KittenTTS Web

Web and Node.js SDK for on-device KittenTTS speech synthesis.
Generate speech in browsers and Node.js without sending text to a cloud TTS API.

Developer preview. APIs may change between releases.

Browser apps use ONNX Runtime Web and browser storage. Node.js apps use ONNX Runtime Web with filesystem storage by default.

In browsers, the ONNX Runtime wasm assets are loaded from the matching ONNX Runtime Web CDN by default. For production apps that need CDN independence or stricter supply-chain controls, self-host those ONNX Runtime assets and set ortWasmPath.

See It In Action

Web · Plain HTML example with local speech generation and playback

What Is KittenTTS Web?

KittenTTS Web lets you add local speech synthesis to browser and Node.js apps:

Text-to-speech - neural voice synthesis from plain text.
On-device inference - powered by KittenTTS and ONNX Runtime Web.
Private by default - no cloud TTS request after assets are available.
Offline-ready - download once into browser or filesystem cache, or provide preloaded model bytes.
App-friendly output - play audio directly, save WAV or MP3 data, stream longer text, or use generated word timings for read-aloud UI.

No cloud. No API key. No text leaving the device for speech generation.

The SDK sends anonymous generation analytics by default. Set analytics: false to opt out; see Getting started for details.

SDK

Runtime	Status	Docs
Browser	Developer preview	Getting started
Node.js	Developer preview	Getting started
Plain HTML example	Supported	HTML example
Vite React example	Supported	Vite React example
Node Express example	Supported	Node Express example

Quick Start

Install the SDK:

npm install @kittentts/web

Generate audio in memory:

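import { KittenTTS } from '@kittentts/web';

// Load the model; the callback receives setup progress as a fraction (0 to 1).
const tts = await KittenTTS.create(
  {
    model: 'nano-int8',
  },
  (progress) => {
    console.log(`setup ${Math.round(progress * 100)}%`);
  },
);

// Synthesize speech in memory without playing it.
const result = await tts.generate('Hello from KittenTTS on the web.');

console.log(result.sampleRate);
console.log(result.wavBase64());
console.log(await result.mp3Base64()); // MP3 encoding is asynchronous

// Release the resources held by this instance.
await tts.dispose();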

Play audio in a browser:

import {
  KittenTTS,
  createBrowserAudioPlayer,
} from '@kittentts/web';

const tts = await KittenTTS.create({
  player: createBrowserAudioPlayer(),
});

await tts.speak('This voice is generated in the browser.');

Generate audio in Node.js:

import { writeFile } from 'node:fs/promises';
import { KittenTTS } from '@kittentts/web';

const tts = await KittenTTS.create({
  model: 'nano-int8',
});

const result = await tts.generate('Generated in Node.js.');
await writeFile('speech.wav', result.wavData());
await writeFile('speech.mp3', await result.mp3Data());
await tts.dispose();

Full getting started guide →

Browser Setup

Browser apps can use the SDK directly from a frontend bundle:

import {
  KittenTTS,
  createBrowserAudioPlayer,
} from '@kittentts/web';

const tts = await KittenTTS.create({
  player: createBrowserAudioPlayer(),
});

The SDK configures ONNX Runtime Web wasm assets automatically. Pass ortWasmPath when you need to self-host those files.
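
As a rough illustration of the self-hosting option, the creation options might look like the sketch below. Only the `model` and `ortWasmPath` option names come from this README. The folder path, and the assumption that `ortWasmPath` takes a URL prefix string, are hypothetical, so check the getting started guide for the accepted form.

```typescript
// Sketch of creation options for a self-hosted setup.
// Assumption: ortWasmPath accepts a URL prefix for the folder that serves
// the ONNX Runtime Web wasm files. The path below is hypothetical.
const selfHostedOptions = {
  model: 'nano-int8',
  ortWasmPath: '/vendor/onnxruntime-web/',
};

// In an app, this object would be passed to KittenTTS.create(selfHostedOptions).
```

Serving the wasm files from your own origin keeps the runtime version pinned to whatever you deploy, which is the point of the supply-chain note earlier in this README.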

Run the plain HTML example locally:

npm run example:html

Open http://127.0.0.1:5173, or open examples/html/index.html directly in a browser. Direct file:// usage falls back to in-memory asset storage, so a refresh may download model assets again.
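
The fallback behavior described above can be pictured with a small sketch. This is not the SDK's storage code: `AssetStore`, `MemoryAssetStore`, and `openStore` are hypothetical names used only to show the general pattern of trying a persistent store and falling back to memory when it is unavailable.

```typescript
// Hypothetical storage interface; the SDK's real storage API may differ.
interface AssetStore {
  get(key: string): Promise<Uint8Array | undefined>;
  set(key: string, bytes: Uint8Array): Promise<void>;
}

// Keeps assets only for the lifetime of the page or process.
class MemoryAssetStore implements AssetStore {
  private items = new Map<string, Uint8Array>();

  async get(key: string): Promise<Uint8Array | undefined> {
    return this.items.get(key);
  }

  async set(key: string, bytes: Uint8Array): Promise<void> {
    this.items.set(key, bytes);
  }
}

// Tries to open a persistent store; falls back to memory if opening throws.
function openStore(openPersistent: () => AssetStore): AssetStore {
  try {
    return openPersistent();
  } catch {
    return new MemoryAssetStore();
  }
}
```

Because a memory store is empty after every reload, anything cached in it has to be fetched again, which matches the repeated download noted for file:// usage.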

Sample Apps

examples/html - static HTML, CSS, and JavaScript setup.
examples/vite-react - Vite React browser setup.
examples/vite-react-word-timings - word highlighting with generated timings.
examples/node-express - Node Express backend-to-browser playback.
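
For the word-highlighting case, the core step is mapping the current playback time to a word. The sketch below shows one way to do that with a binary search. The `WordTiming` shape and its field names are assumptions made for illustration; the SDK's generated timings may use different names or units.

```typescript
// Hypothetical timing shape; the SDK's actual field names may differ.
interface WordTiming {
  word: string;
  startSec: number;
  endSec: number;
}

// Returns the index of the word being spoken at timeSec, or -1 if none.
// Assumes timings are sorted by startSec and do not overlap.
function activeWordIndex(timings: WordTiming[], timeSec: number): number {
  let lo = 0;
  let hi = timings.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const t = timings[mid];
    if (timeSec < t.startSec) hi = mid - 1;
    else if (timeSec >= t.endSec) lo = mid + 1;
    else return mid;
  }
  return -1;
}

// Made-up sample data for demonstration only.
const sampleTimings: WordTiming[] = [
  { word: 'Hello', startSec: 0.0, endSec: 0.4 },
  { word: 'from', startSec: 0.45, endSec: 0.6 },
  { word: 'KittenTTS', startSec: 0.6, endSec: 1.2 },
];
```

In a browser, a function like this could be called with an audio element's currentTime on each timeupdate event to decide which word to highlight.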

Features

On-device TTS inference in browsers and Node.js.
Model download and cache with progress callbacks.
Offline assets for apps that cannot depend on a first-run download.
Playback helpers for browser audio and custom audio layers.
WAV and MP3 output from generated raw PCM samples.
Word timings for read-aloud highlighting.
Streaming generation for longer text.
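
To make the WAV output feature concrete, here is a minimal sketch of wrapping raw PCM samples in a WAV container. It is not the SDK's encoder: `encodeWav` is a hypothetical helper, and mono 16-bit output is an assumption chosen for simplicity.

```typescript
// Wraps mono float samples in the range [-1, 1] in a 16-bit PCM WAV file.
function encodeWav(samples: Float32Array, sampleRate: number): Uint8Array {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, 'WAVE');
  writeAscii(12, 'fmt ');
  view.setUint32(16, 16, true); // fmt chunk size
  view.setUint16(20, 1, true); // format 1 = uncompressed PCM
  view.setUint16(22, 1, true); // channel count (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, 'data');
  view.setUint32(40, dataSize, true);

  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    // Scale to the signed 16-bit range.
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(44 + i * 2, Math.round(scaled), true);
  }
  return new Uint8Array(buffer);
}
```

In practice the SDK result already exposes WAV data, so a helper like this is only useful for understanding the format or for custom audio layers that start from raw samples.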

Supported Models

Start with nano-int8 for the smallest download. Use larger models when quality matters more than size.

Model	ID	Parameters	Approx download	Use case
Nano int8	`'nano-int8'`	15M	25 MB	Smallest app/download size
Nano fp32	`'nano'`	15M	56 MB	Nano quality without quantization
Micro	`'micro'`	40M	41 MB	Better quality, still compact
Mini	`'mini'`	80M	80 MB	Highest quality option

Models and voices →
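
If an app needs to choose a model from a download budget, the table above can be encoded directly. The sketch below is one possible heuristic, not SDK functionality: `pickModel` is a hypothetical helper, and treating parameter count as a stand-in for quality is an assumption.

```typescript
interface ModelInfo {
  id: string;
  parametersM: number; // parameters, in millions
  downloadMB: number; // approximate download size
}

// Values copied from the Supported Models table.
const MODELS: ModelInfo[] = [
  { id: 'nano-int8', parametersM: 15, downloadMB: 25 },
  { id: 'nano', parametersM: 15, downloadMB: 56 },
  { id: 'micro', parametersM: 40, downloadMB: 41 },
  { id: 'mini', parametersM: 80, downloadMB: 80 },
];

// Picks the model with the most parameters that fits the budget.
// Ties go to the smaller download. Returns undefined if nothing fits.
function pickModel(maxDownloadMB: number): string | undefined {
  const fitting = MODELS.filter((m) => m.downloadMB <= maxDownloadMB);
  fitting.sort(
    (a, b) => b.parametersM - a.parametersM || a.downloadMB - b.downloadMB,
  );
  return fitting[0]?.id;
}
```

Note that this tie-break never selects 'nano' over 'nano-int8'. An app that prefers unquantized output would need a different ranking.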

Voices

Bella, Jasper, Luna, Bruno, Rosie, Hugo, Kiki, Leo

await tts.speak('Luna speaking.', { voice: 'luna' });
await tts.speak('Slower Bruno speaking.', { voice: 'bruno', speed: 0.85 });

Docs

System Requirements

Node.js 20+
Modern browser with WebAssembly support for browser apps
Network access to Hugging Face for first-run model downloads, unless assets are preloaded or self-hosted
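
An app can check the first two requirements at startup before loading any model assets. The sketch below is a generic runtime probe, not part of the SDK, and `checkRuntime` is a hypothetical name.

```typescript
// Reports WebAssembly availability and the Node.js major version, if any.
function checkRuntime(): { webAssembly: boolean; nodeMajor: number | null } {
  const g = globalThis as any;
  const webAssembly =
    typeof g.WebAssembly === 'object' &&
    typeof g.WebAssembly.instantiate === 'function';
  // process.versions.node is normally defined only in Node.js.
  const nodeVersion: string | undefined = g.process?.versions?.node;
  const nodeMajor = nodeVersion
    ? parseInt(nodeVersion.split('.')[0], 10)
    : null;
  return { webAssembly, nodeMajor };
}
```

A caller might show a friendly error when webAssembly is false, or when nodeMajor is a number below 20.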

Runtime dependencies installed with the SDK:

onnxruntime-web
pako

Audio playback is optional. Use createBrowserAudioPlayer() in browsers or pass a custom AudioPlayer. Use tts.createPlaybackQueue() when multiple generated clips should play in order instead of interrupting each other.
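
The idea behind an ordered playback queue can be sketched with promise chaining. This is an illustration of the concept only: `SequentialQueue` is a hypothetical class and says nothing about how tts.createPlaybackQueue() is implemented or what it returns.

```typescript
// Hypothetical stand-in for a playback queue; not the SDK's implementation.
class SequentialQueue {
  private tail: Promise<void> = Promise.resolve();

  // Starts the task only after every previously enqueued task has settled.
  enqueue(task: () => Promise<void>): Promise<void> {
    const run = this.tail.then(task);
    // Swallow failures on the chain so one failed clip does not block the rest.
    this.tail = run.catch(() => {});
    return run;
  }
}
```

Each task would wrap the playback of one clip and resolve when that clip finishes, so a later clip waits its turn instead of cutting off the current one.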

Roadmap

Add more streaming playback examples.
Add more browser storage and offline asset examples.
Continue tracking ONNX Runtime Web compatibility across browsers and Node.js.
Support future KittenTTS model releases as they become available.

Need something specific? Open an issue.

Community And Support

Website: kittenml.com
Repository: KittenML/KittenTTS-web
Discord: Join the community
Demo: Hugging Face Spaces
Issues: GitHub Issues
Commercial support: contact form

Commercial Support

Commercial support is available for teams integrating KittenTTS into their products, including integration assistance, custom voice development, and enterprise licensing.

Contact us or email info@stellonlabs.com to discuss your requirements.

License

Apache 2.0. See LICENSE.

Disclaimers

KittenTTS Web is a developer preview, and APIs may change between releases. Generated speech quality, pronunciation, timing metadata, and playback behavior can vary by model, browser, device, and runtime. Review generated audio before using it in production workflows.

The SDK runs speech generation locally after assets are available. Anonymous generation analytics are enabled by default and can be disabled with analytics: false.
