Check Arabic audio quality before shipping. Silence gaps, clipping, noise, sample rate — all caught.
Arabic TTS is booming — 13+ models competing on the Arabic TTS Arena. But nobody checks the output quality automatically. Models ship audio with silence gaps, clipping, low volume, background noise, and wrong sample rates. Every team builds ad-hoc scripts to catch the same problems.
samt analyzes WAV files for silence gaps, clipping, low volume, noise floor, and format issues. It gives you a quality score (0-100) per file, flags specific problems, and works across entire directories.
Offline. No API. Pure Python stdlib.
    pip install samt

    # Check a single file
    samt check recording.wav

    # Scan an entire directory
    samt scan audio/

    # Get file info (duration, sample rate, channels, bit depth)
    samt info file.wav

    # Compare two TTS outputs side by side
    samt compare tts1.wav tts2.wav

| Command | Description |
|---|---|
| `check` | Run all 20 checks on a WAV file. Shows quality score and flagged issues. |
| `scan` | Recursively scan a directory. Summary table with per-file scores. |
| `info` | Display audio metadata: duration, sample rate, channels, bit depth, file size. |
| `compare` | Compare two audio files side by side. Highlights differences in quality metrics. |
| `explain` | Show what each check does, thresholds, and scoring weights. |
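The metadata that `info` reports and the recursive walk that `scan` performs can both be approximated with the standard library. The sketch below is an illustration only: `wav_info` and `find_wavs` are hypothetical helper names, not part of samt's API, and samt's own implementation may differ.

```python
import os
import wave
from pathlib import Path


def wav_info(path):
    """Return basic metadata for a PCM WAV file (hypothetical helper)."""
    with wave.open(str(path), "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "duration_s": frames / rate if rate else 0.0,
            "sample_rate": rate,
            "channels": wf.getnchannels(),
            "bit_depth": wf.getsampwidth() * 8,  # bytes per sample -> bits
            "file_size": os.path.getsize(path),
        }


def find_wavs(root):
    """Recursively list .wav files under root, sorted for stable output."""
    return sorted(p for p in Path(root).rglob("*") if p.suffix.lower() == ".wav")
```

Calling `wav_info(p)` for each path returned by `find_wavs("audio/")` gives a per-file summary similar in spirit to a directory scan, without the quality checks.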
20 checks across 5 categories:

| Check | What it catches |
|---|---|
| Leading silence | Dead air at the start of the file |
| Trailing silence | Dead air at the end of the file |
| Internal gaps | Long silence gaps mid-utterance |
| Silence ratio | Too much total silence relative to audio length |

| Check | What it catches |
|---|---|
| Clipping | Samples hitting max amplitude, producing distorted peaks |
| Low volume | Audio too quiet to be usable |
| Noise floor | Background noise level too high |
| Dynamic range | Audio too compressed or too dynamic |

| Check | What it catches |
|---|---|
| Sample rate | Non-standard rates (expects 16kHz, 22.05kHz, 44.1kHz, or 48kHz) |
| Bit depth | Unusual bit depths (expects 16-bit or 24-bit PCM) |

| Check | What it catches |
|---|---|
| Duration range | Arabic utterances too short (<0.5s) or too long (>30s) for TTS |
| Timing density | Speech-to-silence ratio outside normal Arabic speech patterns |

| Check | What it catches |
|---|---|
| Click detection | Pops and clicks from bad edits or recording artifacts |
| DC offset | Waveform not centered on zero, which causes downstream issues |
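To give a feel for what a few of these checks measure, here is a minimal sketch over signed 16-bit samples. The function names, the 1% silence threshold, and the clipping margin are assumptions chosen for illustration; they are not samt's actual thresholds or code (use `samt explain` for those).

```python
import math

FULL_SCALE = 32767  # largest positive value of a signed 16-bit sample


def leading_silence_seconds(samples, sample_rate, threshold=0.01):
    """Seconds before the first sample louder than threshold * full scale."""
    limit = threshold * FULL_SCALE
    for i, s in enumerate(samples):
        if abs(s) > limit:
            return i / sample_rate
    return len(samples) / sample_rate  # the whole signal is silent


def clipping_ratio(samples, margin=1):
    """Fraction of samples at, or within `margin` of, full scale."""
    if not samples:
        return 0.0
    clipped = sum(1 for s in samples if abs(s) >= FULL_SCALE - margin)
    return clipped / len(samples)


def rms_dbfs(samples):
    """RMS level in dB relative to full scale; -inf for digital silence."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms / FULL_SCALE) if rms > 0 else float("-inf")


def dc_offset(samples):
    """Mean sample value as a fraction of full scale; near 0.0 when centered."""
    if not samples:
        return 0.0
    return sum(samples) / len(samples) / FULL_SCALE
```

Each function takes a flat sequence of integer samples, so the same code works on a whole file or on a short window of it.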
Every file gets a score from 0 to 100:
| Score | Rating | Meaning |
|---|---|---|
| 90-100 | Excellent | Ship it |
| 75-89 | Good | Minor issues, likely fine |
| 50-74 | Fair | Review before shipping |
| 25-49 | Poor | Significant quality problems |
| 0-24 | Bad | Do not ship |
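The bands in the table translate directly into a lookup. The sketch below only maps a finished score to its rating; `rating_for` is a hypothetical name, and how samt weights individual checks to arrive at the score is not shown here.

```python
# Lower bound of each band, highest first, taken from the table above.
RATING_BANDS = [
    (90, "Excellent"),
    (75, "Good"),
    (50, "Fair"),
    (25, "Poor"),
    (0, "Bad"),
]


def rating_for(score):
    """Map a 0-100 quality score to its rating label."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower, label in RATING_BANDS:
        if score >= lower:
            return label
```

A pipeline could, for example, block a release whenever `rating_for(score)` returns "Poor" or "Bad".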
samt uses only Python standard library modules:
- `wave`: read WAV headers and PCM data
- `struct`: unpack raw audio samples
No librosa. No numpy. No scipy. No soundfile. Install and run anywhere Python runs.
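As an illustration of how far those two modules go, the following sketch decodes a 16-bit PCM WAV file into plain Python integers. `read_pcm16` is a hypothetical helper written for this example, and it assumes little-endian 16-bit samples; it is not taken from samt's source.

```python
import struct
import wave


def read_pcm16(path):
    """Return (sample_rate, channels, samples) for a 16-bit PCM WAV file.

    `samples` is a flat tuple of signed ints, interleaved by channel.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wf.getframerate()
        channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    count = len(raw) // 2  # two bytes per sample
    samples = struct.unpack("<%dh" % count, raw[: count * 2])
    return rate, channels, samples
```

For stereo files, `samples[0::2]` and `samples[1::2]` separate the left and right channels.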
مقدمة من مجتمع الذكاء الاصطناعي السعودي للعرب أولا وللعالم أجمع
Brought to you by the Saudi AI Community — for Arabs first, and the world at large.
MIT — Musa the Carpenter
The Series: artok · bidi-guard · arabench · majal · khalas · safha · raqeeb · sarih · qalam · naql · samt