A CLI tool for testing translation memory (TM) systems by generating test data, creating TMX files, and calculating Levenshtein-based match scores.
Many machine translation tools (e.g. DeepL Translate) now support the integration of translation memories. If there is a match in the source text, the MT system will prefer the human-validated translation from the translation memory over the AI-generated version. This improves consistency and quality of the translation, while keeping the flexibility of AI translation when there's no match.
But how well does this work in practice? This tool lets you find out. Here's the basic flow:
- Generate text in English, translate it to German.
- Build a translation memory based on these texts and the TMX standard.
- Create a variant of the source text.
- Upload the translation memory to your machine translation service. Most services accept the TMX format. Example: DeepL lets you upload TMX files in the customization hub.
- Translate the source text variant with the MT service.
- Evaluate how well the TM integration works in the MT output. Compare with the matching report generated by this tool to see if matches were inserted as expected.
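You will probably notice big differences between the matches this tool predicts and what the MT service actually inserts. These are usually caused by segmentation misalignment: the service splits the text into different units than the ones stored in the TM. This tool lets you experiment with different segmentation approaches to see which works better.
- Text Generation: Generate English source text about electric vehicles in cities using the Claude API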
- Smart Segmentation: Segment text into TM-style units (short or long mode)
- German Translation: Translate segments to casual, fluent German
- TMX Export: Export to TMX 1.4 format for import into OmegaT, DeepL, memoQ, etc.
- Variation Generation: Create test variations targeting specific match percentages (100%, 95-99%, 85-94%, 70-84%, 50-69%, 0-49%)
- Match Scoring: Calculate Levenshtein-based similarity scores
- Reports: Generate JSON and HTML reports with diff highlighting
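
To make "Levenshtein-based similarity" concrete, here is a minimal sketch of how such a score can be computed. It assumes character-level edit distance normalized by the length of the longer string; the tool's actual normalization and tokenization are not documented here, so the names `levenshtein` and `match_score` are illustrative only.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the character-level edit distance between a and b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def match_score(segment: str, tm_source: str) -> float:
    """Similarity as a percentage: 100 * (1 - distance / longer length).

    Assumption: this normalization is one common choice, not necessarily
    the one used by the tool.
    """
    longest = max(len(segment), len(tm_source))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1 - levenshtein(segment, tm_source) / longest)
```

With this normalization, a single changed character in a four-character segment already drops the score to 75%, which is why short segments are very sensitive to small edits.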
- Python 3.10+
- UV package manager (recommended)

    # Clone the repository
    git clone https://github.com/rhofkens/translation-memory-tester.git
    cd translation-memory-tester

    # Create virtual environment and install
    uv venv
    source .venv/bin/activate
    uv pip install -e ".[dev]"

Set your Anthropic API key:

    export ANTHROPIC_API_KEY='your-api-key'

Run the complete pipeline with a single command:

    tmtest run-all --output-dir ./my-test

This will:
- Generate ~500 words of English text about EVs
- Segment the text into TM units
- Translate to German (casual style)
- Export to TMX format
- Generate variations for different match levels
- Calculate match scores and generate reports

Run the complete pipeline.

    tmtest run-all --output-dir ./output --verbose --segment-mode short

Options:

- `--output-dir`, `-o`: Output directory (default: `./output`)
- `--verbose`, `-V`: Show detailed output
- `--segment-mode`, `-m`: `short` for better TM reuse, `long` for full sentences

Generate English source text.

    tmtest generate --output source.json

Segment source text into TM units.

    tmtest segment source.json --output segments.json --mode short

Options:

- `--mode`, `-m`: `short` (splits at conjunctions) or `long` (full sentences)

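The difference between the two modes can be illustrated with a small rule-based sketch. This is not the tool's segmenter: the regular expressions, the conjunction list, and the function name `segment` are all assumptions chosen for illustration.

```python
import re

# Assumption: sentences end with ".", "!" or "?" followed by whitespace.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")
# Assumption: a comma followed by one of these conjunctions is a split point.
CLAUSE_BREAK = re.compile(r",\s+(?=(?:and|but|or|because|while)\b)")


def segment(text: str, mode: str = "long") -> list[str]:
    """Split text into full sentences ("long") or clause-level units ("short")."""
    sentences = [s.strip() for s in SENTENCE_END.split(text) if s.strip()]
    if mode == "long":
        return sentences
    if mode != "short":
        raise ValueError(f"unknown mode: {mode!r}")
    units: list[str] = []
    for sentence in sentences:
        # The comma before the conjunction is consumed by the split.
        units.extend(p.strip() for p in CLAUSE_BREAK.split(sentence) if p.strip())
    return units


text = "EVs are quiet, but they are expensive. Cities like them."
print(segment(text, "long"))
print(segment(text, "short"))
```

Shorter units are more likely to reappear unchanged in a new text, which is the reasoning behind "better TM reuse" for `short` mode.
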
Translate segments to German.

    tmtest translate segments.json --output translated.json

Validate German translations for coherence.

    tmtest validate translated.json
    tmtest validate translated.json --fix --output fixed.json

Export to TMX format.

    tmtest export-tmx translated.json --output memory.tmx

Generate variations for different match percentages.

    tmtest variate segments.json --output variations.json

Match variations against TM and generate reports.


    # JSON report
    tmtest match variations.json --tm memory.tmx --output report.json

    # HTML report with diff highlighting
    tmtest match variations.json --tm memory.tmx --output report.html --format html

| File | Description |
|---|---|
| `source.json` | Generated English text with word count |
| `segments.json` | Segmented text with segment IDs |
| `translated.json` | Segments with German translations |
| `memory.tmx` | TMX file for import into TM systems |
| `variations.json` | Test variations tagged by intended match category |
| `report.json` | Match results with scores and statistics |
| `report.html` | Visual report with diff highlighting |

| Category | Score Range | Variation Strategy |
|---|---|---|
| Exact | 100% | Identical segments |
| Near-exact | 95-99% | Punctuation/capitalization changes |
| High Fuzzy | 85-94% | Synonym substitutions |
| Medium Fuzzy | 70-84% | Phrase-level changes |
| Low Fuzzy | 50-69% | Significant rewrites |
| No Match | 0-49% | New content |
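
The table above can be read as a simple threshold lookup. The sketch below maps a score to its category; it assumes that fractional scores fall into the band whose lower bound they reach (so 94.5 counts as High Fuzzy), which the table does not specify. The names `CATEGORIES` and `categorize` are hypothetical.

```python
# Lower bounds taken from the table above, ordered from highest to lowest.
CATEGORIES = [
    (100.0, "Exact"),
    (95.0, "Near-exact"),
    (85.0, "High Fuzzy"),
    (70.0, "Medium Fuzzy"),
    (50.0, "Low Fuzzy"),
    (0.0, "No Match"),
]


def categorize(score: float) -> str:
    """Return the match category for a similarity score between 0 and 100."""
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"score out of range: {score}")
    for lower_bound, name in CATEGORIES:
        if score >= lower_bound:
            return name
    raise AssertionError("unreachable: 0.0 is the lowest bound")
```
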
The generated TMX files are compatible with:
- OmegaT
- DeepL Translation Memory
- memoQ
- SDL Trados
- Any TMX 1.4-compliant system
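
For readers who have not seen a TMX file, the following sketch builds a minimal TMX 1.4 document with Python's standard library. It is not the tool's exporter: the function name `build_tmx` and the header values (such as `creationtool`) are placeholders, and only the required header attributes are set.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the attribute xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs: list[tuple[str, str]], src: str = "en", tgt: str = "de") -> ET.Element:
    """Build a TMX 1.4 tree with one translation unit per (source, target) pair."""
    root = ET.Element("tmx", version="1.4")
    ET.SubElement(root, "header", {
        "creationtool": "example-sketch",  # placeholder value
        "creationtoolversion": "0.1",      # placeholder value
        "segtype": "sentence",
        "o-tmf": "plain",
        "adminlang": "en",
        "srclang": src,
        "datatype": "plaintext",
    })
    body = ET.SubElement(root, "body")
    for source_text, target_text in pairs:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src, source_text), (tgt, target_text)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return root


root = build_tmx([("Electric buses are quiet.", "Elektrobusse sind leise.")])
print(ET.tostring(root, encoding="unicode"))
```

Each `<tu>` holds one segment pair, so the segmentation chosen when building the TM directly determines what an importing system can match against.
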

    # Install dev dependencies
    uv pip install -e ".[dev]"

    # Run tests
    pytest

    # Format code
    ruff format src/
    ruff check src/ --fix

License: MIT