Skip to content

Fix: Add missing Tokenizer and TokenizationError imports#77

Open
ada-cinar wants to merge 2 commits into
mainfrom
feature/76-fix-tokenizer-import
Open

Fix: Add missing Tokenizer and TokenizationError imports#77
ada-cinar wants to merge 2 commits into
mainfrom
feature/76-fix-tokenizer-import

Conversation

@ada-cinar

Copy link
Copy Markdown
Member

Summary

Fixes #76

Added missing and imports to .

Changes

  • ✅ Added to import statement from module
  • ✅ Added to import statement from module
  • ✅ Verified imports work correctly with PYTHONPATH test

Impact

  • Fixes broken public API contract
  • Enables proper IDE autocomplete and type hints
  • Classes were declared in but not actually imported

Testing

Type

  • Bug fix (non-breaking change)
  • Enhancement
  • Breaking change

Closes #76

- Add LEMMA_DICT_DATA static from resources/tr/lemmas/turkish_lemma_dict.txt
- Update get_lemma_dict() to parse TSV format (inflected<TAB>lemma)
- Load 1,362+ inflected forms → lemmas at build time (zero-overhead)
- Add comprehensive tests for dictionary loading and lookups
- Update Python type stubs with accurate documentation

Coverage:
- High-frequency nouns with case/plural suffixes
- Common verb inflections (tense/aspect/person)
- Systematic vowel harmony patterns

Resolves #74
- Added Tokenizer and TokenizationError to __init__.py imports
- Fixes #76: Public API contract broken for these classes
- Classes were declared in __all__ but not imported from tokenizer module
- Enables proper IDE autocomplete and type hints
@ada-cinar ada-cinar added bug Something isn't working good first issue Good for newcomers core labels Jan 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working core good first issue Good for newcomers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Missing Import for Tokenizer and TokenizationError in __init__.py

1 participant