LightLemma

A lightweight, fast English lemmatizer and stemmer for high-performance text normalization.

Quick Start

pip install lightlemma

from lightlemma import lemmatize, stem, tokenize, text_to_lemmas

# Lemmatization (dictionary-based, linguistically accurate)
lemmatize("running")  # → "run"
lemmatize("better")   # → "good"

# Stemming (rule-based, faster)
stem("running")       # → "run"
stem("studies")       # → "studi"

# Tokenization
tokenize("Hello, world!")  # → ["hello", "world"]

# Complete text processing
text_to_lemmas("The cats are running")  # → ["the", "cat", "be", "run"]

Key Features

Fast lemmatization: Dictionary-based word normalization
Porter stemmer: Rule-based suffix removal
Flexible tokenization: Customizable text splitting
Zero dependencies: Lightweight and self-contained
Simple API: Easy integration into existing projects

Lemmatization vs Stemming

Feature	Lemmatization	Stemming
Output	Real words	May produce non-words
Method	Dictionary + morphology	Rule-based suffix removal
Speed	Moderate	Fast
Accuracy	High	Moderate
Example	"studies" → "study"	"studies" → "studi"

Documentation

For detailed usage examples, advanced features, and complete API documentation, see README_FULL.md.

License

MIT License - see the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
.github/workflows		.github/workflows
lightlemma		lightlemma
tests		tests
.gitignore		.gitignore
DEPLOYMENT.md		DEPLOYMENT.md
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.md		README.md
README_FULL.md		README_FULL.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

LightLemma

Quick Start

Key Features

Lemmatization vs Stemming

Documentation

License

About

Uh oh!

Releases 2

Packages

Uh oh!

Languages

License

xga0/lightlemma

Folders and files

Latest commit

History

Repository files navigation

LightLemma

Quick Start

Key Features

Lemmatization vs Stemming

Documentation

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Languages

Packages