Grammar and Tone Checker #24

**Current file:** `app/src/lib/tools/grammar-checker.ts`  
**Current model:** `llama-3.3-70b`  
**Current approach:** Single prompt asking for corrections, changes list, tone adjustments, and tips. No programmatic grammar checking, no diff generation, no readability scoring.

**Problems with current approach:**
- No way to verify that corrections are actually correct (LLM may introduce new errors).
- The "changes made" list may not match the actual corrected version (inconsistency between sections).
- No quantitative readability metrics.
- Tone adjustment is subjective and not measured.

**Upgrade plan:**

| Step | Agent | Action |
|------|-------|--------|
| 1 | Pre-Analysis | Programmatic: Compute readability scores (Flesch-Kincaid, Gunning Fog). Count sentences, words, syllables. Detect language. Run `language_tool_python` for baseline grammar checks. |
| 2 | Correction Agent | Receive the original text, programmatic grammar findings, and target tone. Generate the corrected version with tone adjustments. |
| 3 | Diff Generator | Programmatic: Compute a word-level diff between original and corrected text. Auto-generate an accurate "changes made" list from the diff, not from LLM memory. |
| 4 | Post-Analysis | Programmatic: Re-compute readability scores on the corrected text. Show before/after comparison. |

- The agent stack above is a reference layout, not a fixed design. Extend or restructure it if that improves results.
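
The readability scoring in steps 1 and 4 can be sketched as below. This is a minimal illustration, not the planned implementation: `countSyllables` and `computeReadability` are hypothetical names, the syllable counter is a vowel-group heuristic that assumes English text, and the Gunning Fog "complex word" test is simplified to "three or more syllables" without the usual exclusions for proper nouns and common suffixes.

```typescript
// Heuristic syllable counter (assumption: English text).
// Counts vowel groups and discounts a trailing silent "e".
function countSyllables(word: string): number {
  const w = word.toLowerCase().replace(/[^a-z]/g, "");
  if (w.length === 0) return 0;
  const groups = w.match(/[aeiouy]+/g) ?? [];
  let count = groups.length;
  if (w.endsWith("e") && !w.endsWith("le") && count > 1) count -= 1;
  return Math.max(count, 1);
}

interface ReadabilityReport {
  sentences: number;
  words: number;
  syllables: number;
  fleschKincaidGrade: number;
  gunningFog: number;
}

function computeReadability(text: string): ReadabilityReport {
  // Naive sentence split on terminal punctuation; abbreviations will over-count.
  const sentenceList = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  const wordList = text.match(/[A-Za-z]+(?:'[A-Za-z]+)*/g) ?? [];
  const syllableCounts = wordList.map(countSyllables);
  const syllables = syllableCounts.reduce((sum, n) => sum + n, 0);
  const complexWords = syllableCounts.filter((n) => n >= 3).length;

  // Clamp denominators so empty input does not divide by zero.
  const s = Math.max(sentenceList.length, 1);
  const w = Math.max(wordList.length, 1);

  return {
    sentences: sentenceList.length,
    words: wordList.length,
    syllables,
    // Flesch-Kincaid grade level formula.
    fleschKincaidGrade: 0.39 * (w / s) + 11.8 * (syllables / w) - 15.59,
    // Gunning Fog index with the simplified complex-word definition above.
    gunningFog: 0.4 * (w / s + 100 * (complexWords / w)),
  };
}
```

Calling `computeReadability` once on the original text and once on the corrected text would give the before/after comparison that step 4 asks for.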

**Model suggestions to start with:**
- Step 2: Try `llama-3.3-70b` (current model, good at natural language). Also try `qwen-3-32b` or `gpt-oss-120b`. For formal/technical writing, try `deepseek-r1-0528`.

**Model Selection Guidance**
- **You are free to pick any model from the Oxlo catalog** based on your own testing and evaluation.
- The model suggestions above are starting points, not mandates. Try them first, and if they do not meet the accuracy target, experiment with alternatives.

**Compare against:** GPT 5.3 Thinking (strong at grammar and tone).

**Acceptance criteria:**
- Changes list must be auto-generated from actual diff (no hallucinated changes).
- Readability scores are computed programmatically (Flesch-Kincaid included).
- Corrected text does not introduce new grammatical errors (verified by re-running grammar checker).
- Overall quality matches or exceeds GPT 5.3 Thinking on 20 test cases.
- Overall accuracy at 80%+.
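
The first criterion, a changes list derived from the actual diff, could be approached along these lines. This is a sketch under stated assumptions: `tokenize`, `diffWords`, and `describeChanges` are hypothetical helpers, tokens are split on whitespace only (so punctuation stays attached to words), and the diff uses a plain longest-common-subsequence table, which is quadratic in text length and may need replacing for long inputs.

```typescript
interface DiffOp {
  kind: "equal" | "insert" | "delete";
  words: string[];
}

// Whitespace tokenizer (assumption: punctuation stays attached to its word).
function tokenize(text: string): string[] {
  return text.trim().split(/\s+/).filter((t) => t.length > 0);
}

// Word-level diff built from a longest-common-subsequence table.
function diffWords(original: string, corrected: string): DiffOp[] {
  const a = tokenize(original);
  const b = tokenize(corrected);

  // lcs[i][j] = length of the LCS of a[i..] and b[j..].
  const lcs: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  const ops: DiffOp[] = [];
  // Append a word, merging it into the previous op when the kind matches.
  const push = (kind: DiffOp["kind"], word: string): void => {
    const last = ops[ops.length - 1];
    if (last && last.kind === kind) last.words.push(word);
    else ops.push({ kind, words: [word] });
  };

  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      push("equal", a[i]);
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      push("delete", a[i]);
      i++;
    } else {
      push("insert", b[j]);
      j++;
    }
  }
  while (i < a.length) push("delete", a[i++]);
  while (j < b.length) push("insert", b[j++]);
  return ops;
}

// Turn diff ops into human-readable entries; a delete followed
// directly by an insert is reported as one replacement.
function describeChanges(ops: DiffOp[]): string[] {
  const changes: string[] = [];
  for (let k = 0; k < ops.length; k++) {
    const op = ops[k];
    const next = ops[k + 1];
    if (op.kind === "delete" && next && next.kind === "insert") {
      changes.push(`Replaced "${op.words.join(" ")}" with "${next.words.join(" ")}"`);
      k++; // the insert was consumed as part of the replacement
    } else if (op.kind === "delete") {
      changes.push(`Removed "${op.words.join(" ")}"`);
    } else if (op.kind === "insert") {
      changes.push(`Added "${op.words.join(" ")}"`);
    }
  }
  return changes;
}
```

Because every entry comes from `diffWords(original, corrected)`, the list cannot mention a change that is absent from the corrected text. The LLM could still be asked to annotate each entry with a reason, as long as the entries themselves come from the diff.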
