Regex Explainer

**Current file:** `app/src/lib/tools/regex-explainer.ts`  
**Current model:** `deepseek-v3.2`  
**Current approach:** Single prompt asking for explanation, component breakdown, examples, pitfalls, and optimization. No programmatic regex parsing, no actual match testing.

**Problems with current approach:**
- Example strings (matching and non-matching) are not verified, so the LLM may provide strings that do not actually behave as claimed.
- Component breakdown may have incorrect descriptions for complex patterns (lookaheads, backreferences).
- No support for different regex flavors (Python, JavaScript).
- Optimization suggestions are not validated.
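
The flavor problem can be made concrete with a small gate that checks whether a pattern even compiles under a given engine. The sketch below covers only Python's `re`; the helper name `compiles_in_python` is hypothetical and not part of the existing tool. It shows that a JavaScript-style named group is rejected by Python, which expects `(?P<name>...)`.

```python
import re


def compiles_in_python(pattern: str):
    """Return (ok, message) describing whether Python's `re` accepts `pattern`.

    Hypothetical helper: a first gate before any flavor-specific explanation.
    """
    try:
        re.compile(pattern)
    except re.error as exc:
        # The error text is useful context for the explanation step.
        return False, str(exc)
    return True, ""


# Python-style named group: accepted.
print(compiles_in_python(r"(?P<year>\d{4})"))
# JavaScript-style named group: rejected by Python's `re`.
print(compiles_in_python(r"(?<year>\d{4})"))
```

A pattern that fails this gate is not necessarily invalid; it may simply target another flavor, so the tool would need the user's intended flavor as an input.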

**Upgrade plan:**

| Step | Agent | Action |
|------|-------|--------|
| 1 | Regex Parser | Programmatic: Parse the regex using the Python `re` module. Extract named groups, quantifiers, anchors, lookaheads, and character classes. Generate a structured AST-like representation. |
| 2 | Match Tester | Programmatic: Generate candidate test strings and run them against the regex. Verify which strings match and which do not. This guarantees example accuracy. |
| 3 | Explanation Agent | Receive the parsed structure and verified test results. Generate a plain-English explanation, component breakdown table, pitfall analysis, and optimization suggestions. |
| 4 | Optimization Validator | Programmatic: If the LLM suggests an optimized regex, test it against the same corpus to verify functional equivalence. |

- The plan above is a reference layout, not a fixed design. Feel free to extend the agent stack if needed.
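
A minimal sketch of Step 1 follows, with one important assumption stated up front: Python's `re` module has no public parser API, so the sketch relies on the interpreter's internal parser (`re._parser` on Python 3.11+, `sre_parse` on older versions). That module is undocumented and may change between releases. The function names `to_tree`, `collect_ops`, and `parse_regex` are hypothetical.

```python
import re

try:  # Python 3.11+ keeps the parser in a private submodule (internal API)
    from re import _parser as sre_parser
except ImportError:  # older interpreters expose it as sre_parse
    import sre_parse as sre_parser


def to_tree(node):
    """Recursively convert parser output into plain lists, strings, and ints."""
    if isinstance(node, sre_parser.SubPattern):
        return [to_tree(item) for item in node.data]
    if isinstance(node, (tuple, list)):
        return [to_tree(item) for item in node]
    if isinstance(node, int) and hasattr(node, "name"):
        return node.name  # opcode constants such as LITERAL or MAX_REPEAT
    return node


def collect_ops(tree, found=None):
    """Gather every opcode/category name that appears anywhere in the tree."""
    found = set() if found is None else found
    if isinstance(tree, list):
        for item in tree:
            collect_ops(item, found)
    elif isinstance(tree, str):
        found.add(tree)
    return found


def parse_regex(pattern: str) -> dict:
    """Return a structured, JSON-friendly description of `pattern`."""
    compiled = re.compile(pattern)  # public API; raises re.error if invalid
    tree = to_tree(sre_parser.parse(pattern))
    return {
        "named_groups": dict(compiled.groupindex),
        "group_count": compiled.groups,
        "constructs": sorted(collect_ops(tree)),
        "tree": tree,
    }


info = parse_regex(r"^(?P<year>\d{4})-(?P<month>\d{2})(?=Z)")
print(info["named_groups"])
print(info["constructs"])
```

The `constructs` list (for example `AT`, `MAX_REPEAT`, `ASSERT`) gives the Explanation Agent a checklist of segments it must describe, while `tree` preserves nesting for the component breakdown.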

**Model suggestions to start with:**
- Step 3: Try `deepseek-v3.2` (current model, already decent at regex). Also try `deepseek-r1-0528` for complex patterns with lookaheads/backreferences.
- This tool benefits more from the programmatic steps than from model upgrades. Focus engineering effort on Steps 1, 2, and 4.
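
As an illustration of how small the programmatic steps can be, here is a sketch of the Step 2 match tester. It assumes candidates are supplied by some other component (candidate generation is not shown) and that "match" means the whole string matches, via `fullmatch`; the tool may instead want `search` semantics. The function name `verify_examples` is hypothetical.

```python
import re


def verify_examples(pattern: str, candidates, flags: int = 0) -> dict:
    """Split `candidates` into verified matches and verified non-matches.

    Assumption: a string counts as a match only if the entire string
    matches (re.fullmatch). Swap in `search` for substring semantics.
    """
    compiled = re.compile(pattern, flags)
    matches, non_matches = [], []
    for text in candidates:
        if compiled.fullmatch(text):
            matches.append(text)
        else:
            non_matches.append(text)
    return {"matches": matches, "non_matches": non_matches}


result = verify_examples(r"\d{3}-\d{4}", ["555-1234", "5551234", "555-12345"])
print(result)
```

Only strings from `result["matches"]` and `result["non_matches"]` would be passed to the Explanation Agent, so every example shown to the user has been executed against the regex.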

**Model Selection Guidance**
- **You are free to pick any model from the Oxlo catalog** based on your own testing and evaluation.
- The model suggestions above are starting points, not mandates. Try them first, and if they do not meet the accuracy target, experiment with alternatives.

**Compare against:** GPT 5.3 Thinking & Claude Sonnet 4.6.

**Acceptance criteria:**
- All "match" and "no match" examples must be verified programmatically (zero false examples).
- Component breakdown must correctly describe every segment of the regex.
- Optimized regex (if suggested) must be functionally equivalent (verified by testing against 30+ strings).
- Overall accuracy of at least 80%.
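
The equivalence criterion could be checked along the following lines. This is a sketch under stated assumptions: two patterns are treated as equivalent on a string when `re.search` finds the same first-match span (or no match) for both, captured groups are not compared, and the 30-string minimum is enforced as a parameter. The names `first_span` and `find_disagreements` are hypothetical.

```python
import re


def first_span(compiled, text):
    """Span of the first match in `text`, or None when nothing matches."""
    match = compiled.search(text)
    return None if match is None else match.span()


def find_disagreements(original: str, optimized: str, corpus, min_size: int = 30):
    """Return the corpus strings on which the two patterns behave differently."""
    if len(corpus) < min_size:
        raise ValueError(f"corpus has {len(corpus)} strings, need {min_size}+")
    a, b = re.compile(original), re.compile(optimized)
    return [t for t in corpus if first_span(a, t) != first_span(b, t)]


# 35 ASCII strings: "id90" .. "id124"
corpus = [f"id{i}" for i in range(90, 125)]
print(find_disagreements(r"[0-9][0-9][0-9]", r"\d{3}", corpus))

# Adding one string of non-ASCII digits exposes a difference:
# in Python str patterns, \d also matches Unicode decimal digits.
print(find_disagreements(r"[0-9][0-9][0-9]", r"\d{3}", corpus + ["\u0661\u0662\u0663"]))
```

Agreement on a corpus is evidence, not proof, of equivalence. As the second call shows, the verdict depends on what the corpus contains, so it should include edge cases such as empty strings, non-ASCII input, and near-misses.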

Regex Explainer #22
