@patrickkidd (Owner)

This framework enables automated testing of chatbot conversations using
LLM-generated synthetic users with specific behavioral traits.

Key components:
- personas.py: 5 user personas (evasive, oversharer, date_confused,
  emotionally_flooded, matter_of_fact) with detailed system prompts
- simulator.py: Conversation loop that alternates between real chatbot
  and synthetic user responses
- evaluators.py: Quality checks for robotic patterns (banned phrases,
  echoing, repetition, question balance, variety)
- conftest.py: pytest fixtures and markers (--synthetic, --synthetic-llm)
- test_quality_evaluators.py: 19 unit tests for evaluator logic
- test_synthetic_conversations.py: Integration tests with mocked and
  real LLM modes
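
The simulator and evaluator roles listed above can be illustrated with a minimal sketch. All names here (`Persona`, `simulate`, `find_banned_phrases`, the mock callables, and the banned phrases) are hypothetical and are not taken from the PR's code; the sketch only shows the general shape of an alternating conversation loop followed by a banned-phrase check.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker, text)


@dataclass
class Persona:
    """Hypothetical persona record: a name plus the prompt that drives it."""
    name: str
    system_prompt: str


def simulate(
    chatbot: Callable[[List[Turn]], str],
    synthetic_user: Callable[[Persona, List[Turn]], str],
    persona: Persona,
    max_turns: int = 3,
) -> List[Turn]:
    """Alternate synthetic-user and chatbot replies for max_turns rounds."""
    history: List[Turn] = []
    for _ in range(max_turns):
        history.append(("user", synthetic_user(persona, history)))
        history.append(("assistant", chatbot(history)))
    return history


def find_banned_phrases(history: List[Turn], banned: List[str]) -> List[str]:
    """Return each banned phrase found in at least one assistant turn."""
    assistant_turns = [text.lower() for speaker, text in history if speaker == "assistant"]
    return [p for p in banned if any(p.lower() in t for t in assistant_turns)]


# Mocked stand-ins so the loop runs without any LLM (assumed behavior).
def mock_user(persona: Persona, history: List[Turn]) -> str:
    return f"[{persona.name}] reply {len(history) // 2 + 1}"


def mock_chatbot(history: List[Turn]) -> str:
    return "It sounds like that was hard. Can you tell me more?"


evasive = Persona("evasive", "You avoid giving direct answers.")
transcript = simulate(mock_chatbot, mock_user, evasive, max_turns=2)
flagged = find_banned_phrases(transcript, ["it sounds like", "i understand"])
```

Passing the chatbot and the synthetic user as callables is one way to let the same loop run in both mocked and real-LLM modes; whether the PR does it this way is not stated.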

Usage:
- Unit tests (no LLM): pytest btcopilot/tests/synthetic/test_quality_evaluators.py
- Mocked tests: pytest --synthetic btcopilot/tests/synthetic/
- Full LLM tests: pytest --synthetic --synthetic-llm btcopilot/tests/synthetic/
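
The three invocations above imply a gating rule for the two flags. The following is a sketch of that rule as a pure function that a `pytest_collection_modifyitems` hook could call. The marker names `synthetic` and `synthetic_llm`, the function name `skip_reason`, and the exact rule are assumptions inferred from the usage lines, not the PR's actual conftest.py.

```python
from typing import Iterable, Optional


def skip_reason(
    markers: Iterable[str], synthetic: bool, synthetic_llm: bool
) -> Optional[str]:
    """Return why a test should be skipped, or None if it should run.

    Assumed rule: tests marked `synthetic` need --synthetic; tests marked
    `synthetic_llm` need both --synthetic and --synthetic-llm.
    """
    names = set(markers)
    if "synthetic_llm" in names and not (synthetic and synthetic_llm):
        return "needs --synthetic --synthetic-llm"
    if "synthetic" in names and not synthetic:
        return "needs --synthetic"
    return None


# Unmarked unit tests always run; marked tests depend on the flags.
plain_run = skip_reason([], synthetic=False, synthetic_llm=False)
mocked_skipped = skip_reason(["synthetic"], synthetic=False, synthetic_llm=False)
llm_skipped = skip_reason(["synthetic", "synthetic_llm"], synthetic=True, synthetic_llm=False)
```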

Adds a framework to evaluate whether conversations collect enough family
data as defined in the system prompts.

New files:
- btcopilot/personal/data_requirements.py: Single source of truth for data
  collection requirements (presenting problem, family of origin, extended
  family, own family, nodal events). Includes MINIMUM_COMPLETE_REQUIREMENTS
  defining when data collection is "done".
- btcopilot/tests/synthetic/data_completeness.py: Evaluator that checks
  conversations against requirements using heuristics or LLM analysis.
- btcopilot/tests/synthetic/test_data_completeness.py: 20 tests for
  requirements validation and completeness evaluation.
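
As an illustration of the heuristic mode described above, here is a minimal keyword-based sketch. The category names and the `MINIMUM_COMPLETE_REQUIREMENTS` identifier come from this description; the keyword lists, the contents of the minimum set, and the function names are invented for the example and may not match the PR's code.

```python
import re
from typing import Dict, List, Set

# Hypothetical keyword lists, one per requirement category.
KEYWORDS: Dict[str, List[str]] = {
    "presenting_problem": ["worried", "problem", "struggling"],
    "family_of_origin": ["mother", "father", "brother", "sister"],
    "extended_family": ["grandmother", "grandfather", "aunt", "uncle"],
    "own_family": ["wife", "husband", "son", "daughter"],
    "nodal_events": ["died", "divorced", "married", "born", "moved"],
}

# Assumed contents: the categories that must be covered to count as "done".
MINIMUM_COMPLETE_REQUIREMENTS: Set[str] = {
    "presenting_problem",
    "family_of_origin",
    "nodal_events",
}


def covered_categories(user_turns: List[str]) -> Set[str]:
    """Return categories with at least one whole-word keyword match."""
    words = set(re.findall(r"[a-z']+", " ".join(user_turns).lower()))
    return {cat for cat, kws in KEYWORDS.items() if words & set(kws)}


def is_complete(user_turns: List[str]) -> bool:
    """True when every minimum-required category is covered."""
    return MINIMUM_COMPLETE_REQUIREMENTS <= covered_categories(user_turns)


turns = ["I'm worried about my son.", "My mother died last year."]
covered = covered_categories(turns)
done = is_complete(turns)
```

Keyword matching like this is cheap and deterministic, which suits unit tests, but it misses paraphrases; that gap is presumably what the LLM analysis mode is for.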

The data_requirements module can generate checklist markdown matching
the prompts, ensuring the prompt checklist and test criteria stay in sync.
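
One way to keep a prompt checklist and test criteria in sync is to render both from the same structure. The sketch below assumes a plain dict of categories to items and a hypothetical `checklist_markdown` function; the real module's data model and output format are not shown in this description.

```python
from typing import Dict, List

# Hypothetical requirements structure; items are invented for illustration.
REQUIREMENTS: Dict[str, List[str]] = {
    "Presenting problem": ["What brought the user in"],
    "Family of origin": ["Parents", "Siblings"],
}


def checklist_markdown(requirements: Dict[str, List[str]]) -> str:
    """Render categories and their items as a nested markdown checklist."""
    lines: List[str] = []
    for category, items in requirements.items():
        lines.append(f"- [ ] {category}")
        lines.extend(f"  - [ ] {item}" for item in items)
    return "\n".join(lines)


checklist = checklist_markdown(REQUIREMENTS)
```

A test can then assert that every category in the requirements appears in the generated checklist, so the two cannot drift apart silently.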

Usage:
- Unit tests (heuristics): pytest btcopilot/tests/synthetic/test_data_completeness.py
- LLM tests: pytest --synthetic --synthetic-llm btcopilot/tests/synthetic/

@patrickkidd changed the title from "FD-300: Build synthetic user generator for testing" to "FD-300-proto: Build synthetic user generator for testing" on Dec 9, 2025
