Test an idea before you trust it.
Test Drive is an agent skill for Claude and Codex that helps people test an idea, claim, or decision before trusting it. It turns plausible beliefs, strategies, prompts, skills, messages, artifacts, and AI outputs into low-risk trials that can create useful evidence.
It helps people move from judgment to evidence-seeking action without overcommitting too early.
AI can make almost any idea sound polished. That is useful, but dangerous when polish gets confused with proof.
Test Drive helps a human ask the next better question: what kind of evidence would actually make this more or less trustworthy? Sometimes the answer is a customer interview. Sometimes it is a dry run, a prompt eval, a small post, a cohort analysis, a regression, a connector, or another skill.
The point is not to prove the idea right. The point is to learn before committing.
Test Drive helps identify:
- the idea, claim, decision, belief, or artifact being tested
- the evidence type that matters most
- the smallest credible test
- the artifact needed to run the test
- connectors, tools, data, or permissions required
- support and weaken signals
- what would change your mind
- the learning loop after the test
Test Drive routes by evidence type instead of domain:
- Human reaction: interviews, surveys, messages, posts, prototype feedback
- Behavioral / data: cohort analysis, funnel analysis, segmentation, regression
- Reasoning: assumption audit, adversarial review, counterexample search
- Expert judgment: multi-lens review, scenario analysis, pre-mortem
- Artifact performance: dry runs, eval cases, rubric scoring, edge cases
- Operational feasibility: connector checks, permission scope, pilot workflows
Download test-drive.skill from the latest release, then add it through your Skills settings.
Ask Codex:
Install my Test Drive skill from https://github.com/glichtenthal/test-drive
Or install manually:
python3 ~/.codex/skills/.system/skill-installer/scripts/install-skill-from-github.py \
--repo glichtenthal/test-drive \
--path . \
--name test-driveRestart Codex after installation.
The repository is the skill. Copy the complete folder into your agent's skills directory. For example, Codex discovers:
~/.codex/skills/test-drive/SKILL.md
Use Test Drive on this product idea before I build anything.
How could I test whether this positioning actually lands?
Test drive this prompt before I reuse it.
What data or connector would we need to validate this claim?
Create the smallest useful test for this strategy.
- The Briefing Room organizes messy context.
- Ground Truth pressure-tests the thinking.
- The Quorum deliberates consequential decisions.
- Test Drive tests an idea, claim, or decision before you trust it.
Together they support a judgment workflow: make the mess legible, challenge the reasoning, deliberate the stakes, then create evidence.
- The Briefing Room - turn messy context into a brief you can think with.
- Ground Truth - calibrated honesty and anti-sycophancy for plans, decisions, reviews, and ideas.
- The Quorum - a five-member expert council that pressure-tests consequential decisions from multiple angles.
The repo is the skill. To create a local installable .skill file, zip the folder from its parent directory while excluding evals and local machine files:
cd path/to/parent
zip -r test-drive.skill test-drive \
-x 'test-drive/evals/*' \
-x '*/.DS_Store' \
-x '*/__pycache__/*'The official release asset is already packaged for install: test-drive.skill.
test-drive/
├── SKILL.md
├── agents/
│ └── openai.yaml
├── assets/
│ └── social-preview.svg
├── evals/
│ └── evals.json
├── examples/
│ ├── README.md
│ ├── positioning-test.md
│ ├── data-analysis-test.md
│ └── skill-eval-test.md
├── README.md
├── CHANGELOG.md
└── LICENSE
See CHANGELOG.md.
MIT - see LICENSE. Use it, fork it, tune it for your own judgment workflows.