Test Drive

Test an idea before you trust it.

Test Drive is an agent skill for Claude and Codex that helps people test an idea, claim, or decision before trusting it. It turns plausible beliefs, strategies, prompts, skills, messages, artifacts, and AI outputs into low-risk trials that can create useful evidence.

It helps people move from judgment to evidence-seeking action without overcommitting too early.

Why this exists

AI can make almost any idea sound polished. That is useful, but dangerous when polish gets confused with proof.

Test Drive helps a human ask the next better question: what kind of evidence would actually make this more or less trustworthy? Sometimes the answer is a customer interview. Sometimes it is a dry run, a prompt eval, a small post, a cohort analysis, a regression, a connector, or another skill.

The point is not to prove the idea right. The point is to learn before committing.

What it does

Test Drive helps identify:

the idea, claim, decision, belief, or artifact being tested
the evidence type that matters most
the smallest credible test
the artifact needed to run the test
connectors, tools, data, or permissions required
support and weaken signals
what would change your mind
the learning loop after the test

Evidence types

Test Drive routes by evidence type instead of domain:

Human reaction: interviews, surveys, messages, posts, prototype feedback
Behavioral / data: cohort analysis, funnel analysis, segmentation, regression
Reasoning: assumption audit, adversarial review, counterexample search
Expert judgment: multi-lens review, scenario analysis, pre-mortem
Artifact performance: dry runs, eval cases, rubric scoring, edge cases
Operational feasibility: connector checks, permission scope, pilot workflows

Install for Claude

Download test-drive.skill from the latest release, then add it through your Skills settings.

Install for Codex

Ask Codex:

Install my Test Drive skill from https://github.com/glichtenthal/test-drive

Or install manually:

python3 ~/.codex/skills/.system/skill-installer/scripts/install-skill-from-github.py \
  --repo glichtenthal/test-drive \
  --path . \
  --name test-drive

Restart Codex after installation.

Manual install

The repository is the skill. Copy the complete folder into your agent's skills directory. For example, Codex discovers:

~/.codex/skills/test-drive/SKILL.md

Good first uses

Use Test Drive on this product idea before I build anything.

How could I test whether this positioning actually lands?

Test drive this prompt before I reuse it.

What data or connector would we need to validate this claim?

Create the smallest useful test for this strategy.

How it fits with other skills

The Briefing Room organizes messy context.
Ground Truth pressure-tests the thinking.
The Quorum deliberates consequential decisions.
Test Drive tests an idea, claim, or decision before you trust it.

Together they support a judgment workflow: make the mess legible, challenge the reasoning, deliberate the stakes, then create evidence.

More agent skills

The Briefing Room - turn messy context into a brief you can think with.
Ground Truth - calibrated honesty and anti-sycophancy for plans, decisions, reviews, and ideas.
The Quorum - a five-member expert council that pressure-tests consequential decisions from multiple angles.

Build the `.skill` yourself

The repo is the skill. To create a local installable .skill file, zip the folder from its parent directory while excluding evals and local machine files:

cd path/to/parent
zip -r test-drive.skill test-drive \
  -x 'test-drive/evals/*' \
  -x '*/.DS_Store' \
  -x '*/__pycache__/*'

The official release asset is already packaged for install: test-drive.skill.

Repo layout

test-drive/
├── SKILL.md
├── agents/
│   └── openai.yaml
├── assets/
│   └── social-preview.svg
├── evals/
│   └── evals.json
├── examples/
│   ├── README.md
│   ├── positioning-test.md
│   ├── data-analysis-test.md
│   └── skill-eval-test.md
├── README.md
├── CHANGELOG.md
└── LICENSE

Version history

See CHANGELOG.md.

License

MIT - see LICENSE. Use it, fork it, tune it for your own judgment workflows.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Test Drive

Why this exists

What it does

Evidence types

Install for Claude

Install for Codex

Manual install

Good first uses

How it fits with other skills

More agent skills

Build the `.skill` yourself

Repo layout

Version history

License

About

Uh oh!

Releases 4

Packages

Uh oh!

Contributors

Uh oh!

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
agents		agents
assets		assets
evals		evals
examples		examples
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
SKILL.md		SKILL.md

Folders and files

Latest commit

History

Repository files navigation

Test Drive

Why this exists

What it does

Evidence types

Install for Claude

Install for Codex

Manual install

Good first uses

How it fits with other skills

More agent skills

Build the .skill yourself

Repo layout

Version history

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 4

Packages 0

Uh oh!

Contributors

Uh oh!

Build the `.skill` yourself

Packages