Reproductions wanted — start here

This repository is a **standing instrument**, not a one-shot publication. The whole design assumes external scrutiny lands here as tracked issues. If you have engaged with the study at any depth, this issue is the right place to land the result.

## What kinds of contributions we're hoping to see

**1. Reproduction reports.** Re-run the prompt rung (`scripts/run_study.py` + `scripts/score.py`) and report numbers. The protocol is one OpenRouter key + Python 3.11+ + a few hours of API time. If you reproduce, please file a [reproduction-report](https://github.com/gorrie/bias-study/issues/new?template=reproduction-report.md) issue with your per-model deltas — agreement is interesting, divergence is *more* interesting. The committed cross-method agreement matrix is the structural reproducibility check.

**2. Model requests.** New frontier model dropped, or a model that wasn't in the v2 cross-section? File a [new-model-request](https://github.com/gorrie/bias-study/issues/new?template=new-model-request.md). Requested models get added to the matrix on the next quarterly cadence run.

**3. Findings.** Observed something in deployment that aligns with or contradicts this study? File a [finding-submission](https://github.com/gorrie/bias-study/issues/new?template=finding-submission.md). Anecdotes welcome; we triage them against the regression criteria.

**4. Methodology objections.** [`ADVERSARIAL-REVIEW.md`](https://github.com/gorrie/bias-study/blob/main/ADVERSARIAL-REVIEW.md) tracks every strong objection raised against the protocol so far, each marked FIXED / ANSWERED / TESTED / OPEN. If yours isn't in there, file a [bug-report](https://github.com/gorrie/bias-study/issues/new?template=bug-report.md) (we use that template for methodology objections too). Strong objections land as tracked items and either get FIXED with a re-run or get rebutted in writing. None of them get ignored.

## What we're explicitly *not* asking for

- Drive-by accusations of partisan motive. The voice of the writeup is the voice of the data; if the data is wrong, make that argument from the data.
- Demands for paid expert review. LLM-driven methodology is the constraint the study works within, not one it bypasses (see §5.8 of the writeup and Adversarial E6).
- Requests to suppress findings. Methodology integrity > headline-result preservation; if your reproduction contradicts ours, that result publishes.

## How to start

The cleanest one-pass reproduction is the prompt rung against the main study (`data/2026-05-25-full`):

```bash
git clone https://github.com/gorrie/bias-study
cd bias-study
python -m pip install -r requirements.txt
cp .env.example .env  # fill in OPENROUTER_API_KEY
python scripts/sweep_status.py                 # ground-truth state check
python scripts/cross_method_report.py --all-runs
```

Estimated cost: four to ten dollars in OpenRouter calls. Output: per-model deltas you can diff directly against the committed JSON in `data/2026-05-25-full/cross-method/`.
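
If you want a quick way to turn that diff into something you can paste into a reproduction report, here is a minimal sketch. It assumes each JSON file is a flat mapping of model name to a numeric delta; the real schema under `cross-method/` may differ, so adjust `load_deltas` to match. The function names and the `0.05` tolerance are hypothetical, not part of the repository.

```python
import json
import math
from pathlib import Path


def load_deltas(path):
    """Load a JSON file ASSUMED to map model name -> numeric delta."""
    with Path(path).open(encoding="utf-8") as fh:
        data = json.load(fh)
    return {str(model): float(value) for model, value in data.items()}


def diff_deltas(committed, reproduced, tolerance=0.05):
    """Return one row per model: (model, committed, reproduced, status).

    The tolerance is an illustrative absolute threshold, not a study criterion.
    """
    rows = []
    for model in sorted(set(committed) | set(reproduced)):
        ours = committed.get(model)
        yours = reproduced.get(model)
        if ours is None or yours is None:
            status = "missing"  # model present on one side only
        elif math.isclose(ours, yours, abs_tol=tolerance):
            status = "agree"
        else:
            status = "diverge"
        rows.append((model, ours, yours, status))
    return rows


if __name__ == "__main__":
    # Inline stand-in values so the sketch runs without the repository data.
    committed = {"model-a": 0.12, "model-b": -0.30}
    reproduced = {"model-a": 0.14, "model-b": 0.05, "model-c": 0.01}
    for row in diff_deltas(committed, reproduced):
        print(row)
```

To run it against real output, replace the inline dictionaries with two `load_deltas(...)` calls pointing at the committed file and your own. Rows marked `diverge` or `missing` are the ones worth calling out in the report.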

The weight rung needs a 24 GB CUDA GPU or an Apple-Silicon 32 GB+ box. The pipeline rung needs a local G0DM0D3 server. Neither is required for a basic reproduction — the prompt rung is the load-bearing surface.
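
Because the prompt rung only needs Python 3.11+ and an OpenRouter key, a small preflight check can catch a missing prerequisite before any API time is spent. The sketch below is not a script from the repository; `preflight` is a hypothetical helper, and it reads the process environment only, so it assumes you have exported `OPENROUTER_API_KEY` or loaded `.env` yourself.

```python
import os
import sys


def preflight(env=None, version=None):
    """Return a list of problems; an empty list means the basics are in place."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)[:2]
    problems = []
    if version < (3, 11):
        problems.append(f"Python 3.11+ required, found {version[0]}.{version[1]}")
    # A blank or whitespace-only key counts as unset.
    if not env.get("OPENROUTER_API_KEY", "").strip():
        problems.append("OPENROUTER_API_KEY is not set")
    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print("preflight:", issue)
    if not issues:
        print("preflight: ok")
```

It deliberately does not validate the key against OpenRouter; it only confirms that the two stated prerequisites are present locally.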

---

This issue stays pinned so the contribution path is visible from the issues tab.
