Gosplan

A research record of a seven-month attempt to build a machine-learning system that picks which cryptocurrency markets to hold in order to capture upward price moves — fifteen architecture versions, ~170 individual experiments, November 2025 through May 2026.

License: MIT · Citation: CITATION.cff · Full research record: docs/RESEARCH_MANIFEST.md

What Gosplan was trying to do

The founding idea: treat the market as efficient — prices already reflect available information, so future direction cannot be predicted from price alone — and ask what is still possible under that assumption. The answer the project pursued: even if you cannot predict any single market, at every moment some market in a large basket is trending up; a model that reliably rotates into whichever market is currently trending could capture those moves.

Concretely, Gosplan is posed as a selection problem:

Take N cryptocurrency markets (N ranged from 5 to 40+ across the project).
Represent each as a window of recent price candles plus a few technical indicators, normalised so all markets are on the same scale.
Stack them into one tensor and ask a neural network: which single market should I hold right now? — or hold cash.
Step forward one hour at a time, re-deciding each step, and measure the resulting return against simply buying and holding the basket ("excess return").
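
The hourly loop above can be sketched as follows. This is a minimal illustration, not the repository's evaluation code: the function names, the `CASH` sentinel, the equal-weight unrebalanced basket benchmark, and the absence of fees and slippage are all assumptions made for the example.

```python
from typing import List, Optional, Sequence

CASH = None  # hypothetical sentinel: hold no market during this hour


def strategy_return(prices: Sequence[Sequence[float]],
                    choices: Sequence[Optional[int]]) -> float:
    """Compound the return of holding choices[t] from hour t to hour t + 1.

    prices[t][i] is the close of market i at hour t. choices[t] is a market
    index or CASH. Fees and slippage are ignored (an assumption).
    """
    if len(choices) != len(prices) - 1:
        raise ValueError("need exactly one choice per hourly step")
    equity = 1.0
    for t, choice in enumerate(choices):
        if choice is not CASH:
            equity *= prices[t + 1][choice] / prices[t][choice]
    return equity - 1.0


def basket_buy_and_hold_return(prices: Sequence[Sequence[float]]) -> float:
    """Return of buying every market in equal weight at hour 0 and holding."""
    n_markets = len(prices[0])
    total = sum(prices[-1][i] / prices[0][i] for i in range(n_markets))
    return total / n_markets - 1.0


def excess_return(prices: Sequence[Sequence[float]],
                  choices: Sequence[Optional[int]]) -> float:
    """Strategy return minus the buy-and-hold basket return."""
    return strategy_return(prices, choices) - basket_buy_and_hold_return(prices)


# Two markets, three hourly closes, so two decisions.
example_prices: List[List[float]] = [[100.0, 100.0], [110.0, 90.0], [99.0, 99.0]]
example_choices = [0, 1]  # hold market 0 for the first hour, then market 1
example_excess = excess_return(example_prices, example_choices)
```

In this toy example the choices are picked with hindsight, which is exactly what a real selector is not allowed to do; the sketch only shows how the metric is assembled.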

The recurring innovation that made this trainable was a synthetic data generator: because markets are assumed efficient and interchangeable, any random assembly of real market segments (random count, random symbols, random time windows) is a valid training example. This yields effectively unlimited training data (~10³⁹ unique samples when choosing 30–40 markets from a pool of ~150).
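
A rough sketch of such a generator is shown below. It is illustrative only: the name `sample_basket`, its parameters, and the choice to draw an independent start offset for each market are assumptions, not the project's actual implementation.

```python
import random
from typing import Dict, List, Sequence, Tuple

Segment = Tuple[str, int, List[float]]  # (symbol, start index, price window)


def sample_basket(series_by_symbol: Dict[str, Sequence[float]],
                  window: int,
                  rng: random.Random,
                  min_markets: int = 2,
                  max_markets: int = 5) -> List[Segment]:
    """Assemble one synthetic training example from real market segments.

    Randomises the three things named in the text: how many markets,
    which symbols, and which time window each segment comes from.
    """
    # Only symbols with enough history can supply a full window.
    eligible = sorted(s for s, v in series_by_symbol.items() if len(v) >= window)
    upper = min(max_markets, len(eligible))
    if upper < min_markets:
        raise ValueError("not enough markets with sufficient history")

    count = rng.randint(min_markets, upper)   # random count
    symbols = rng.sample(eligible, count)     # random symbols, no repeats

    basket: List[Segment] = []
    for symbol in symbols:
        series = series_by_symbol[symbol]
        start = rng.randint(0, len(series) - window)  # random time window
        basket.append((symbol, start, list(series[start:start + window])))
    return basket


# Toy data standing in for real close-price series.
toy_series = {
    "AAA": [float(i) for i in range(10)],
    "BBB": [float(2 * i) for i in range(10)],
    "CCC": [float(3 * i) for i in range(10)],
}
toy_basket = sample_basket(toy_series, window=4, rng=random.Random(0))
```

Passing an explicit `random.Random` instance keeps each draw reproducible from a seed, which matters when a specific sample needs to be regenerated for debugging.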

The hard part, revisited every single month of the project, was never the network. It was producing training labels — "this was the right market to hold here" — without secretly using information from the future.

The honest summary

The project's terminal finding, recorded in detail in docs/RESEARCH_MANIFEST.md, is that no architectural variant tested survived forward-time out-of-sample validation. This repository is published as a reproducibility study with negative results. The path, the dead-ends, and the quantified out-of-sample collapses are the contribution.

Throughout the record, the project produced striking in-sample numbers — backtest returns in the hundreds or thousands of percent. Almost none of them survived honest out-of-sample testing. That gap is itself the central finding of the program and is treated as a first-class result, not an embarrassment to be smoothed over. The single cross-cutting lesson — distilled into a five-point checklist in docs/RESEARCH_MANIFEST.md §7 — is that across every approach tried, the model and the representation were never the alpha; only label quality, pooling, low capacity, honest validation, and using ML for risk/timing/sizing ever moved out-of-sample performance.

If you are looking for a working alpha strategy, this is not it. If you are looking for a documented record of what was tried, what happened, and why each apparent breakthrough failed when honestly tested, read on.

Get the data

All experiments use Kraken's freely available historical OHLCVT (Open / High / Low / Close / Volume / Trades) data. Download the complete zip from Kraken's support article:

https://support.kraken.com/articles/360047124832-downloadable-historical-ohlcvt-open-high-low-close-volume-trades-data

Unzip it locally. Each experiment in experiments/ will tell you which CSV files to place where to reproduce its results. The repository itself does not ship any market data, model checkpoints, or trained weights — only code, configuration, results manifests, and the research write-up.
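
For orientation, here is a minimal standard-library sketch of reading one of those CSV files. It is not the loader in src/gosplan/. It assumes the files are headerless with columns in the order timestamp, open, high, low, close, volume, trades, and that the timestamp is a Unix time in seconds; check both against the files you download. The `Candle` type and `parse_ohlcvt` function are hypothetical names.

```python
import csv
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Candle:
    timestamp: int  # assumed: Unix time in seconds
    open: float
    high: float
    low: float
    close: float
    volume: float
    trades: int


def parse_ohlcvt(lines: Iterable[str]) -> List[Candle]:
    """Parse headerless OHLCVT rows and return candles in time order."""
    candles: List[Candle] = []
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        ts, o, h, l, c, v, n = row
        candles.append(
            Candle(int(ts), float(o), float(h), float(l), float(c), float(v), int(n))
        )
    # Sort defensively so later causal code can rely on time order.
    candles.sort(key=lambda candle: candle.timestamp)
    return candles


sample_rows = [
    "1609462800,29050.0,29200.0,29000.0,29150.0,8.0,210",
    "1609459200,29000.0,29100.0,28900.0,29050.0,12.5,340",
]
sample_candles = parse_ohlcvt(sample_rows)
```

To read a real file, pass an open file handle instead of a list, for example `with open(path, newline="") as fh: candles = parse_ohlcvt(fh)`, where `path` points at whichever CSV the experiment asks for.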

Repository layout

| Path | Contents |
| --- | --- |
| docs/RESEARCH_MANIFEST.md | The canonical research record. Twelve-stage chronology (Nov 2025 → May 2026), parallel research tracks, the OOS-survivability framework (§7), comparison with the quant-finance literature (§8), the recurring lessons, and the reading list. Start here. |
| docs/monthly-notes/ | The seven monthly R&D lab notebooks transcribed slide-by-slide: the primary-source detail behind the manifest. |
| docs/migration_record.md | Historical record of the May 2026 experiments-tree migration. |
| src/gosplan/ | Shared utility library: Kraken CSV loader and causal feature engineering primitives used across experiments. |
| experiments/ | One self-contained directory per experiment (33 total): code, results, notes, and a 2 000-word density-matched README per experiment. `experiments/README.md` is the index. |
| tests/ | Invariant tests for the shared library, most importantly the causal-normalisation no-leak test. |

Getting started

git clone https://github.com/rayarbro1/gosplan.git
cd gosplan
pip install -e ".[dev]"
make test

Then download the Kraken data (link above) and read docs/RESEARCH_MANIFEST.md for the overall arc, or jump into a specific experiment directory.

Causal-trading invariant

Every component in this repository that makes a trading decision must be strictly causal: it may use only past and present data, never future data. This invariant is enforced by the test in tests/unit/test_features_causality.py, which asserts that perturbing future values cannot change current normalised feature values. Most of the project's documented failures trace to violations of this invariant; please preserve it.
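
The idea behind that test can be sketched in a few lines. The code below is an illustration under assumptions, not the contents of tests/unit/test_features_causality.py: the trailing-window z-score, the `is_causal` helper, and the deliberately leaky counterexample are all hypothetical.

```python
import math
from typing import Callable, List, Sequence

Transform = Callable[[Sequence[float]], List[float]]


def causal_zscore(values: Sequence[float], window: int) -> List[float]:
    """Z-score each value against a trailing window ending at that value."""
    out: List[float] = []
    for t in range(len(values)):
        past = values[max(0, t - window + 1): t + 1]  # never looks past t
        mean = sum(past) / len(past)
        std = math.sqrt(sum((x - mean) ** 2 for x in past) / len(past))
        out.append((values[t] - mean) / std if std > 0 else 0.0)
    return out


def leaky_zscore(values: Sequence[float]) -> List[float]:
    """Counterexample: statistics over the whole series leak the future."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))
    return [(x - mean) / std if std > 0 else 0.0 for x in values]


def is_causal(transform: Transform, values: Sequence[float], t: int,
              bump: float = 1000.0) -> bool:
    """True if changing values after index t leaves outputs up to t unchanged."""
    perturbed = list(values[: t + 1]) + [v + bump for v in values[t + 1:]]
    return transform(values)[: t + 1] == transform(perturbed)[: t + 1]


series = [1.0, 2.0, 4.0, 3.0, 5.0, 7.0]
causal_ok = is_causal(lambda v: causal_zscore(v, window=3), series, t=2)
leaky_ok = is_causal(leaky_zscore, series, t=2)
```

Here `causal_ok` is `True` and `leaky_ok` is `False`: normalising with full-series statistics is the kind of quiet leak the invariant is meant to catch.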

Citation

See CITATION.cff. GitHub's "Cite this repository" button will render the metadata. If you cite the work, please also reference docs/RESEARCH_MANIFEST.md.

Contributing

The repository is primarily a frozen research record, but bug reports, reproduction confirmations, and pull requests that improve documentation or strengthen the test suite are welcome. See CONTRIBUTING.md and CODE_OF_CONDUCT.md. Please open an issue first for anything structural.

Acknowledgements

Thanks to Austin Macdonald for the conversations that shaped how this project was opened up — what to include, what to cut, and how to frame seven months of negative-results research for a public audience. The rebuild itself was orchestrated against the STAMPED principles for reproducible research objects (Self-contained · Tracked · Actionable · Modular · Portable · Ephemeral · Distributable), drawn from the stamped-principles/stamped-paper manuscript, with the structural migration executed using Claude Code.
