Gosplan

A research record of a seven-month attempt to build a machine-learning system that picks which cryptocurrency markets to hold in order to capture upward price moves — fifteen architecture versions, ~170 individual experiments, November 2025 through May 2026.

License: MIT · Citation: CITATION.cff · Full research record: docs/RESEARCH_MANIFEST.md


What Gosplan was trying to do

The founding idea: treat the market as efficient — prices already reflect available information, so future direction cannot be predicted from price alone — and ask what is still possible under that assumption. The answer the project pursued: even if you cannot predict any single market, at every moment some market in a large basket is trending up; a model that reliably rotates into whichever market is currently trending could capture those moves.

Concretely, Gosplan is posed as a selection problem:

  • Take N cryptocurrency markets (N ranged from 5 to 40+ across the project).
  • Represent each as a window of recent price candles plus a few technical indicators, normalised so all markets are on the same scale.
  • Stack them into one tensor and ask a neural network which single market to hold right now, with holding cash as one more option.
  • Step forward one hour at a time, re-deciding each step, and measure the resulting return against simply buying and holding the basket ("excess return").
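
The stepping and scoring described in the list above can be sketched in a few lines. This is an illustration only, not the repository's code: `momentum_policy` is a hypothetical stand-in for the neural network, the function names are invented for this example, and fees and slippage are ignored.

```python
from typing import Callable, Optional, Sequence

Prices = Sequence[Sequence[float]]  # prices[m][t]: close of market m at hour t


def momentum_policy(prices: Prices, t: int, window: int = 3) -> Optional[int]:
    """Hypothetical placeholder for the network.

    Picks the market with the best trailing return over `window` hours,
    or None (cash) if no trailing return is positive. Reads only
    prices[m][t - window] and prices[m][t], so it needs t >= window.
    """
    best, best_ret = None, 0.0
    for m, series in enumerate(prices):
        ret = series[t] / series[t - window] - 1.0
        if ret > best_ret:
            best, best_ret = m, ret
    return best


def excess_return(prices: Prices,
                  policy: Callable[[Prices, int], Optional[int]],
                  start: int) -> float:
    """Hourly rotation return minus equal-weight buy-and-hold of the basket."""
    n_hours = len(prices[0])
    equity = 1.0
    for t in range(start, n_hours - 1):
        choice = policy(prices, t)      # decide using data up to hour t
        if choice is not None:          # earn that market's move from t to t+1
            equity *= prices[choice][t + 1] / prices[choice][t]
    basket = sum(s[-1] / s[start] for s in prices) / len(prices)
    return (equity - 1.0) - (basket - 1.0)


toy_prices = [
    [100, 101, 102, 103, 104, 105],  # rises every hour
    [100, 99, 98, 97, 96, 95],       # falls every hour
]
print(excess_return(toy_prices, momentum_policy, start=3))
```

On this toy input the rotation beats the basket, because one market rises every hour while the other falls. Real data is not this forgiving, which is the point of the rest of this record.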

The recurring innovation that made this trainable was a synthetic data generator: because markets are assumed efficient and interchangeable, any random assembly of real market segments — random count, random symbols, random time windows — is a valid training example. This yields effectively unlimited training data (~10³⁹ unique samples from ~150 markets choosing 30–40).
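
A minimal sketch of such a generator follows, assuming each chosen market gets its own independently drawn time window (the text above does not say whether windows are shared). The names `sample_basket` and `subset_count` are hypothetical and are not taken from the repository.

```python
import math
import random
from typing import Dict, List, Sequence


def sample_basket(markets: Dict[str, Sequence[float]], window: int,
                  min_n: int, max_n: int,
                  rng: random.Random) -> Dict[str, List[float]]:
    """Draw a random count of random symbols, each with a random window."""
    n = rng.randint(min_n, max_n)                 # random count
    symbols = rng.sample(sorted(markets), n)      # random symbols, no repeats
    basket = {}
    for sym in symbols:
        series = markets[sym]
        # Random start such that the window fits inside the series.
        start = rng.randrange(len(series) - window + 1)
        basket[sym] = list(series[start:start + window])
    return basket


def subset_count(n_markets: int, min_n: int, max_n: int) -> int:
    """Number of distinct symbol subsets, ignoring the choice of windows."""
    return sum(math.comb(n_markets, k) for k in range(min_n, max_n + 1))


rng = random.Random(42)
demo = {f"M{i}": [float(i * 100 + t) for t in range(50)] for i in range(8)}
basket = sample_basket(demo, window=24, min_n=3, max_n=5, rng=rng)
print(sorted(basket), subset_count(150, 30, 40))
```

Counting symbol subsets alone, 150 markets choosing 30 to 40 gives between 10³⁶ and 10³⁷ combinations. The choice of time windows multiplies that further.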

The hard part, revisited every single month of the project, was never the network. It was producing training labels — "this was the right market to hold here" — without secretly using information from the future.

The honest summary

The project's terminal finding, recorded in detail in docs/RESEARCH_MANIFEST.md, is that no architectural variant tested survived forward-time out-of-sample validation. This repository is published as a reproducibility study with negative results. The path, the dead-ends, and the quantified out-of-sample collapses are the contribution.

Throughout the record, the project produced striking in-sample numbers — backtest returns in the hundreds or thousands of percent. Almost none of them survived honest out-of-sample testing. That gap is itself the central finding of the program and is treated as a first-class result, not an embarrassment to be smoothed over. The single cross-cutting lesson — distilled into a five-point checklist in docs/RESEARCH_MANIFEST.md §7 — is that across every approach tried, the model and the representation were never the alpha; only label quality, pooling, low capacity, honest validation, and using ML for risk/timing/sizing ever moved out-of-sample performance.

If you are looking for a working alpha strategy, this is not it. If you are looking for a documented record of what was tried, what happened, and why each apparent breakthrough failed when honestly tested, read on.

Get the data

All experiments use Kraken's freely available historical OHLCVT (Open / High / Low / Close / Volume / Trades) data. Download the complete zip from Kraken's support article:

https://support.kraken.com/articles/360047124832-downloadable-historical-ohlcvt-open-high-low-close-volume-trades-data

Unzip it locally. Each experiment in experiments/ will tell you which CSV files to place where to reproduce its results. The repository itself does not ship any market data, model checkpoints, or trained weights — only code, configuration, results manifests, and the research write-up.
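
For orientation, here is a minimal parsing sketch. It assumes the CSVs are headerless with seven columns in the order Unix timestamp in seconds, open, high, low, close, volume, trades; check this against the files you download. It is not the loader in src/gosplan/, whose interface is not described here, and `Candle` and `parse_ohlcvt` are hypothetical names.

```python
import csv
from datetime import datetime, timezone
from typing import Iterable, List, NamedTuple


class Candle(NamedTuple):
    time: datetime
    open: float
    high: float
    low: float
    close: float
    volume: float
    trades: int


def parse_ohlcvt(lines: Iterable[str]) -> List[Candle]:
    """Parse rows of the assumed form ts,open,high,low,close,volume,trades."""
    candles = []
    for row in csv.reader(lines):
        if not row:                      # skip blank lines
            continue
        ts, o, h, l, c, v, n = row       # raises if the column count differs
        candles.append(Candle(
            datetime.fromtimestamp(int(ts), tz=timezone.utc),
            float(o), float(h), float(l), float(c), float(v), int(n),
        ))
    return candles


sample = ["1700000000,35000.1,35100.0,34950.5,35050.2,12.5,340"]
print(parse_ohlcvt(sample)[0])
```

To read a real file, pass an open file handle instead of the list, for example `open(path, newline="")`, where `path` points at one of the unzipped CSVs.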

Repository layout

| Path | Contents |
| --- | --- |
| docs/RESEARCH_MANIFEST.md | The canonical research record. Twelve-stage chronology (Nov 2025 → May 2026), parallel research tracks, the OOS-survivability framework (§7), comparison with the quant-finance literature (§8), the recurring lessons, and the reading list. Start here. |
| docs/monthly-notes/ | The seven monthly R&D lab notebooks transcribed slide-by-slide: the primary-source detail behind the manifest. |
| docs/migration_record.md | Historical record of the May 2026 experiments-tree migration. |
| src/gosplan/ | Shared utility library: Kraken CSV loader and causal feature engineering primitives used across experiments. |
| experiments/ | One self-contained directory per experiment (33 total): code, results, notes, and a 2 000-word density-matched README per experiment. experiments/README.md is the index. |
| tests/ | Invariant tests for the shared library, most importantly the causal-normalisation no-leak test. |

Getting started

git clone https://github.com/rayarbro1/gosplan.git
cd gosplan
pip install -e ".[dev]"
make test

Then download the Kraken data (link above) and read docs/RESEARCH_MANIFEST.md for the overall arc, or jump into a specific experiment directory.

Causal-trading invariant

Every component in this repository that makes a trading decision must be strictly causal: it may use only past and present data, never future data. This invariant is enforced by the test in tests/unit/test_features_causality.py, which asserts that perturbing future values cannot change current normalised feature values. Most of the project's documented failures trace to violations of this invariant; please preserve it.
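
The idea behind a perturbation test of this kind can be sketched as follows. This is an illustration of the technique, not the contents of tests/unit/test_features_causality.py. The trailing z-score stands in for whatever normalisation the shared library uses, and all function names are hypothetical.

```python
import random
import statistics
from typing import Callable, List, Sequence

Normaliser = Callable[[Sequence[float]], List[float]]


def causal_zscore(values: Sequence[float], window: int = 5) -> List[float]:
    """Z-score each point against the trailing window ending at that point."""
    out = []
    for t in range(len(values)):
        past = values[max(0, t - window + 1): t + 1]   # past and present only
        mean = statistics.fmean(past)
        std = statistics.pstdev(past)
        out.append((values[t] - mean) / std if std > 0 else 0.0)
    return out


def leaky_zscore(values: Sequence[float]) -> List[float]:
    """Counter-example: full-series statistics leak the future into every row."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std if std > 0 else 0.0 for v in values]


def has_no_future_leak(normalise: Normaliser, values: Sequence[float],
                       cut: int) -> bool:
    """Perturb everything after index `cut`; outputs up to `cut` must not move."""
    rng = random.Random(0)
    perturbed = list(values)
    for t in range(cut + 1, len(perturbed)):
        perturbed[t] += rng.uniform(-10.0, 10.0)
    before = normalise(values)[: cut + 1]
    after = normalise(perturbed)[: cut + 1]
    return before == after


series = [100.0 + 0.5 * t + (t % 3) for t in range(40)]
print(has_no_future_leak(causal_zscore, series, cut=20))
print(has_no_future_leak(leaky_zscore, series, cut=20))
```

The trailing version passes because each output depends only on values at or before its own index. The full-series version fails because changing later values shifts the mean and standard deviation applied to earlier rows.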

Citation

See CITATION.cff. GitHub's "Cite this repository" button will render the metadata. If you cite the work, please also reference docs/RESEARCH_MANIFEST.md.

Contributing

The repository is primarily a frozen research record, but bug reports, reproduction confirmations, and pull requests that improve documentation or strengthen the test suite are welcome. See CONTRIBUTING.md and CODE_OF_CONDUCT.md. Please open an issue first for anything structural.

Acknowledgements

Thanks to Austin Macdonald for the conversations that shaped how this project was opened up — what to include, what to cut, and how to frame seven months of negative-results research for a public audience. The rebuild itself was orchestrated against the STAMPED principles for reproducible research objects (Self-contained · Tracked · Actionable · Modular · Portable · Ephemeral · Distributable), drawn from the stamped-principles/stamped-paper manuscript, with the structural migration executed using Claude Code.
