A deep learning pipeline for forecasting short-term stock movements using technical indicators, Monte Carlo dropout uncertainty, and out-of-sample backtesting.
Example output: probability-based strategy (neon green), SPY buy-and-hold (white), random baselines (faint white), and ±1σ / ±3σ uncertainty around the random mean (gray). Plots are generated by run_backtest_v4.py after forecasts exist in forecasts/.
This project provides an end-to-end workflow:
- Load and process per-ticker indicator CSVs (TrainingData/processor.py, ticker list from TrainingData/stockList.csv)
- Train a Conv1D + LSTM model with MC dropout in run_forecast_v4.ipynb
- Write forecasts to forecasts/ and split metadata (split_info.json, oos_start_date.txt) for aligned backtests
- Run run_backtest_v4.py to simulate trading on the test / OOS window only, write videos/backtest_metrics.json, and export PNGs / MP4s to videos/ and output_plots/
The design emphasizes realistic evaluation (no training dates in the backtest), uncertainty-aware signals, and extension via TrainingData/featuresPy/. Optional raw feeds (insider, sentiment, Fear & Greed) are merged in TrainingData/processor.py when the corresponding files exist.
├── forecasts/ # *_forecast.csv outputs + split_info.json / oos_start_date.txt
├── cache/ # Cached preprocessed arrays
├── output_plots/ # Notebook + backtest exports (training curves, feature importance, extra PNGs/MP4s)
├── videos/ # Backtest PNGs, MP4s, and backtest_metrics.json from run_backtest_v4.py
├── repo_photos/ # Images for this README (commit your PNG here for GitHub)
├── test_ideas/ # Older v3 experiments (classification notebook + script); primary flow is v4
├── config.example.json # Copy to config.json and set ALPHA_VANTAGE_KEY if using downloader.py
├── TrainingData/
│ ├── indicators_data/
│ │ ├── raw/ # Raw inputs: prices, fear_greed.csv, insiderBuying/, sentiment/
│ │ └── processed/
│ │   ├── SPY-VIX/ # Benchmark series (e.g. SPY for backtest plot)
│ │   └── stocksData/ # One *_processed.csv per ticker
│ ├── featuresPy/ # Extra feature modules (e.g. fear_greed_correlation)
│ ├── stockList.csv # Tickers processed by processor.py
│ ├── processor.py # Main feature pipeline (merges optional raw feeds)
│ └── downloader.py # Optional data download helpers (Alpha Vantage)
├── run_forecast_v4.ipynb # Train / forecast pipeline (primary notebook)
├── run_backtest_v4.py # OOS backtest + metrics JSON + plots / videos
├── requirements.txt
└── README.md
close, YesterdayClose, YesterdayOpenLogR, YesterdayHighLogR, YesterdayLowLogR, YesterdayVolumeLogR, YesterdayCloseLogR, MA10, MA20, MA30, DayOfWeek, DayOfMonth, MonthNumber, EMA10, EMA30, RSI, MACD, MACD_Signal, BollingerUpper, BollingerLower, Volatility_10, Volatility_20, Volatility_30, OBV, ZScore, optional insider and sentiment columns, optional Fear & Greed (fear_greed) plus rolling fear_greed_correlation (stock vs index, from featuresPy/fear_greed_correlation.py when raw/fear_greed.csv is present), and gap / volatility / momentum / skew / intraday / sentiment-change fields. See processor.py for the authoritative column list.
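
As a rough illustration of how a few of these columns could be derived from a close series, here is a minimal pandas sketch. The function name `add_basic_indicators`, the window lengths, and the 2-standard-deviation Bollinger width are conventional choices assumed for the example; processor.py remains the authoritative definition.

```python
import numpy as np
import pandas as pd


def add_basic_indicators(df: pd.DataFrame) -> pd.DataFrame:
    """Add a handful of the listed indicators to a frame with a 'close' column.

    Hypothetical helper for illustration; not the code in processor.py.
    """
    out = df.copy()
    close = out["close"]

    # Previous day's close-to-close log return: computed at t, then shifted
    # one row so that row t only carries information available at t-1.
    out["YesterdayCloseLogR"] = np.log(close / close.shift(1)).shift(1)

    # Simple and exponential moving averages over 10 rows.
    out["MA10"] = close.rolling(10).mean()
    out["EMA10"] = close.ewm(span=10, adjust=False).mean()

    # Bollinger bands: 20-row mean plus/minus 2 rolling standard deviations.
    ma20 = close.rolling(20).mean()
    sd20 = close.rolling(20).std()
    out["BollingerUpper"] = ma20 + 2 * sd20
    out["BollingerLower"] = ma20 - 2 * sd20

    # Rolling z-score of the close against the same 20-row window.
    out["ZScore"] = (close - ma20) / sd20
    return out


# Synthetic prices, used only to exercise the helper.
rng = np.random.default_rng(0)
prices = pd.DataFrame({"close": 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 60)))})
features = add_basic_indicators(prices)
```

Rolling columns are NaN until their window fills, so the first rows of each ticker would typically be dropped before training.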
| Feature | Description |
|---|---|
| Conv1D + LSTM | Captures local patterns and sequence structure in windowed inputs |
| Monte Carlo dropout | Uncertainty (std) around probability forecasts |
| Train / val / test split | Notebook writes OOS start; backtest uses test period only |
| Walk-forward–style windows | Rolling windows over processed history |
| Batch data generator | Efficient training from cached tensors |
| Confidence threshold | Configurable probability gate for “buy” signals (e.g. in notebook CONFIG) |
| Fear & Greed (optional) | Market index + rolling correlation feature when raw/fear_greed.csv is available |
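
To make the Monte Carlo dropout row concrete, the sketch below keeps dropout active at inference time and summarizes repeated stochastic forward passes by their mean and standard deviation. It is a toy dense network in NumPy with random, hypothetical weights, chosen so the example is self-contained; the notebook's actual model is the Conv1D + LSTM described above.

```python
import numpy as np


def mc_dropout_predict(x, w_hidden, w_out, rate=0.2, n_samples=25, seed=0):
    """Return (mean, std) of sigmoid outputs over stochastic forward passes.

    Dropout stays on at inference; each pass draws a fresh mask.
    """
    rng = np.random.default_rng(seed)
    hidden = np.tanh(x @ w_hidden)  # shape (batch, hidden)
    samples = []
    for _ in range(n_samples):
        # Inverted dropout: zero units with probability `rate`, rescale the rest.
        mask = rng.random(hidden.shape) >= rate
        dropped = hidden * mask / (1.0 - rate)
        logits = dropped @ w_out  # shape (batch,)
        samples.append(1.0 / (1.0 + np.exp(-logits)))
    samples = np.stack(samples)  # shape (n_samples, batch)
    return samples.mean(axis=0), samples.std(axis=0)


rng = np.random.default_rng(1)
x = rng.normal(size=(4, 8))          # 4 toy inputs with 8 features each
w_hidden = rng.normal(size=(8, 16))  # hypothetical weights, not a trained model
w_out = rng.normal(size=16)
prob_mean, prob_std = mc_dropout_predict(x, w_hidden, w_out)
```

The standard deviation is what the table calls the uncertainty around the probability forecast; with `rate=0.0` every pass is identical and it collapses to zero.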
    pip install -r requirements.txt

Place processed stock CSVs under TrainingData/indicators_data/processed/stocksData/ (or run the optional download + processor steps below). Ensure TrainingData/stockList.csv lists the tickers you want.
Open and run run_forecast_v4.ipynb from the project root. It trains the model, writes forecasts under forecasts/, and can export plots to output_plots/.
The backtest reads forecasts/split_info.json (or forecasts/oos_start_date.txt / env BACKTEST_OOS_START) so it only trades on dates on or after the OOS start.
    python run_backtest_v4.py

Environment: BACKTEST_OOS_START (optional) overrides the OOS start date; otherwise the script uses forecasts/split_info.json or forecasts/oos_start_date.txt written by the notebook. If none of these is available, the script exits so that you do not accidentally backtest on training dates.
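
The lookup described here could be sketched as follows. This is an assumption-laden illustration, not the code in run_backtest_v4.py: the function names, the JSON key `oos_start_date`, and the precedence of split_info.json over oos_start_date.txt are all hypothetical; only the environment override and the exit-on-missing behaviour are stated in this README.

```python
import json
import os
from pathlib import Path


def resolve_oos_start(forecast_dir="forecasts", env=None):
    """Return the OOS start date string, or None if no source provides one."""
    env = os.environ if env is None else env

    # 1. Explicit override from the environment.
    override = env.get("BACKTEST_OOS_START", "").strip()
    if override:
        return override

    # 2. Split metadata written by the notebook (key name is an assumption).
    info_path = Path(forecast_dir) / "split_info.json"
    if info_path.exists():
        value = json.loads(info_path.read_text()).get("oos_start_date")
        if value:
            return str(value)

    # 3. Plain-text fallback.
    txt_path = Path(forecast_dir) / "oos_start_date.txt"
    if txt_path.exists():
        value = txt_path.read_text().strip()
        if value:
            return value
    return None


def require_oos_start(forecast_dir="forecasts", env=None):
    """Exit rather than risk backtesting on training dates."""
    start = resolve_oos_start(forecast_dir, env)
    if start is None:
        raise SystemExit("No OOS start date found; refusing to run the backtest.")
    return start
```

Failing closed when no date is found is the important design choice: a silent default would let training dates leak into the reported metrics.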
- Train the model in run_forecast_v4.ipynb and run the model export cell (writes to models/).
- Run the live scoring:

    python run_live_signals.py --min-accepted 0.2 --std-factor 0.0 --mc-samples 25

Note: --min-accepted is the minimum predicted probability required for a BUY. For example, --min-accepted 0.5 means a stock is bought only when the model's confidence that it will give "above-normal" returns is at least 50%. --std-factor is the multiplier applied to the forecast uncertainty when it is compared against --min-accepted. As the model becomes more accurate, --std-factor matters more, because the goal is then to be very certain that the probability really does clear the minimum.
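
One plausible reading of how the two thresholds interact is sketched below: the mean probability is penalised by `std_factor` times the MC dropout standard deviation before being compared with `min_accepted`. The function `is_buy_candidate` and this exact rule are assumptions for illustration; the rule actually used by run_live_signals.py is not shown in this README.

```python
def is_buy_candidate(prob_mean, prob_std, min_accepted=0.2, std_factor=0.0):
    """Assumed gate: the uncertainty-penalised probability must clear the threshold."""
    return (prob_mean - std_factor * prob_std) >= min_accepted


# With std_factor = 0 only the mean matters: 0.55 >= 0.5.
print(is_buy_candidate(0.55, 0.10, min_accepted=0.5, std_factor=0.0))  # True

# With std_factor = 1 the same forecast is rejected: 0.55 - 0.10 < 0.5.
print(is_buy_candidate(0.55, 0.10, min_accepted=0.5, std_factor=1.0))  # False
```

Under this reading, raising `std_factor` trades fewer signals for higher certainty in each one.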
Outputs
- signals/live_scores_YYYY-MM-DD.csv: per-ticker live scores and status (buy_candidate or no_buy).
- signals/live_decision_YYYY-MM-DD.csv: one final recommendation row (BUY or NO_BUY).
- If data is stale or missing, the script refreshes it by running TrainingData/downloader.py and TrainingData/processor.py.
After you add or change optional features in processed CSVs, re-run run_forecast_v4.ipynb so the model trains on the updated columns (see EXCLUDED_COLS / feature list in the notebook).
TrainingData/downloader.py runs the fetch scripts under TrainingData/featuresPy/ (e.g. stockScrapper.py, markets.py, insiderbuying.py, sentiment.py). Those scripts read config.json in the project root for ALPHA_VANTAGE_KEY. Copy config.example.json to config.json, add your key from alphavantage.co, then:
    python TrainingData/downloader.py
    python TrainingData/processor.py

Fear & Greed (optional): Add TrainingData/indicators_data/raw/fear_greed.csv with at least date and fear_greed columns. Without it, the pipeline fills a neutral default for fear_greed and skips the correlation helper where not applicable.
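
A minimal sketch of the optional merge, assuming pandas: when a Fear & Greed frame is supplied it is joined on date and a rolling correlation is added; otherwise a neutral constant is filled in. The helper name, the neutral value of 50, the 20-row window, and the choice to correlate log returns with daily index changes are all assumptions made for this example, not the behaviour of featuresPy/fear_greed_correlation.py.

```python
import numpy as np
import pandas as pd


def add_fear_greed_features(stock, fear_greed=None, window=20, neutral=50.0):
    """Merge a Fear & Greed series onto a stock frame (hypothetical helper)."""
    out = stock.copy()
    if fear_greed is None:
        # No raw file: fill an assumed neutral default and skip the correlation.
        out["fear_greed"] = neutral
        return out

    out = out.merge(fear_greed[["date", "fear_greed"]], on="date", how="left")
    # Carry the last known index value forward over gaps.
    out["fear_greed"] = out["fear_greed"].ffill().fillna(neutral)

    # Rolling correlation between stock log returns and index changes.
    returns = np.log(out["close"]).diff()
    index_change = out["fear_greed"].diff()
    out["fear_greed_correlation"] = returns.rolling(window).corr(index_change)
    return out


# Synthetic data, used only to exercise the helper.
dates = pd.date_range("2024-01-01", periods=40, freq="B")
rng = np.random.default_rng(0)
stock = pd.DataFrame(
    {"date": dates, "close": 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 40)))}
)
fg = pd.DataFrame({"date": dates, "fear_greed": rng.uniform(20, 80, 40)})

with_fg = add_fear_greed_features(stock, fg)
without_fg = add_fear_greed_features(stock)
```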
To show the screenshot on GitHub, add your exported file as:
repo_photos/random_vs_prob_strategy_uncertainty.png
(e.g. copy from videos/random_vs_prob_strategy_uncertainty.png after a successful backtest).
