NFL ML Predictions

Full-stack NFL forecasting workspace with a FastAPI backend, a React/Vite frontend, a dataset build pipeline, and a model training pipeline.

Canonical Deploy Targets

Frontend: Vercel project nfl-ml-predictions
Production frontend alias: https://new-nfl-predict.vercel.app
Backend: Heroku app nfl-predict
Canonical backend origin: https://nfl-predict-ecf5a5bd34fe.herokuapp.com

Deploy intent:

Vercel should build from frontend/
Heroku should serve the FastAPI backend with the buildpack + Procfile flow
Production CORS should allow the canonical frontend origin plus Vercel preview deployments under *.vercel.app
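
The CORS intent above can be sketched as one exact origin plus one pattern for preview deployments. This is a minimal illustration, not the repo's actual configuration: the regex and the helper name is_allowed_origin are assumptions. FastAPI's CORSMiddleware accepts allow_origins and allow_origin_regex, so the same two values could be passed there.

```python
import re

# Canonical production frontend origin, taken from the deploy targets above.
CANONICAL_FRONTEND = "https://new-nfl-predict.vercel.app"

# Assumed pattern: any HTTPS subdomain of vercel.app counts as a preview.
PREVIEW_ORIGIN_RE = re.compile(r"https://[a-z0-9-]+\.vercel\.app")


def is_allowed_origin(origin: str) -> bool:
    """Return True for the canonical frontend or a Vercel preview origin."""
    if origin == CANONICAL_FRONTEND:
        return True
    # fullmatch rejects look-alikes such as https://x.vercel.app.evil.com
    return PREVIEW_ORIGIN_RE.fullmatch(origin) is not None
```

A pattern this broad admits any project hosted on vercel.app; narrowing it to the project's own preview prefix would be tighter.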

What This Repo Actually Does

Serves NFL schedule, health, status, prediction, and history endpoints from backend/main.py.
Stores user-scoped prediction history in SQLite first, with JSON files as a fallback.
Builds cleaned training datasets into backend/data/datasets/.
Trains score and win-probability models and promotes bundles for serving.
Ships a React app with a protected dashboard, history view, and status page.

Quick Start

1. Install dependencies

python -m pip install -r requirements.txt
cd frontend
npm install
cd ..

2. Build the canonical dataset

python backend/builddataset.py --start 2018 --end 2025 --out-dir backend/data/datasets

What this writes:

A dated run folder in backend/data/datasets/runs/<timestamp>/
A promoted clean CSV in backend/data/datasets/
backend/data/datasets/latest_dataset.json

3. Train models

python backend/train_models.py

What this writes by default:

Promoted artifacts in backend/models/
A staging bundle in backend/models/staging/<run_id>/
metadata.json, training_report.json, and run_summary.json
A dated mirror in backend/YYYYMMDD/models/ when training uses the default output directory

Important runtime note:

Training still writes to backend/models/ by default.
Serving prefers MODELS_DIR when set, then backend/data/models/current, then backend/data/models, then packaged fallbacks, and finally backend/models.
That split is intentional so deployments can serve a promoted bundle while local training experiments stay isolated.
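
The serving preference order above could be expressed roughly as follows. This is a sketch under assumptions, not the code in backend/app/core/settings.py: the function name resolve_models_dir is hypothetical, "exists as a directory" stands in for whatever completeness check the real loader runs, and the packaged fallback locations are passed in because this README does not say where they live.

```python
import os
from pathlib import Path
from typing import Iterable, Mapping, Optional


def resolve_models_dir(
    repo_root: Path,
    env: Mapping[str, str] = os.environ,
    packaged_fallbacks: Iterable[Path] = (),
) -> Optional[Path]:
    """Return the first existing directory in the documented preference order."""
    candidates = []
    override = env.get("MODELS_DIR")
    if override:
        # Assumption: an override that does not exist falls through to the defaults.
        candidates.append(Path(override))
    backend = repo_root / "backend"
    candidates += [
        backend / "data" / "models" / "current",
        backend / "data" / "models",
        *packaged_fallbacks,  # locations not documented here; caller supplies them
        backend / "models",   # default training output, checked last
    ]
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    return None
```

Because backend/models is checked last, a fresh local training run does not displace a promoted bundle under backend/data/models/current.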

4. Start the backend

uvicorn backend.main:app --reload --host 127.0.0.1 --port 8000

5. Start the frontend

cd frontend
npm run dev

Open http://localhost:3000.

Runtime Behavior That Matters

Prediction readiness is allowed to degrade

The backend now boots even if models are missing or incompatible.

/health, /status/models, /schedule, and /history still come up.
/predict returns 503 with structured blockers when the active bundle is not ready.
This makes deployments diagnosable instead of failing hard during startup.
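
The degraded-readiness behavior above can be illustrated without any web framework. In this sketch the class ModelReadiness, the function predict_response, and the body keys "detail" and "blockers" are all hypothetical; only the 503 status and the idea of structured blockers come from this README.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class ModelReadiness:
    """Hypothetical readiness record computed once at startup."""

    blockers: List[str] = field(default_factory=list)

    @property
    def ready(self) -> bool:
        return not self.blockers


def predict_response(
    readiness: ModelReadiness, payload: Dict[str, Any]
) -> Tuple[int, Dict[str, Any]]:
    """Return (status_code, body) for a /predict call."""
    if not readiness.ready:
        # Startup already succeeded; only this endpoint reports the problem.
        return 503, {"detail": "models_not_ready", "blockers": list(readiness.blockers)}
    # Placeholder: a real handler would run the score and win-probability models.
    return 200, {"home_team": payload["home_team"], "away_team": payload["away_team"]}
```

The point of the design is that a missing bundle becomes a readable response on one route rather than a crash loop at boot.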

Schedule loading is queryable and postseason-safe

GET /schedule?season=<year>&week=<week> returns a specific slate.
GET /schedule/next-week remains the compatibility route for "next slate".
When no future regular-season game exists, the backend falls back to the latest available slate, including postseason weeks.
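
The fallback rule above can be sketched as a small selection function. The Game record, its field names, and the function next_slate are assumptions for illustration; the backend's actual schedule model may differ.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Game:
    season: int
    week: int
    game_date: date
    postseason: bool = False


def next_slate(games: List[Game], today: date) -> Optional[Tuple[int, int]]:
    """Return (season, week) for the slate to show, or None if no games are loaded."""
    upcoming = [g for g in games if not g.postseason and g.game_date >= today]
    if upcoming:
        first = min(upcoming, key=lambda g: g.game_date)
        return first.season, first.week
    if not games:
        return None
    # No future regular-season game: fall back to the latest available slate,
    # which may be a postseason week.
    latest = max(games, key=lambda g: g.game_date)
    return latest.season, latest.week
```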

History is user-scoped

The frontend sends X-User-Id, and the backend uses that to isolate prediction history.

Primary store: SQLite-backed history and summary metrics
Fallback: JSON ledgers under backend/Predictions/users/<user-storage-key>/
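
The primary-then-fallback write path described above might look like the sketch below. It is not the code in prediction_store.py or sqlite_store.py: the table layout, the ledger file name history.json, and the hash-based derivation of the user storage key are all assumptions made for illustration.

```python
import hashlib
import json
import sqlite3
from pathlib import Path
from typing import Any, Dict


def storage_key(user_id: str) -> str:
    """Assumed derivation: a short hash keeps raw emails out of folder names."""
    return hashlib.sha256(user_id.strip().lower().encode("utf-8")).hexdigest()[:16]


def save_prediction(
    user_id: str, record: Dict[str, Any], db_path: Path, json_root: Path
) -> str:
    """Write one record to SQLite, or to a per-user JSON ledger on failure.

    Returns "sqlite" or "json" to show which store accepted the write.
    """
    key = storage_key(user_id)
    try:
        conn = sqlite3.connect(db_path)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "CREATE TABLE IF NOT EXISTS history (user_key TEXT, payload TEXT)"
                )
                conn.execute(
                    "INSERT INTO history (user_key, payload) VALUES (?, ?)",
                    (key, json.dumps(record)),
                )
        finally:
            conn.close()
        return "sqlite"
    except sqlite3.Error:
        # Fallback: append to a JSON ledger under <json_root>/<user-storage-key>/
        ledger = json_root / key / "history.json"
        ledger.parent.mkdir(parents=True, exist_ok=True)
        entries = json.loads(ledger.read_text()) if ledger.exists() else []
        entries.append(record)
        ledger.write_text(json.dumps(entries, indent=2))
        return "json"
```

Every row and every ledger folder is keyed by the value derived from X-User-Id, so one user's history never appears in another user's queries.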

Frontend Architecture In One Minute

frontend/src/App.jsx creates the auth session and shared prediction state once.
frontend/src/hooks/usePredictionState.js owns schedule, health, history, summary, logos, and prediction maps.
frontend/src/components/DashBoard/Dashboard.jsx consumes that shared state instead of shadowing it locally.
frontend/src/api/client.js is the active transport and compatibility layer.
frontend/src/api/fetch.js is a legacy helper kept for older experiments and is not the main app path.

Key Endpoints

Health and status

GET /health
GET /status/overview
GET /status/models
GET /status/runtime

Schedule and prediction

GET /schedule
GET /schedule/next-week
GET /teams/logos
POST /predict
POST /predict/explain
POST /llm/chat

History

GET /history?limit=N
GET /history/summary

Admin

When ENABLE_ADMIN=true:

POST /admin/reload
POST /admin/retrain
POST /admin/promote/{job_id}

Repository Map

backend/
  main.py                      FastAPI app and runtime orchestration
  builddataset.py              Canonical dataset build entrypoint
  train_models.py              Canonical training entrypoint
  prediction_store.py          User-scoped history persistence
  sqlite_store.py              SQLite-backed prediction history
  app/core/settings.py         Environment settings and path resolution
  data/
    datasets/
      latest_dataset.json
      runs/<timestamp>/
    models/
      current/

frontend/
  src/
    App.jsx
    api/client.js
    hooks/usePredictionState.js
    components/DashBoard/Dashboard.jsx
    components/HistoryPage.jsx
    pages/StatsPage.jsx
  public/
    schedules/

Useful Docs

Verification

Recommended checks after backend or frontend changes:

python -m pytest backend/tests -q
cd frontend && npm test -- --run && npm run build

Runtime smoke checks (the ^ line continuations are Windows cmd syntax; in bash or zsh use \ instead):

curl http://127.0.0.1:8000/health
curl http://127.0.0.1:8000/status/overview -H "X-User-Id: analyst@example.com"
curl -X POST http://127.0.0.1:8000/predict ^
  -H "Content-Type: application/json" ^
  -d "{\"home_team\":\"KC\",\"away_team\":\"BUF\",\"season\":2025,\"week\":15}"
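
For a shell-independent version of the POST /predict check, the same request can be built with the Python standard library. This is a sketch: the helper names build_predict_request and run_smoke_check are hypothetical, and the payload simply mirrors the curl command above.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"


def build_predict_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the same POST /predict call as the curl command above."""
    payload = {"home_team": "KC", "away_team": "BUF", "season": 2025, "week": 15}
    return urllib.request.Request(
        f"{base_url}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_smoke_check(base_url: str = BASE_URL) -> None:
    """Send the request; requires the backend to be running locally.

    A 503 raises urllib.error.HTTPError, whose body carries the blockers.
    """
    with urllib.request.urlopen(build_predict_request(base_url), timeout=10) as resp:
        print(resp.status, json.loads(resp.read()))
```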

Troubleshooting

/predict returns 503

Check /status/models for readiness blockers.
Confirm MODELS_DIR points at a complete bundle.
If the bundle was trained under a different scikit-learn version, retrain or align the runtime environment.
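
A version mismatch can be checked before loading any model files. The sketch below assumes metadata.json records the training-time version under a key named "sklearn_version"; that key and the function name sklearn_blockers are hypothetical, so adjust them to whatever the bundle actually stores.

```python
import json
from importlib import metadata
from pathlib import Path
from typing import List, Optional


def sklearn_blockers(bundle_dir: Path, runtime_version: Optional[str] = None) -> List[str]:
    """Compare the bundle's recorded scikit-learn version with the runtime's."""
    meta_path = bundle_dir / "metadata.json"
    if not meta_path.is_file():
        return [f"metadata.json not found in {bundle_dir}"]
    trained = json.loads(meta_path.read_text()).get("sklearn_version")  # hypothetical key
    if trained is None:
        return ["metadata.json does not record a scikit-learn version"]
    if runtime_version is None:
        try:
            # Reads the installed package version without importing sklearn.
            runtime_version = metadata.version("scikit-learn")
        except metadata.PackageNotFoundError:
            return ["scikit-learn is not installed in this environment"]
    if trained != runtime_version:
        return [f"bundle trained with scikit-learn {trained}, runtime has {runtime_version}"]
    return []
```

An empty list means the versions match; any other result is a human-readable blocker in the same spirit as the ones /status/models reports.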

The frontend loads but some pages look empty

Confirm Vercel VITE_API_BASE_URL points at the canonical Heroku backend URL.
In local dev, prefer VITE_API_DEV=http://127.0.0.1:8000.
Older deployments may not expose /history/summary or queryable /schedule; the frontend now falls back, but a backend redeploy is still the clean fix.

Training seems to use the wrong dataset

Inspect backend/data/datasets/latest_dataset.json.
Override explicitly when needed:

python backend/train_models.py --data backend/data/datasets/<your_clean_dataset>.csv

Local schedule lookups return nothing

Make sure backend/data/Nfl_schedule_2025.csv exists, or set SCHEDULE_PATH.
The frontend also ships fallback CSVs under frontend/public/schedules/ for compatibility with older backends.
