Full-stack NFL forecasting workspace with a FastAPI backend, a React/Vite frontend, a dataset build pipeline, and a model training pipeline.
- Frontend: Vercel project `nfl-ml-predictions`
- Production frontend alias: https://new-nfl-predict.vercel.app
- Backend: Heroku app `nfl-predict`
- Canonical backend origin: https://nfl-predict-ecf5a5bd34fe.herokuapp.com

Deploy intent:
- Vercel should build from `frontend/`
- Heroku should serve the FastAPI backend with the buildpack + `Procfile` flow
- Production CORS should allow the canonical frontend origin plus `.vercel.app` previews

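The CORS rule above can be sketched as a small origin check. This is a minimal illustration, not the backend's actual configuration: the function name `is_allowed_origin` and the preview pattern are assumptions, and the pattern deliberately accepts any single-label `https` subdomain of `vercel.app`, which may be broader than the project wants.

```python
import re

# Canonical frontend origin, taken from the deployment notes above.
CANONICAL_FRONTEND = "https://new-nfl-predict.vercel.app"

# Assumption: a preview is any https origin of the form <label>.vercel.app.
PREVIEW_ORIGIN_RE = re.compile(r"https://[a-z0-9-]+\.vercel\.app")


def is_allowed_origin(origin: str) -> bool:
    """Return True for the canonical frontend or a Vercel preview origin."""
    if origin == CANONICAL_FRONTEND:
        return True
    # fullmatch rejects look-alikes such as https://x.vercel.app.evil.com
    return PREVIEW_ORIGIN_RE.fullmatch(origin) is not None
```

In a FastAPI app, the same idea is usually expressed by passing `allow_origins` and `allow_origin_regex` to `CORSMiddleware` rather than checking origins by hand.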
- Serves NFL schedule, health, status, prediction, and history endpoints from `backend/main.py`.
- Stores user-scoped prediction history in SQLite first, with JSON files as a fallback.
- Builds cleaned training datasets into `backend/data/datasets/`.
- Trains score and win-probability models and promotes bundles for serving.
- Ships a React app with a protected dashboard, history view, and status page.
python -m pip install -r requirements.txt
cd frontend
npm install
cd ..

python backend/builddataset.py --start 2018 --end 2025 --out-dir backend/data/datasets

What this writes:
- A dated run folder in `backend/data/datasets/runs/<timestamp>/`
- A promoted clean CSV in `backend/data/datasets/`
- `backend/data/datasets/latest_dataset.json`

python backend/train_models.py

What this writes by default:
- Promoted artifacts in `backend/models/`
- A staging bundle in `backend/models/staging/<run_id>/`
- `metadata.json`, `training_report.json`, and `run_summary.json`
- A dated mirror in `backend/YYYYMMDD/models/` when training uses the default output directory

Important runtime note:
- Training still writes to `backend/models/` by default.
- Serving prefers `MODELS_DIR` when set, then `backend/data/models/current`, then `backend/data/models`, then packaged fallbacks, and finally `backend/models`.
- That split is intentional so deployments can serve a promoted bundle while local training experiments stay isolated.

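The serving lookup order above can be sketched as follows. This is an illustration under stated assumptions, not the code in `backend/app/core/settings.py`: the function names are hypothetical, the packaged fallbacks are passed in because their locations are not documented here, and "first existing directory wins" is assumed (the real resolver may also check that a bundle is complete).

```python
from __future__ import annotations

from pathlib import Path
from typing import Mapping, Optional, Sequence


def candidate_model_dirs(
    env: Mapping[str, str],
    repo_root: Path,
    packaged: Sequence[Path] = (),
) -> list[Path]:
    """List model directories in the documented preference order."""
    candidates: list[Path] = []
    override = env.get("MODELS_DIR")
    if override:
        candidates.append(Path(override))
    candidates.append(repo_root / "backend" / "data" / "models" / "current")
    candidates.append(repo_root / "backend" / "data" / "models")
    candidates.extend(packaged)  # hypothetical packaged fallback locations
    candidates.append(repo_root / "backend" / "models")
    return candidates


def resolve_models_dir(
    env: Mapping[str, str],
    repo_root: Path,
    packaged: Sequence[Path] = (),
) -> Optional[Path]:
    """Return the first candidate that exists as a directory, else None."""
    for candidate in candidate_model_dirs(env, repo_root, packaged):
        if candidate.is_dir():
            return candidate
    return None
```

Calling `resolve_models_dir(os.environ, Path.cwd())` from the repository root would mirror the lookup a local server performs, under the assumptions above.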
uvicorn backend.main:app --reload --host 127.0.0.1 --port 8000

cd frontend
npm run dev

Open http://localhost:3000.

The backend now boots even if models are missing or incompatible.
- `/health`, `/status/models`, `/schedule`, and `/history` still come up.
- `/predict` returns `503` with structured blockers when the active bundle is not ready.
- This makes deployments diagnosable instead of failing hard during startup.

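The degraded-startup behavior above can be sketched without any web framework. Everything here is hypothetical: the `BundleStatus` class, the `predict_response` function, and the shape of the 503 body are illustrative assumptions, not the backend's actual response schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class BundleStatus:
    """Hypothetical snapshot of the active model bundle."""

    loaded: bool = False
    blockers: List[str] = field(default_factory=list)

    @property
    def ready(self) -> bool:
        # Ready only when the bundle loaded and nothing blocks serving.
        return self.loaded and not self.blockers


def predict_response(
    status: BundleStatus, payload: Dict[str, Any]
) -> Tuple[int, Dict[str, Any]]:
    """Return (status_code, body); 503 with blockers when not ready."""
    if not status.ready:
        blockers = status.blockers or ["model bundle not loaded"]
        return 503, {"error": "models_not_ready", "blockers": blockers}
    # Placeholder for real inference: echo the matchup only.
    return 200, {
        "home_team": payload["home_team"],
        "away_team": payload["away_team"],
    }
```

The point of the design is that the readiness check runs per request, so the process can start and report its blockers instead of crashing while models load.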
- `GET /schedule?season=<year>&week=<week>` returns a specific slate.
- `GET /schedule/next-week` remains the compatibility route for "next slate".
- When no future regular-season game exists, the backend falls back to the latest available slate, including postseason weeks.

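The "next slate" fallback above can be sketched like this. The `Game` record, the `"REG"`/`"POST"` labels, and the `next_slate` function are assumptions made for illustration; the backend's schedule schema is not shown in this document.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Game:
    season: int
    week: int
    game_date: date
    game_type: str  # hypothetical labels: "REG" or "POST"


def next_slate(games: List[Game], today: date) -> Optional[Tuple[int, int]]:
    """Pick (season, week) of the next regular-season game, else the latest slate."""
    upcoming = [
        g for g in games if g.game_type == "REG" and g.game_date >= today
    ]
    if upcoming:
        first = min(upcoming, key=lambda g: g.game_date)
        return first.season, first.week
    if not games:
        return None
    # Fallback: latest available slate, which may be a postseason week.
    latest = max(games, key=lambda g: g.game_date)
    return latest.season, latest.week
```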
The frontend sends X-User-Id, and the backend uses that to isolate prediction history.
- Primary store: SQLite-backed history and summary metrics
- Fallback: JSON ledgers under `backend/Predictions/users/<user-storage-key>/`

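The SQLite-first, JSON-fallback write path can be sketched as below. This is not the code in `prediction_store.py` or `sqlite_store.py`: the table layout, the `ledger.jsonl` filename, and the `save_prediction` function are hypothetical, and the sketch assumes `user_key` is already a filesystem-safe storage key (how `X-User-Id` maps to `<user-storage-key>` is not specified here).

```python
import json
import sqlite3
from pathlib import Path
from typing import Any, Dict


def save_prediction(
    user_key: str, record: Dict[str, Any], db_path: Path, ledger_root: Path
) -> str:
    """Write to SQLite; on any SQLite error append to a per-user JSON ledger.

    Returns "sqlite" or "json" to report which store accepted the write.
    """
    payload = json.dumps(record)
    try:
        conn = sqlite3.connect(str(db_path))
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "CREATE TABLE IF NOT EXISTS predictions "
                    "(user_key TEXT NOT NULL, payload TEXT NOT NULL)"
                )
                conn.execute(
                    "INSERT INTO predictions (user_key, payload) VALUES (?, ?)",
                    (user_key, payload),
                )
        finally:
            conn.close()
        return "sqlite"
    except sqlite3.Error:
        # Fallback: one JSON document per line under the user's folder.
        user_dir = ledger_root / user_key
        user_dir.mkdir(parents=True, exist_ok=True)
        with open(user_dir / "ledger.jsonl", "a", encoding="utf-8") as fh:
            fh.write(payload + "\n")
        return "json"
```

Scoping every row and every ledger folder by the user key is what keeps one user's history from appearing in another's.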
- `frontend/src/App.jsx` creates the auth session and shared prediction state once.
- `frontend/src/hooks/usePredictionState.js` owns schedule, health, history, summary, logos, and prediction maps.
- `frontend/src/components/DashBoard/Dashboard.jsx` consumes that shared state instead of shadowing it locally.
- `frontend/src/api/client.js` is the active transport and compatibility layer.
- `frontend/src/api/fetch.js` is a legacy helper kept for older experiments and is not the main app path.

- `GET /health`
- `GET /status/overview`
- `GET /status/models`
- `GET /status/runtime`

- `GET /schedule`
- `GET /schedule/next-week`
- `GET /teams/logos`
- `POST /predict`
- `POST /predict/explain`
- `POST /llm/chat`

- `GET /history?limit=N`
- `GET /history/summary`

When ENABLE_ADMIN=true:
- `POST /admin/reload`
- `POST /admin/retrain`
- `POST /admin/promote/{job_id}`

backend/
  main.py                 FastAPI app and runtime orchestration
  builddataset.py         Canonical dataset build entrypoint
  train_models.py         Canonical training entrypoint
  prediction_store.py     User-scoped history persistence
  sqlite_store.py         SQLite-backed prediction history
  app/core/settings.py    Environment settings and path resolution
  data/
    datasets/
      latest_dataset.json
      runs/<timestamp>/
    models/
      current/
frontend/
  src/
    App.jsx
    api/client.js
    hooks/usePredictionState.js
    components/DashBoard/Dashboard.jsx
    components/HistoryPage.jsx
    pages/StatsPage.jsx
  public/
    schedules/

Recommended checks after backend or frontend changes:
python -m pytest backend/tests -q
cd frontend && npm test -- --run && npm run build

Runtime smoke checks:
curl http://127.0.0.1:8000/health
curl http://127.0.0.1:8000/status/overview -H "X-User-Id: analyst@example.com"
curl -X POST http://127.0.0.1:8000/predict ^
-H "Content-Type: application/json" ^
-d "{\"home_team\":\"KC\",\"away_team\":\"BUF\",\"season\":2025,\"week\":15}"

- Check `/status/models` for readiness blockers.
- Confirm `MODELS_DIR` points at a complete bundle.
- If the bundle was trained under a different scikit-learn version, retrain or align the runtime environment.
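
The `curl` smoke check for `/predict` uses Windows-style `^` line continuations, so a standard-library Python version may be handier on other shells. This is a minimal sketch: the function names are hypothetical, the request body matches the `curl` example, and the code assumes the backend returns a JSON body on both success and `503`.

```python
import json
import urllib.error
import urllib.request
from typing import Any, Dict, Tuple

BASE_URL = "http://127.0.0.1:8000"  # local dev backend from the steps above


def build_predict_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the same POST /predict request as the curl smoke check."""
    body = json.dumps(
        {"home_team": "KC", "away_team": "BUF", "season": 2025, "week": 15}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_smoke_check(base_url: str = BASE_URL) -> Tuple[int, Dict[str, Any]]:
    """Send the request and return (status_code, parsed JSON body)."""
    try:
        with urllib.request.urlopen(
            build_predict_request(base_url), timeout=10
        ) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        # A 503 lands here; its body is assumed to describe the blockers.
        return exc.code, json.loads(exc.read().decode("utf-8"))
```

With the backend running, `print(run_smoke_check())` shows either a prediction or the status code and body explaining why the bundle is not ready.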
- Confirm Vercel `VITE_API_BASE_URL` points at the canonical Heroku backend URL.
- In local dev, prefer `VITE_API_DEV=http://127.0.0.1:8000`.
- Older deployments may not expose `/history/summary` or queryable `/schedule`; the frontend now falls back, but a backend redeploy is still the clean fix.
- Inspect `backend/data/datasets/latest_dataset.json`.
- Override explicitly when needed:
  python backend/train_models.py --data backend/data/datasets/<your_clean_dataset>.csv
- Make sure `backend/data/Nfl_schedule_2025.csv` exists, or set `SCHEDULE_PATH`.
- The frontend also ships fallback CSVs under `frontend/public/schedules/` for compatibility with older backends.