odera143/ShotVision

NBA AI

Broadcast NBA frame/video pipeline for:

  • player + ball detection
  • ball-handler inference
  • painted-area segmentation
  • half-court homography fitting
  • mapping the ball handler into top-down court coordinates

Current State

The project currently works best as an inference pipeline over single frames, folders of frames, or video.

There is also a lightweight React frontend for uploading videos, polling jobs, and fetching results.

Main pieces:

Models

This repo expects two trained models at inference time:

  • a player/ball detector
  • a paint segmentation model

The weight files are local artifacts and are not tracked in git, so you should pass your own paths when running inference.

Coordinate System

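Court coordinates are currently hoop-centered and derived from the visible half-court paint. The corner values are consistent with units of feet: a lane 16 ft wide and 19 ft long, with the baseline 5.25 ft behind the hoop center.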

The paint corners map to:

  • left baseline lane corner: (-8.0, -5.25)
  • right baseline lane corner: (8.0, -5.25)
  • left free-throw lane corner: (-8.0, 13.75)
  • right free-throw lane corner: (8.0, 13.75)

player_foot_court_xy is the detected ball handler's image-space foot point projected into that court coordinate system.
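To make the projection concrete, here is a minimal, dependency-free sketch of fitting a homography from four paint corners and projecting a foot point through it. It is not the repo's implementation: `fit_homography`, `project`, and the pixel coordinates in `image_corners` are hypothetical, and the corner ordering is an assumption chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

# Court-space paint corners from the list above (hoop-centered).
COURT_CORNERS: List[Point] = [
    (-8.0, -5.25),  # left baseline lane corner
    (8.0, -5.25),   # right baseline lane corner
    (8.0, 13.75),   # right free-throw lane corner
    (-8.0, 13.75),  # left free-throw lane corner
]


def fit_homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Solve the 3x3 homography (h33 fixed to 1) mapping four src points to dst."""
    aug = []  # augmented 8x9 linear system in the eight unknown entries
    for (x, y), (u, v) in zip(src, dst):
        aug.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y, u])
        aug.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y, v])
    n = 8
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    h = [aug[i][n] / aug[i][i] for i in range(n)] + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def project(H: List[List[float]], point: Point) -> Point:
    """Apply the homography to an image-space point and dehomogenize."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (
        (H[0][0] * x + H[0][1] * y + H[0][2]) / w,
        (H[1][0] * x + H[1][1] * y + H[1][2]) / w,
    )


# Hypothetical pixel positions of the four paint corners, in the same order
# as COURT_CORNERS. Real values would come from the paint segmentation mask.
image_corners = [(400.0, 600.0), (800.0, 620.0), (700.0, 400.0), (350.0, 390.0)]
H = fit_homography(image_corners, COURT_CORNERS)
foot_court_xy = project(H, (560.0, 510.0))  # hypothetical image-space foot point
```

With OpenCV already in the dependencies, `cv2.getPerspectiveTransform` and `cv2.perspectiveTransform` cover the same two steps; the pure-Python version is shown only so the arithmetic is visible.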

Install

Requirements are minimal right now:

pip install -r requirements.txt

Current dependencies:

  • pandas
  • opencv-python
  • ultralytics
  • fastapi
  • uvicorn
  • python-multipart

API

The repo includes a FastAPI service that backs the UI and the local, job-based video inference flow.

Run it locally with:

uvicorn api.main:app --host 0.0.0.0 --port 8080 --reload

Useful environment variables:

  • PLAYER_MODEL_PATH
  • PAINT_MODEL_PATH
  • JOB_TTL_SECONDS
  • ALLOWED_ORIGINS
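As a rough illustration of how these variables might be consumed, the sketch below reads them into a small settings object. The `ApiSettings` class, the `load_settings` helper, the 3600-second default, and the comma-separated format for `ALLOWED_ORIGINS` are all assumptions, not the service's documented behavior.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass
class ApiSettings:
    player_model_path: Optional[str]
    paint_model_path: Optional[str]
    job_ttl_seconds: int
    allowed_origins: List[str]


def load_settings(env: Mapping[str, str] = os.environ) -> ApiSettings:
    """Read the four variables; unset model paths stay None."""
    origins = env.get("ALLOWED_ORIGINS", "")
    return ApiSettings(
        player_model_path=env.get("PLAYER_MODEL_PATH"),
        paint_model_path=env.get("PAINT_MODEL_PATH"),
        # Assumed default of one hour; the real default is not documented here.
        job_ttl_seconds=int(env.get("JOB_TTL_SECONDS", "3600")),
        # Assumed format: comma-separated list of origins.
        allowed_origins=[o.strip() for o in origins.split(",") if o.strip()],
    )
```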

The API currently supports:

  • POST /jobs: Upload a video plus inference options and queue a background job.
  • GET /jobs/{job_id}: Read job status.
  • GET /jobs/{job_id}/results: Fetch inference results in either FULL or POSSESSION_ONLY mode.
  • GET /jobs/{job_id}/overlay-video: Download the overlay video when save_overlays was requested.

Notes:

  • uploaded videos and outputs are stored under jobs/<job_id>/...
  • jobs are temporary and cleaned up after a TTL
  • the UI currently talks to this API on http://localhost:8080
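The upload-then-poll flow can be sketched on the client side as follows. The endpoint paths come from the list above; everything else is an assumption: the `status` field name, the `completed`/`failed` values, and the `fetch_status` callable (which would wrap whatever HTTP client you use and return the decoded JSON body).

```python
import time
from typing import Any, Callable, Dict

BASE_URL = "http://localhost:8080"
# Hypothetical terminal states; check the actual job-status payload.
TERMINAL_STATES = {"completed", "failed"}


def job_urls(job_id: str, base_url: str = BASE_URL) -> Dict[str, str]:
    """Build the per-job endpoint URLs listed above."""
    root = f"{base_url}/jobs/{job_id}"
    return {
        "status": root,
        "results": f"{root}/results",
        "overlay_video": f"{root}/overlay-video",
    }


def wait_for_job(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    poll_seconds: float = 2.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll GET /jobs/{job_id} until the job reaches a terminal state."""
    deadline = time.monotonic() + timeout_seconds
    status_url = job_urls(job_id)["status"]
    while True:
        payload = fetch_status(status_url)
        if payload.get("status") in TERMINAL_STATES:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish in time")
        sleep(poll_seconds)
```

Because jobs are cleaned up after a TTL, a client should fetch results soon after the job finishes rather than holding on to the job id.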

Run Inference

Single Frame

.\.venv\Scripts\python.exe inference\run_frame.py `
  --image ".\notebooks\test-frames\game5_0109.jpg" `
  --model "<path-to-player-ball-model.pt>" `
  --paint-model "<path-to-paint-seg-model.pt>" `
  --paint-basket-side left `
  --out_image ".\outputs\game5_0109.jpg"

Important output field:

"possession": {
  "player_foot_court_xy": [-5.38, -1.29]
}

Folder Of Frames

.\.venv\Scripts\python.exe inference\run_frames.py `
  --source ".\notebooks\test-frames" `
  --model "<path-to-player-ball-model.pt>" `
  --paint-model "<path-to-paint-seg-model.pt>" `
  --paint-basket-side left `
  --output ".\runs\run-frames" `
  --save-overlays

Outputs:

  • runs/run-frames/json/*.json
  • runs/run-frames/results.json
  • runs/run-frames/overlays/*.jpg if --save-overlays is set

Video

.\.venv\Scripts\python.exe inference\run_video.py `
  --video ".\runs\harden_to_allen.mp4" `
  --model "<path-to-player-ball-model.pt>" `
  --paint-model "<path-to-paint-seg-model.pt>" `
  --paint-basket-side left `
  --output ".\runs\run-video"

Outputs:

  • runs/run-video/results.json
  • runs/run-video/json/frame_000000.json, etc.
  • runs/run-video/harden_to_allen_overlay.mp4

--frame-step is supported and preserves video timing in the overlay video by writing skipped frames unchanged.
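The idea behind writing skipped frames unchanged can be sketched as a generator that yields exactly one output frame per input frame, so the overlay video keeps the source frame count and duration. The `overlay_frames` name, the `process` callable, and the choice to process frames whose index is a multiple of the step are assumptions for illustration.

```python
from typing import Callable, Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def overlay_frames(
    frames: Iterable[Frame],
    process: Callable[[Frame], Frame],
    frame_step: int = 1,
) -> Iterator[Frame]:
    """Yield one output frame per input frame.

    Every frame_step-th frame is passed through `process` (for example,
    inference plus overlay drawing); all other frames are yielded unchanged.
    """
    if frame_step < 1:
        raise ValueError("frame_step must be >= 1")
    for index, frame in enumerate(frames):
        yield process(frame) if index % frame_step == 0 else frame
```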

Example: harden_to_allen.mp4

Tracked example assets: a before clip and an after clip.

The counts below come from the most recent local run of harden_to_allen.mp4:

  • frames_seen: 142
  • frames_processed: 142
  • ball_detected: 103
  • possession_found: 86
  • court_xy_found: 86
  • paint_homography_available: 142

Meaning:

  • paint detection/homography is stable across the full clip
  • possession is still the limiting stage
  • court coordinates are only produced when possession returns a specific handler
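A summary like the one above can be recomputed from per-frame records. The sketch below assumes each record is a dict whose `possession` entry, when present, carries `player_foot_court_xy` as in the single-frame output; the `summarize` helper and the exact record layout are assumptions, not the repo's schema.

```python
from typing import Any, Dict, List


def summarize(frames: List[Dict[str, Any]]) -> Dict[str, float]:
    """Count processed frames and frames that produced a court coordinate."""
    processed = len(frames)
    court_xy_found = sum(
        1
        for frame in frames
        if (frame.get("possession") or {}).get("player_foot_court_xy") is not None
    )
    rate = court_xy_found / processed if processed else 0.0
    return {
        "frames_processed": processed,
        "court_xy_found": court_xy_found,
        "court_xy_rate": rate,
    }


# Tiny hand-written example, not real pipeline output.
example = [
    {"possession": {"player_foot_court_xy": [-5.38, -1.29]}},
    {"possession": None},
    {},
]
summary = summarize(example)
```

For the run quoted above, 86 of 142 frames yield a court coordinate (about 61%), while the ball is detected in 103 of 142 (about 73%).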

How The Pipeline Works

For each processed frame:

  1. Detect players and ball.
  2. Infer likely possession using ball-to-player proximity.
  3. Segment the painted area.
  4. Fit a paint homography from the predicted paint mask.
  5. Estimate the handler foot as the bottom-center of the handler bbox.
  6. Project that point into court coordinates.
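Steps 2 and 5 can be sketched in a few lines. This is one plausible reading of "ball-to-player proximity", not the repo's actual rule: the distance metric (ball center to nearest point on each player box), the `max_distance` threshold, and the function names are assumptions.

```python
import math
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Point = Tuple[float, float]


def box_center(box: Box) -> Point:
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def point_to_box_distance(point: Point, box: Box) -> float:
    """Distance from a point to the nearest point of a box (0 if inside)."""
    px, py = point
    x1, y1, x2, y2 = box
    dx = max(x1 - px, 0.0, px - x2)
    dy = max(y1 - py, 0.0, py - y2)
    return math.hypot(dx, dy)


def infer_handler(
    ball_box: Box, player_boxes: List[Box], max_distance: float = 40.0
) -> Optional[int]:
    """Return the index of the closest player, or None if none is close enough."""
    if not player_boxes:
        return None
    ball = box_center(ball_box)
    distances = [point_to_box_distance(ball, box) for box in player_boxes]
    best = min(range(len(player_boxes)), key=distances.__getitem__)
    return best if distances[best] <= max_distance else None


def foot_point(box: Box) -> Point:
    """Bottom-center of the bbox, used as the image-space foot location."""
    x1, _, x2, y2 = box
    return ((x1 + x2) / 2.0, y2)
```

Returning `None` when no player is within the threshold matches the behavior described elsewhere in this README: frames without a specific handler produce no court coordinates.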

Current Limitations

  • The player foot point is approximated as bbox bottom-center.
  • If a player is truncated at the bottom of the frame, the inferred foot location can be wrong.
  • Possession currently fails on some frames where the ball is in the air or the handler assignment is ambiguous.
  • Paint homography is intentionally rejected when the paint is too close to the image border.
  • Court coordinates are only as good as all upstream steps: detection, possession, paint segmentation, and homography.

The biggest current quality issue is player truncation near the bottom/bench-side edge, because that shifts the perceived foot position.
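Two of these limitations lend themselves to simple guard checks. The sketch below flags a bbox whose bottom edge sits at the image border (a likely truncated player) and a paint polygon that comes too close to any border. Both helper names and both margin values are hypothetical; the repo's actual rejection thresholds are not documented here.

```python
from typing import Iterable, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Point = Tuple[float, float]


def is_bottom_truncated(box: Box, image_height: int, margin_px: float = 2.0) -> bool:
    """True when the bbox bottom touches the bottom of the frame, so the
    bottom-center point is unlikely to be the real foot location."""
    return box[3] >= image_height - margin_px


def polygon_near_border(
    points: Iterable[Point],
    image_width: int,
    image_height: int,
    margin_px: float = 10.0,
) -> bool:
    """True when any polygon vertex lies within margin_px of an image edge."""
    return any(
        x < margin_px
        or y < margin_px
        or x > image_width - margin_px
        or y > image_height - margin_px
        for x, y in points
    )
```

A truncation flag like this could be stored alongside `player_foot_court_xy` so downstream consumers can discount low-confidence positions.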

Near-Term Roadmap

  • improve player detection labels and retrain
  • improve handler foot-point quality for truncated players
  • add jersey-number reading and player identity matching from a known on-court player list
  • eventually aggregate identity across multiple frames instead of single-frame guesses

Repo Notes

This repo is still in active experimentation mode. The README reflects the current working path.

About

Computer vision pipeline that turns basketball broadcast footage into real-time possession and shot probability analytics.
