Broadcast NBA frame/video pipeline for:
- player + ball detection
- ball-handler inference
- painted-area segmentation
- half-court homography fitting
- mapping the ball handler into top-down court coordinates
The project currently works best as an inference pipeline over single frames, folders of frames, or video.
There is also a lightweight React frontend for uploading videos, polling jobs, and fetching results.
Main pieces:
- detect.py: runs the player/ball detector
- possession.py: picks the likely ball handler
- paint_homography.py: fits a homography from the painted area
- run_frame.py: processes one frame
- run_frames.py: processes a folder of frames
- run_video.py: processes a video and writes an overlayed output video
- segment_paint_yolo.py: standalone paint-seg + homography testing utility
This repo expects two trained models at inference time:
- a player/ball detector
- a paint segmentation model
The weight files are local artifacts and are not tracked in git, so you should pass your own paths when running inference.
The court coordinates are currently hoop-centered and based on the visible half-court paint.
The paint corners map to:
- left baseline lane corner: (-8.0, -5.25)
- right baseline lane corner: (8.0, -5.25)
- left free-throw lane corner: (-8.0, 13.75)
- right free-throw lane corner: (8.0, 13.75)
player_foot_court_xy is the detected ball handler's image-space foot point projected into that court coordinate system.
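To illustrate how such a projection can work, the sketch below fits a homography from four image-space paint corners to the court coordinates listed above and pushes a foot point through it. The pixel coordinates, the function names (`solve_linear`, `fit_homography`, `project`), and the pure-Python solver are assumptions made for illustration; paint_homography.py may do this differently, for example with OpenCV.

```python
# Court-space targets for the four paint corners, in the order listed above.
COURT_CORNERS = [(-8.0, -5.25), (8.0, -5.25), (-8.0, 13.75), (8.0, 13.75)]


def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_homography(image_pts, court_pts):
    """Return a 3x3 homography (h33 fixed to 1) mapping image_pts to court_pts."""
    a, b = [], []
    for (x, y), (u, v) in zip(image_pts, court_pts):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), same form for v.
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(h, point):
    """Apply homography h to an image-space (x, y) point."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


# Hypothetical pixel positions of the four paint corners (not real data).
image_corners = [(200.0, 300.0), (260.0, 520.0), (620.0, 280.0), (700.0, 480.0)]
H = fit_homography(image_corners, COURT_CORNERS)
foot_court_xy = project(H, (450.0, 400.0))  # hypothetical handler foot point
```

Four point pairs determine the homography exactly, which is why a clean paint quadrilateral is enough to place the handler on the half court.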
Requirements are minimal right now:
pip install -r requirements.txt
Current dependencies:
- pandas
- opencv-python
- ultralytics
- fastapi
- uvicorn
- python-multipart
The repo now includes a FastAPI service for the UI and the local job-based video inference flow.
Run it locally with:
uvicorn api.main:app --host 0.0.0.0 --port 8080 --reload
Useful environment variables:
- PLAYER_MODEL_PATH
- PAINT_MODEL_PATH
- JOB_TTL_SECONDS
- ALLOWED_ORIGINS
The API currently supports:
- POST /jobs: upload a video plus inference options and queue a background job.
- GET /jobs/{job_id}: read job status.
- GET /jobs/{job_id}/results: fetch inference results in either FULL or POSSESSION_ONLY mode.
- GET /jobs/{job_id}/overlay-video: download the overlay video when save_overlays was requested.
Notes:
- uploaded videos and outputs are stored under jobs/<job_id>/...
- jobs are temporary and cleaned up after a TTL
- the UI currently talks to this API on http://localhost:8080
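A client for this job flow would typically poll the status endpoint until the job finishes. The sketch below shows one way to write that loop. The response schema is not documented here, so the `status` field and the `completed` / `failed` values are assumptions, and `fetch_job_status` and `wait_for_job` are hypothetical helper names.

```python
import json
import time
import urllib.request

API_BASE = "http://localhost:8080"  # address the UI currently uses


def fetch_job_status(job_id, base_url=API_BASE):
    """GET /jobs/{job_id} and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/jobs/{job_id}") as resp:
        return json.load(resp)


def wait_for_job(job_id, fetch=fetch_job_status, poll_seconds=2.0,
                 timeout_seconds=600.0, sleep=time.sleep):
    """Poll until the job reports a terminal state or the timeout expires.

    ASSUMPTION: the payload has a "status" field whose terminal values are
    "completed" and "failed"; adjust both to the real schema.
    """
    waited = 0.0
    while True:
        payload = fetch(job_id)
        if payload.get("status") in ("completed", "failed"):
            return payload
        if waited >= timeout_seconds:
            raise TimeoutError(f"job {job_id} still running after {timeout_seconds}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

Passing `fetch` and `sleep` as parameters keeps the loop testable without a running server. Because jobs are cleaned up after a TTL, a long-lived client should also expect the status call to start failing once the job has expired.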
Process a single frame:
.\.venv\Scripts\python.exe inference\run_frame.py
--image ".\notebooks\test-frames\game5_0109.jpg"
--model "<path-to-player-ball-model.pt>"
--paint-model "<path-to-paint-seg-model.pt>"
--paint-basket-side left
--out_image ".\outputs\game5_0109.jpg"
Important output field:
"possession": {
  "player_foot_court_xy": [-5.38, -1.29]
}
Process a folder of frames:
.\.venv\Scripts\python.exe inference\run_frames.py
--source ".\notebooks\test-frames"
--model "<path-to-player-ball-model.pt>"
--paint-model "<path-to-paint-seg-model.pt>"
--paint-basket-side left
--output ".\runs\run-frames"
--save-overlays
Outputs:
- runs/run-frames/json/*.json
- runs/run-frames/results.json
- runs/run-frames/overlays/*.jpg if --save-overlays is set
Process a video:
.\.venv\Scripts\python.exe inference\run_video.py
--video ".\runs\harden_to_allen.mp4"
--model "<path-to-player-ball-model.pt>"
--paint-model "<path-to-paint-seg-model.pt>"
--paint-basket-side left
--output ".\runs\run-video"
Outputs:
- runs/run-video/results.json
- runs/run-video/json/frame_000000.json, etc.
- runs/run-video/harden_to_allen_overlay.mp4
--frame-step is supported and preserves video timing in the overlay video by writing skipped frames unchanged.
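The timing behavior can be pictured as a per-frame plan: every source frame is still written, but only some are sent through inference. The sketch below assumes the processed frames are those whose index is a multiple of the step; run_video.py may choose them differently, and `plan_frames` is a hypothetical name.

```python
def plan_frames(total_frames, frame_step):
    """Yield (frame_index, run_inference) for every frame of the clip.

    Every frame is emitted, so the output keeps the source frame count and
    timing; only every frame_step-th frame is marked for inference.
    """
    if frame_step < 1:
        raise ValueError("frame_step must be >= 1")
    for index in range(total_frames):
        yield index, index % frame_step == 0


plan = list(plan_frames(5, 2))
# plan == [(0, True), (1, False), (2, True), (3, False), (4, True)]
```

Frames marked `False` would be copied to the overlay video unchanged, which is what keeps the output the same length as the input.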
Tracked example assets:
- before video: examples/harden_to_allen/harden_to_allen_before.mp4
- after video: examples/harden_to_allen/harden_to_allen_after.mp4
The counts below come from the most recent local run of harden_to_allen.mp4:
- frames_seen: 142
- frames_processed: 142
- ball_detected: 103
- possession_found: 86
- court_xy_found: 86
- paint_homography_available: 142
Meaning:
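- paint detection/homography is stable across the full clip (available on 142 of 142 frames)
- possession is still the limiting stage (found on 86 of 142 frames, versus 103 frames with a detected ball)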
- court coordinates are only produced when possession returns a specific handler
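The counts above can be turned into per-stage rates to make the bottleneck explicit. This is plain arithmetic on the reported numbers. The only assumption is the reading of possession-given-ball as a conditional rate, which holds only if possession is never assigned on a frame without a detected ball.

```python
counts = {
    "frames_seen": 142,
    "frames_processed": 142,
    "ball_detected": 103,
    "possession_found": 86,
    "court_xy_found": 86,
    "paint_homography_available": 142,
}


def stage_rate(stage, base):
    """Fraction of `base` frames on which `stage` succeeded."""
    return counts[stage] / counts[base]


print(f"ball detected:       {stage_rate('ball_detected', 'frames_processed'):.1%}")
print(f"possession found:    {stage_rate('possession_found', 'frames_processed'):.1%}")
print(f"possession | ball:   {stage_rate('possession_found', 'ball_detected'):.1%}")
print(f"court xy | possess.: {stage_rate('court_xy_found', 'possession_found'):.1%}")
```

On these counts the ball is detected on about 72.5% of frames and possession is found on about 60.6%, while every frame with a handler also gets court coordinates.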
For each processed frame:
- Detect players and ball.
- Infer likely possession using ball-to-player proximity.
- Segment the painted area.
- Fit a paint homography from the predicted paint mask.
- Estimate the handler foot as the bottom-center of the handler bbox.
- Project that point into court coordinates.
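The possession and foot-point steps above can be sketched as follows. The distance metric (ball center to the nearest edge of each player bbox), the `max_distance` cutoff, and all function names are assumptions for illustration; possession.py may use different rules. Boxes are `(x1, y1, x2, y2)` in pixels.

```python
import math


def bbox_center(bbox):
    x1, y1, x2, y2 = bbox
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def distance_to_bbox(point, bbox):
    """Distance from a point to a bbox; 0 when the point lies inside it."""
    px, py = point
    x1, y1, x2, y2 = bbox
    dx = max(x1 - px, 0.0, px - x2)
    dy = max(y1 - py, 0.0, py - y2)
    return math.hypot(dx, dy)


def pick_handler(player_bboxes, ball_bbox, max_distance):
    """Return the index of the player nearest the ball, or None if ambiguous."""
    if ball_bbox is None or not player_bboxes:
        return None
    ball_xy = bbox_center(ball_bbox)
    distances = [distance_to_bbox(ball_xy, box) for box in player_bboxes]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best if distances[best] <= max_distance else None


def foot_point(bbox):
    """Approximate the foot as the bottom-center of the bbox."""
    x1, _, x2, y2 = bbox
    return ((x1 + x2) / 2, y2)


# Hypothetical detections for one frame.
players = [(100, 100, 160, 300), (400, 120, 460, 320)]
ball = (165, 200, 185, 220)
handler = pick_handler(players, ball, max_distance=40.0)
foot_xy = foot_point(players[handler]) if handler is not None else None
```

Returning `None` when no player is close enough mirrors the behavior described here, where court coordinates are only produced once a specific handler is chosen.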
Known limitations:
- The player foot point is approximated as bbox bottom-center.
- If a player is truncated at the bottom of the frame, the inferred foot location can be wrong.
- Possession currently fails on some frames where the ball is in the air or the handler assignment is ambiguous.
- Paint homography is intentionally rejected when the paint is too close to the image border.
- Court coordinates are only as good as all upstream steps: detection, possession, paint segmentation, and homography.
The biggest current quality issue is player truncation near the bottom/bench-side edge, because that shifts the perceived foot position.
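One cheap way to surface this case is to flag handler boxes whose bottom edge sits on the frame boundary, so downstream consumers can treat the projected foot point as unreliable. This is a sketch of a possible check, not something the pipeline is stated to do; the helper name and the 2-pixel margin are assumptions.

```python
def is_bottom_truncated(bbox, image_height, margin_px=2):
    """True when the bbox bottom edge lies within margin_px of the frame bottom.

    bbox is (x1, y1, x2, y2) in pixels. A truncated box means the real feet
    are probably below the visible frame, so bottom-center is not the foot.
    """
    _, _, _, y2 = bbox
    return y2 >= image_height - margin_px


# Hypothetical boxes in a 720-pixel-tall frame.
cut_off = is_bottom_truncated((10, 400, 60, 719), image_height=720)
fully_visible = is_bottom_truncated((10, 100, 60, 300), image_height=720)
```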
Next steps:
- improve player detection labels and retrain
- improve handler foot-point quality for truncated players
- add jersey-number reading and player identity matching from a known on-court player list
- eventually aggregate identity across multiple frames instead of single-frame guesses
This repo is still in active experimentation mode. The README reflects the current working path.

