
Pedestrian Attribute Detection Pipeline

A unified pipeline for pedestrian detection, tracking, and attribute analysis, integrating TOPICTrack, upar_hdt, and deepface-style modules. It recognizes full-body attributes (UPAR), detects faces, and estimates age and gender, with optional crowd and fan statistics.


Features

  • Multi-person tracking: TOPICTrack (YOLOX + OC-SORT) for detection and per-ID tracking
  • Body attributes: UPAR for clothing, accessories, gender, etc.
  • Face attributes: YOLOFace v2 + age/gender models
  • Batch processing: Frame folders per video with configurable batch size
  • Analysis & stats: Per-ID attribute ratios, fan detection (e.g. young female, no suitcase)
  • Visualization: Per-frame images and rendered videos with optional analysis overlay


Requirements

  • Python 3.8+
  • CUDA (for GPU); PyTorch and TensorFlow versions should match your CUDA version
  • Conda recommended

Check your environment (Python, NVIDIA driver, CUDA, cuDNN, PyTorch, TensorFlow):

python check_env.py
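
If check_env.py is unavailable, a quick manual check along these lines prints similar information. This is a minimal sketch, assuming torch and tensorflow are already installed:

import sys
import torch
import tensorflow as tf

print(f"Python:     {sys.version.split()[0]}")
print(f"PyTorch:    {torch.__version__} (CUDA available: {torch.cuda.is_available()})")
if torch.cuda.is_available():
    print(f"GPU:        {torch.cuda.get_device_name(0)}")
    print(f"CUDA:       {torch.version.cuda}, cuDNN: {torch.backends.cudnn.version()}")
print(f"TensorFlow: {tf.__version__}, GPUs: {tf.config.list_physical_devices('GPU')}")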

Installation

  1. Clone the repository and create a conda environment:
git clone <repo_url>
cd topic-upar-face-pipeline-master
conda create -n attr_analyze python=3.8 -y
conda activate attr_analyze
  2. Install dependencies and submodules:
pip install -r requirements.txt
cd TOPICTrack/external/YOLOX && python setup.py develop
cd ../fast_reid && pip install -e .
cd ../../..

This project uses fast_reid for ReID embeddings in TOPICTrack. If your clone uses deep-person-reid instead, run cd ../deep-person-reid && python setup.py develop in place of the fast_reid step. If needed, install PyTorch and TensorFlow separately to match your CUDA version.


Model Weights

Location                     | Contents
-----------------------------|------------------------------------------------------------------------------
TOPICTrack/external/weights/ | TOPICTrack / YOLOX detection and ReID weights (e.g. topictrack_mot20.tar, mot20_sbs_S50.pth)
checkpoints/                 | UPAR, YOLOFace, age, gender, and any other project-specific weights

Place the required .pth, .tar, or other weight files in these directories, where the code expects to find them.


Data Layout

Input frames must follow this structure (the parent of each sequence must be named test):

imgs/
└── demo1/                    # arbitrary name (e.g. dataset or experiment)
    └── test/
        ├── seq1/             # one folder per video/sequence
        │   ├── 000001.jpg
        │   ├── 000002.jpg
        │   └── ...
        └── seq2/
            └── ...
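
Before a long run, a quick sanity check of this layout can save time. The helper below is a hypothetical sketch (not part of the repo) that verifies the test/ directory exists and that each sequence folder contains frames:

from pathlib import Path
import sys

def validate_layout(root: str) -> bool:
    """Check that root contains test/<seq>/ folders with .jpg frames."""
    test_dir = Path(root) / "test"
    if not test_dir.is_dir():
        print(f"missing required directory: {test_dir}")
        return False
    ok = True
    for seq in sorted(p for p in test_dir.iterdir() if p.is_dir()):
        frames = sorted(seq.glob("*.jpg"))
        if not frames:
            print(f"{seq}: no .jpg frames found")
            ok = False
        else:
            print(f"{seq.name}: {len(frames)} frames ({frames[0].name} .. {frames[-1].name})")
    return ok

if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "imgs/demo1"
    sys.exit(0 if validate_layout(root) else 1)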

Quick Start

  1. Generate COCO-style annotations for tracking (optional; some flows may use pre-generated annotations):
python convert_data_to_coco.py --data_path imgs/demo1/test
  2. Run the main pipeline (tracking → crop → UPAR + face attributes → JSON per frame):
python analyze_batch.py --input_dir ./imgs/demo1 --output_dir ./batch_output --batch_size 16
  3. Optional: run analysis (per-ID ratios, fan stats):
python output_analyze.py --input_dir ./batch_output
  4. Visualize (images and/or video):
python visualize.py --json_root ./batch_output --img_root ./imgs/demo1 --vis_root ./vis_output --save_img --save_video

With analysis and fan info:

python visualize.py --json_root ./batch_output --img_root ./imgs/demo1 --vis_root ./vis_output --save_video --use_analysis --fans_info
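
The steps above can also be chained from a small driver script. The sketch below simply wraps the documented commands with subprocess and stops on the first failure:

import subprocess

steps = [
    ["python", "convert_data_to_coco.py", "--data_path", "imgs/demo1/test"],
    ["python", "analyze_batch.py", "--input_dir", "./imgs/demo1",
     "--output_dir", "./batch_output", "--batch_size", "16"],
    ["python", "output_analyze.py", "--input_dir", "./batch_output"],
    ["python", "visualize.py", "--json_root", "./batch_output",
     "--img_root", "./imgs/demo1", "--vis_root", "./vis_output",
     "--save_img", "--save_video"],
]

for cmd in steps:
    print(">>>", " ".join(cmd))
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure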

Script Reference

Script                  | Purpose
------------------------|-----------------------------------------------------------------------------
convert_data_to_coco.py | Build COCO-style annotations under data_path (parent dir must be test).
analyze_batch.py        | Full pipeline: TOPICTrack → UPAR + face_attr → one JSON per frame per video.
output_analyze.py       | Aggregate per-ID attributes and fan stats; writes analysis/ under each video dir.
visualize.py            | Draw bboxes and attributes; optionally use analysis and fan stats.
video_to_frames.py      | Extract frames from videos into input_dir/test/<video_name>/.
check_env.py            | Print Python, CUDA, PyTorch, TensorFlow, and GPU info.

convert_data_to_coco.py

Argument    | Default         | Description
------------|-----------------|-------------------------------------------------------------
--data_path | imgs/demo1/test | Path to sequences; parent of this path should be named test.
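
For reference, COCO-style detection annotations generally follow the layout below. This is an illustration of the standard format, not this script's verified output; the exact fields emitted may differ:

# Generic COCO-style detection annotations (illustrative only).
coco = {
    "images": [
        {"id": 1, "file_name": "seq1/000001.jpg", "width": 1920, "height": 1080},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [412, 108, 96, 245],  # [x, y, width, height]
         "area": 96 * 245, "iscrowd": 0},
    ],
    "categories": [
        {"id": 1, "name": "pedestrian"},
    ],
}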

analyze_batch.py

Argument     | Default        | Description
-------------|----------------|----------------------------------------------------------------------------
--input_dir  | ./imgs/demo1   | Root directory containing test/<seq>/ frame folders.
--output_dir | ./batch_output | Output root; one subfolder per video with XXXXXX.json per frame.
--video_name | None           | Process only this video (folder name under test/); if omitted, process all.
--batch_size | 16             | Batch size for attribute models; tune for your GPU.

output_analyze.py

Argument    | Default      | Description
------------|--------------|------------------------------------------------------------------------
--input_dir | batch_output | Root directory containing one subfolder per video (with frame JSONs).

Creates an analysis/ directory under each video folder, containing e.g. analyze_result.json, bboxs.txt, and fans_stats.json.

visualize.py

Argument       | Default      | Description
---------------|--------------|------------------------------------------------------------
--json_root    | batch_output | Root of per-video JSON outputs.
--img_root     | imgs/demo1   | Root of input images (contains test/<seq>/).
--vis_root     | vis_output   | Where to write visualization images/videos.
--save_img     | flag         | Save per-frame images.
--save_video   | flag         | Render output video(s).
--no_save      | flag         | Debug only; do not write images or video.
--video_fps    | 25           | Output video FPS.
--video_codec  | mp4v         | Video codec.
--video_name   | None         | Visualize only this video.
--use_analysis | flag         | Use analysis/analyze_result.json (and related) for drawing.
--fans_info    | flag         | Use analysis/fans_stats.json to show fan-related info.

video_to_frames.py

python video_to_frames.py <input_dir>

Extracts frames from every video under input_dir (recursive) into input_dir/test/<video_stem>/ (e.g. 000001.jpg, 000002.jpg, ...). Use this before running the pipeline if your source is video files.
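
Internally this amounts to a decode-and-save loop. Below is a minimal OpenCV sketch of the same idea; the real script may differ in supported extensions and options:

from pathlib import Path
import sys
import cv2

def extract_frames(video_path: Path, out_root: Path) -> int:
    """Decode one video into out_root/test/<video_stem>/000001.jpg, 000002.jpg, ..."""
    out_dir = out_root / "test" / video_path.stem
    out_dir.mkdir(parents=True, exist_ok=True)
    cap = cv2.VideoCapture(str(video_path))
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of stream
            break
        idx += 1
        cv2.imwrite(str(out_dir / f"{idx:06d}.jpg"), frame)
    cap.release()
    return idx

if __name__ == "__main__":
    root = Path(sys.argv[1])
    for video in sorted(root.rglob("*.mp4")):  # recursive, as the script is
        print(video, "->", extract_frames(video, root), "frames")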


Project Structure

topic-upar-face-pipeline-master/
├── README.md
├── requirements.txt
├── check_env.py
├── convert_data_to_coco.py
├── analyze_batch.py          # main pipeline
├── output_analyze.py         # analysis & fan stats
├── visualize.py
├── video_to_frames.py
├── frame_loader.py
├── face_attr/                # face detection + age/gender
│   ├── deepface.py
│   ├── preprocessing.py
│   └── yolofacev2.py
├── models/                   # YOLOFace, Age, Gender clients
│   ├── yolo.py, Age.py, Gender.py, ...
│   └── backbone/
├── upar/                     # body attribute inference
│   └── infer.py
├── TOPICTrack/               # detection + tracking
│   ├── TOPICTrack.py
│   ├── external/
│   │   ├── YOLOX/
│   │   ├── fast_reid/ or deep-person-reid/
│   │   └── weights/
│   └── trackers/
├── checkpoints/              # place UPAR, face, age, gender weights here
└── utils/

Pipeline Overview

  1. TOPICTrack
    Reads frames from input_dir/test/<video>/, runs detection and tracking, and outputs per-frame tracks (crop coordinates and track IDs).

  2. Attribute modules
    Each tracked person crop is passed to:

    • UPAR (upar/): body attributes (clothing, accessories, etc.).
    • face_attr (face_attr/ + models/): face detection (YOLOFace v2), then age and gender.
  3. Output
    Results are merged and written as one JSON per frame under output_dir/<video>/, ready for output_analyze.py and visualize.py.
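
For orientation, a per-frame record has roughly the shape sketched below. The field names are illustrative assumptions, not the actual schema; inspect a generated JSON for the real keys:

# Hypothetical per-frame record -- field names are illustrative, not the actual schema.
example_record = {
    "frame": "000001.jpg",
    "persons": [
        {
            "track_id": 3,                    # persistent ID from TOPICTrack
            "bbox": [412, 108, 96, 245],      # crop coordinates
            "upar": {"gender": "female", "backpack": True},   # body attributes
            "face": {"detected": True, "age": 24, "gender": "female"},
        },
    ],
}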


Analysis & Visualization

  • output_analyze.py
    For each video directory under --input_dir, it builds per-ID attribute ratios (e.g. ratio = frames where a label is detected / frames where that ID appears; see the sketch at the end of this section). It writes:

    • analysis/analyze_result.json: per-ID attributes and ratios.
    • analysis/bboxs.txt: bbox per ID/frame.
    • analysis/fans_stats.json: fan definition (e.g. young female without suitcase) and counts.

    You can change ratio thresholds and fan criteria inside analyze_output() in output_analyze.py.

  • visualize.py
    Renders bounding boxes and attributes on frames. Use --use_analysis to show aggregated analysis, and --fans_info to overlay fan statistics from fans_stats.json.
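
A minimal sketch of the per-ID ratio computation described above, reusing the hypothetical record shape from the Pipeline Overview; the real logic lives in analyze_output() and works on the actual schema:

import json
from collections import defaultdict
from pathlib import Path

def attribute_ratios(video_dir: str) -> dict:
    """ratio = frames where a label is detected / frames where that ID appears"""
    seen = defaultdict(int)                        # track_id -> frames the ID appears in
    hits = defaultdict(lambda: defaultdict(int))   # track_id -> label -> detection count
    for frame_json in sorted(Path(video_dir).glob("*.json")):
        record = json.loads(frame_json.read_text())
        for person in record.get("persons", []):
            tid = person["track_id"]
            seen[tid] += 1
            for label, value in person.get("upar", {}).items():
                if value:
                    hits[tid][label] += 1
    return {tid: {label: n / seen[tid] for label, n in labels.items()}
            for tid, labels in hits.items()}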


Video Input

If you start from video files instead of image folders:

  1. Put videos in a directory (e.g. my_videos/).

  2. Run:

    python video_to_frames.py my_videos
  3. Use my_videos as the frame root: e.g. --input_dir my_videos for analyze_batch.py and --img_root my_videos for visualize.py, with frames under my_videos/test/<video_name>/.


License

See the repository for license information of this project and its submodules (TOPICTrack, UPAR, etc.).
