This project is a comprehensive solution for the "Object Detection for Group-housed Swine" assignment in the NTU CVPDL course. It implements a high-performance training and inference pipeline using a Faster R-CNN model with a ConvNeXt backbone, incorporating modern deep learning best practices for achieving high mAP scores.
The entire workflow is managed by a central runner script, main.py, which automates data pre-processing, training, and prediction generation.
- Powerful Model Architecture: Utilizes a Faster R-CNN detector with a pre-trained ConvNeXt backbone and a Feature Pyramid Network (FPN) for robust feature extraction (see the model sketch after this list).
- Automated "One-Key" Workflow: The `main.py` script serves as a single entry point to run the entire pipeline, from optional data clustering to training and final submission file generation.
- Advanced Training Strategies (combined in the training sketch after this list):
  - Mixed-Precision Training (AMP): Leverages `torch.amp` to accelerate training and reduce GPU memory usage.
  - Gradient Accumulation: Enables training with large effective batch sizes even on memory-constrained GPUs.
  - Advanced LR Scheduling: Implements a warm-up phase followed by a cosine annealing schedule to stabilize training and improve convergence.
- Smart Data Handling:
  - Automatic Data Clustering: The `main.py` script can automatically run a pre-processing script (`scripts/build_clusters.py`) to cluster images by scene type (e.g., daytime, infrared).
  - Stratified Splitting: Uses the generated clusters to perform a stratified train/validation split, ensuring the validation set is representative (see the split sketch after this list).
  - Cluster-Aware Augmentations: Applies different, optimized data augmentation pipelines based on the image cluster, improving model robustness.
- Flexible Post-processing: Supports standard NMS and advanced Soft-NMS during inference, configurable via command-line arguments.
- Reproducibility: Enforces strict seeding across all components (PyTorch, NumPy, DataLoaders) to ensure experiments are fully reproducible.
- Advanced Visualization Toolkit: Includes a visualize.py script to inspect model predictions, featuring modes to view images with the most/least detections, random samples, or specific IDs for in-depth error analysis.
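For reference, a detector of this shape can be assembled almost entirely from torchvision building blocks. The sketch below is a minimal illustration, not the project's actual `src/model.py`: the ConvNeXt-Tiny stage indices, channel widths, anchor sizes, and `num_classes=2` (pig + background) are assumptions made for the example.

```python
from torchvision.models import convnext_tiny, ConvNeXt_Tiny_Weights
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.anchor_utils import AnchorGenerator
from torchvision.models.detection.backbone_utils import BackboneWithFPN

# Pre-trained ConvNeXt-Tiny; `.features` is a Sequential whose
# odd-indexed children are the four stages (96/192/384/768 channels).
convnext = convnext_tiny(weights=ConvNeXt_Tiny_Weights.DEFAULT).features

# Tap the four stage outputs and fuse them with an FPN.
backbone = BackboneWithFPN(
    convnext,
    return_layers={"1": "0", "3": "1", "5": "2", "7": "3"},
    in_channels_list=[96, 192, 384, 768],
    out_channels=256,
)

# One anchor size per FPN level (four stages + the extra max-pool level).
anchor_generator = AnchorGenerator(
    sizes=((32,), (64,), (128,), (256,), (512,)),
    aspect_ratios=((0.5, 1.0, 2.0),) * 5,
)

# num_classes=2: one foreground class (pig) plus background.
model = FasterRCNN(backbone, num_classes=2, rpn_anchor_generator=anchor_generator)
```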
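The three training strategies combine naturally in a single loop. The following is a simplified sketch with function and argument names of our own choosing, not the project's actual `src/engine.py`:

```python
import math
import torch

def warmup_cosine(step: int, warmup_steps: int, total_steps: int) -> float:
    # LR multiplier: linear warm-up, then cosine decay toward zero.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

def train_one_epoch(model, optimizer, scheduler, loader, device,
                    scaler: torch.amp.GradScaler, accumulation_steps: int = 8):
    # `loader` is assumed to yield (list[Tensor], list[dict]) batches,
    # as produced by a typical detection collate_fn.
    model.train()
    optimizer.zero_grad(set_to_none=True)
    for step, (images, targets) in enumerate(loader):
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        # Mixed-precision forward pass; torchvision detection models
        # return a dict of losses in training mode.
        with torch.amp.autocast(device_type="cuda"):
            loss = sum(model(images, targets).values()) / accumulation_steps
        scaler.scale(loss).backward()
        # Step only every `accumulation_steps` batches, giving an
        # effective batch size of batch_size * accumulation_steps.
        if (step + 1) % accumulation_steps == 0:
            scaler.step(optimizer)
            scaler.update()
            optimizer.zero_grad(set_to_none=True)
            scheduler.step()
```

A matching scheduler can be built with `torch.optim.lr_scheduler.LambdaLR(optimizer, lambda s: warmup_cosine(s, 500, 20_000))` and the scaler with `torch.amp.GradScaler("cuda")`; the step counts here are purely illustrative.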
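Once `image_clusters.csv` exists, the stratified split itself reduces to a single scikit-learn call. A sketch, assuming hypothetical column names `image_id` and `cluster`:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

clusters = pd.read_csv("image_clusters.csv")
train_ids, val_ids = train_test_split(
    clusters["image_id"],
    test_size=0.1,                 # illustrative 10% validation split
    stratify=clusters["cluster"],  # keep cluster proportions identical
    random_state=42,               # tied to the run's global seed
)
```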
```
.
├── data/ # Placeholder for dataset files
│ ├── train/
│ │ ├── img/
│ │ └── gt.txt
│ └── test/
│ └── img/
├── models/ # Output directory for models, logs, and configs
├── scripts/
│ └── build_clusters.py # Standalone script for image clustering
├── submissions/ # Output directory for Kaggle submission files
├── visualizations/ # Output directory for generated visualizations
├── src/ # Source code
│ ├── dataset.py
│ ├── engine.py
│ ├── model.py
│ ├── predict.py
│ ├── train.py
│ └── ...
├── main.py # << Main entry point for the project
├── visualize.py # << Standalone script for visualizing results
├── requirements.txt # Project dependencies
└── README.md
```
```bash
cd <directory-name>
```
It is highly recommended to use a virtual environment (e.g., venv or conda) to manage dependencies. This project requires Python >= 3.10.
```bash
# Using conda
conda create -n cvpdl_hw1 python=3.10
conda activate cvpdl_hw1
```
PyTorch installation depends on your system's CUDA version. It is intentionally excluded from `requirements.txt` to ensure a correct installation. Please visit the official PyTorch website to find the appropriate command for your setup.
Example for CUDA 12.4:
```bash
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.4 -c pytorch -c nvidia
```
Once PyTorch is installed, install the remaining packages using the provided `requirements.txt` file.
```bash
pip install -r requirements.txt
```
Download the competition dataset from Kaggle and place it in the `data/` directory, following the structure outlined in the "Project Structure" section above.
This project is designed around the main.py script, which automates the entire process.
The simplest way to get started is to run main.py with its default configuration. This will:
- Check whether cluster-aware training is enabled (default: `True`).
- If so, automatically run `scripts/build_clusters.py` to generate `image_clusters.csv` if it doesn't exist.
- Train the model using the parameters defined in the `CONFIG` dictionary.
- Save the best-performing model checkpoint, training logs, and configuration to the `models/` directory.
- Automatically use the best model to run inference on the test set.
- Generate a final `submission.csv` file in the `submissions/` directory.
```bash
python main.py
```
After the script finishes, you will find:
- Your best model at `models/best_model_seed_<seed>.pth`, alongside any other saved checkpoints.
- A detailed training log at `models/training_log_seed_<seed>.csv`.
- The final submission file at `submissions/submission_seed_<seed>_pp_<method>.csv`.
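The `<seed>` in these filenames is the run's global seed. The strict seeding listed under Features typically looks like the sketch below (helper names are illustrative, not the project's exact API):

```python
import random

import numpy as np
import torch

def seed_everything(seed: int) -> None:
    # Seed every source of randomness the pipeline touches.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade some cuDNN speed for deterministic kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

def seed_worker(worker_id: int) -> None:
    # Pass as DataLoader(worker_init_fn=seed_worker) so each worker
    # derives a reproducible seed of its own.
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)
```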
You can easily override any of the default settings from main.py by providing command-line arguments.
Example: Train a ConvNeXt-Small model with a different batch size and NMS post-processing:
```bash
python main.py \
--backbone convnext_small \
--batch_size 2 \
--accumulation_steps 8 \
--lr 0.00005 \
--post_processing nms \
--nms_iou_threshold 0.5
```
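Besides hard NMS, the same `--post_processing` flag selects Soft-NMS (see `--help` for the exact value), which decays the scores of overlapping boxes instead of discarding them. A minimal Gaussian variant for reference, not necessarily the project's exact implementation:

```python
import torch
from torchvision.ops import box_iou

def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    # Gaussian Soft-NMS (Bodla et al., 2017): repeatedly keep the
    # highest-scoring box and decay the scores of boxes overlapping it.
    kept_boxes, kept_scores = [], []
    while scores.numel() > 0:
        top = int(torch.argmax(scores))
        kept_boxes.append(boxes[top])
        kept_scores.append(scores[top])
        keep = torch.arange(scores.numel(), device=scores.device) != top
        boxes, scores = boxes[keep], scores[keep]
        if scores.numel() == 0:
            break
        ious = box_iou(kept_boxes[-1].unsqueeze(0), boxes).squeeze(0)
        scores = scores * torch.exp(-(ious ** 2) / sigma)
        survive = scores > score_threshold
        boxes, scores = boxes[survive], scores[survive]
    if kept_boxes:
        return torch.stack(kept_boxes), torch.stack(kept_scores)
    return boxes, scores
```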
Example: Train without using cluster-aware augmentations:
```bash
python main.py --no-use_cluster_aware
```
To see a full list of configurable parameters, run:
```bash
python main.py --help
```
While main.py is the recommended entry point, individual scripts can also be run for debugging or specific tasks:
- Generate Clusters Only:
```bash
python -m scripts.build_clusters --image_dir ./data/train/img --output_csv image_clusters.csv --visualize
```
- Run Training Only:
```bash
python -m src.train --data_root ./data --output_dir ./models --epochs 160
```
- Run Prediction Only (requires a trained model):
```bash
python -m src.predict --data_root ./data --model_path ./models/best_model_seed_<seed>.pth --output_path ./submissions/my_submission.csv
```
The visualize.py script is a powerful tool for inspecting the model's performance on the test set. It reads a generated submission CSV, draws the predicted bounding boxes on the corresponding test images, and saves the results as PNG files.
This is invaluable for:
- Error Analysis: Why did the model miss detections? Where does it produce false positives?
- Confidence Tuning: Observe the quality of low-confidence vs. high-confidence boxes.
- Report Generation: Create high-quality figures for your final report.
To visualize results, you must first have a submission.csv file generated by main.py or predict.py.
The script requires the path to the submission file as a positional argument. The results will be saved in a new directory under visualizations/.
```bash
# General command structure
python visualize.py <path_to_your_submission.csv> [OPTIONS]
```
The script offers several modes to select which images to visualize. Use only one of these options at a time.
1. Visualize a Default Set: If no specific mode is chosen, the script will visualize a predefined set of interesting images.
```bash
python visualize.py submissions/submission_seed_42_pp_none.csv
```
2. Visualize K Random Images: A great way to get an unbiased overview of the model's typical performance.
```bash
python visualize.py submissions/submission.csv --random_k 10
```
3. Visualize Images with the Most Detections: Useful for inspecting crowded scenes and testing the effectiveness of NMS.
```bash
python visualize.py submissions/submission.csv --top_k_most 5
```
4. Visualize Images with the Fewest Detections (but > 0): Helps identify scenes where the model is hesitant or only finds the most obvious objects.
```bash
python visualize.py submissions/submission.csv --top_k_least 5
```
5. Visualize Images with Zero Detections: Crucial for finding false negatives. Why did the model completely fail on these images?
```bash
python visualize.py submissions/submission.csv --no_detections 5
```
6. Visualize Specific Image IDs: Perfect for a targeted analysis of specific problematic or interesting images you've identified.
```bash
python visualize.py submissions/submission.csv --fixed_ids 25 950 1864
```
After running, check the `visualizations/viz_<submission_file_name>/` directory for the output images.
