This repository contains the code for a spatio-temporal action detection project that implements TubeR, a tubelet-level transformer-based spatio-temporal action detection model.
The project trains and evaluates TubeR models on the AVA 2.1 and JHMDB datasets. For more details of the project, check out the dissertation and the intro video. The project will continue to be updated on GitHub.
| Dataset | Backbone | Pre-train | #view | IoU | mAP |
|---|---|---|---|---|---|
| AVA2.1 | CSN-152 | Kinetics-400+IG65M | 1 view | 0.5 | 0.29 |
| JHMDB-21 | CSN-152 | Kinetics-400+IG65M | 1 view | 0.5 | 0.72 |
The project has been tested and works with:
- Python 3.7.12
- Torch 1.12.1 (initial) and Torch 2.0.0 (updated)
- CUDA 12.1
- timm==0.4.5
- tensorboardX
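A minimal dependency pin matching the versions above (a sketch of a `requirements.txt`; further transitive packages may be needed):

```text
torch==2.0.0       # 1.12.1 also worked initially
timm==0.4.5
tensorboardX
```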
To run the code of this project, you should download the following files from GitHub or the given resources, since Moodle limits uploads to 250 MB. Here is the GitHub link: https://github.com/YihangChen9/Spatio_temporal-action-detection
- Please download `asset.zip` (annotations of the AVA dataset) from the Google Drive, then unzip it into the `datasets/` directory.
- Please download `pre-train.zip` (the trained models) from the Google Drive, then put its contents under the `tuber` directory:
  - `TubeR_CSN152_JHMDB.pth` is the trained model for JHMDB.
  - `TubeR_CSN152_AVA.pth` is the trained model for AVA.
  - `irCSN_152_ft_kinetics_from_ig65m_f126851907.mat` is the pre-trained CSN-152 model used as the backbone.
  - `detr.pth` provides the initial transformer parameters for the AVA TubeR model.
- You can get JHMDB from this link; please download `JHMDB.tar.gz`. `JHMDB-GT.pkl` is the annotation file.
To get the AVA dataset, please run the three bash scripts in the `datasets` directory one by one (set the paths inside each bash file first):
- `download_ava.sh` downloads the original video clips.
- `chunk_video.sh` chunks each clip, keeping the segment from minute 15 to minute 30.
- `extract_frame.sh` extracts frames from the chunked clips.
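As a rough illustration of the minute-15-to-minute-30 cut that `chunk_video.sh` performs, the command it issues per clip can be sketched as below (`chunk_command` is a hypothetical helper; the actual flags in the script may differ):

```python
# Build an ffmpeg command that keeps the minute-15-to-minute-30 segment of a
# clip, mirroring what chunk_video.sh does. Hypothetical helper for
# illustration only; the real script's flags may differ.
def chunk_command(src: str, dst: str, start_min: int = 15, end_min: int = 30) -> str:
    start_s, end_s = start_min * 60, end_min * 60  # convert minutes to seconds
    return f"ffmpeg -ss {start_s} -to {end_s} -i {src} -c copy {dst}"

print(chunk_command("videos/clip.mp4", "chunked/clip.mp4"))
# ffmpeg -ss 900 -to 1800 -i videos/clip.mp4 -c copy chunked/clip.mp4
```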
To evaluate the model, first modify the config file (in `configuration`; one for AVA, another for JHMDB):
- Set `WORLD_SIZE`, `GPU_WORLD_SIZE`, `DIST_URL`, and `WOLRD_URLS` correctly based on your experiment setup.
- Set `LABEL_PATH`, `ANNO_PATH`, and `DATA_PATH` to your local directories accordingly.
- Download the pre-trained model and set `PRETRAINED_PATH` to the model path.
- Make sure `LOAD` and `LOAD_FC` are set to True.
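Assuming the config files are plain YAML with the key names listed above, the evaluation settings might look like the following (the actual nesting in the project's configs may differ, and the values are illustrative placeholders, not the project's defaults):

```yaml
# Hypothetical excerpt of an evaluation config; placeholders only.
WORLD_SIZE: 1                    # number of nodes
GPU_WORLD_SIZE: 1                # GPUs per node
DIST_URL: tcp://localhost:23456
WOLRD_URLS: ['localhost']        # key spelled as in the config files
LABEL_PATH: /data/ava/labels
ANNO_PATH: /data/ava/annotations
DATA_PATH: /data/ava/frames
PRETRAINED_PATH: pre-train/TubeR_CSN152_AVA.pth
LOAD: True
LOAD_FC: True
```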
Then set the path to `tuber` and run:

```shell
# run evaluation
python3 eval_tuber_jhmdb.py
python3 eval_tuber_ava.py
```
To train TubeR from scratch, first modify the config file:
- Set `WORLD_SIZE`, `GPU_WORLD_SIZE`, `DIST_URL`, and `WOLRD_URLS` correctly based on your experiment setup.
- Set `LABEL_PATH`, `ANNO_PATH`, and `DATA_PATH` to your local directories accordingly.
- Download the pre-trained feature backbone and transformer weights, then set `PRETRAIN_BACKBONE_DIR` and `PRETRAIN_TRANSFORMER_DIR` (the latter only for the AVA dataset) accordingly.
- Make sure `LOAD` and `LOAD_FC` are set to False.
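Under the same assumption of flat YAML keys (placeholders, not the project's defaults), the training-specific fields might look like:

```yaml
# Hypothetical excerpt of a training config; placeholders only.
PRETRAIN_BACKBONE_DIR: pre-train/irCSN_152_ft_kinetics_from_ig65m_f126851907.mat
PRETRAIN_TRANSFORMER_DIR: pre-train/detr.pth   # only for AVA
LOAD: False
LOAD_FC: False
```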
Then run:

```shell
# run training from scratch
python3 train_tuber_jhmdb.py
python3 train_tuber_ava.py
```
To test on a video, first modify the config file (in `configuration`; one for AVA, another for JHMDB):
- Set `WORLD_SIZE`, `GPU_WORLD_SIZE`, `DIST_URL`, and `WOLRD_URLS` correctly based on your experiment setup.
- Set `ANNO_PATH`, `DATA_PATH` (video/JHMDB/test_frames), and `TEST_PATH` (the path of the input video; please put only one .avi or .mp4 video there) to your local paths accordingly.
- Download the pre-trained model and set `PRETRAINED_PATH` to the model path.
- Make sure `LOAD` and `LOAD_FC` are set to True.
Then set the path to `tuber` and run:

```shell
# run the detection system
python3 detection_system.py
```