bok-bok/AV2T-SAM

Audio Visual Segmentation Through Text Embeddings


This is the official repository for the paper "Audio Visual Segmentation Through Text Embeddings".

Requirements


Environment

conda create -n avt python=3.9.20
conda activate avt

pip install -r requirements.txt
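
As a quick sanity check after activating the environment, a short script can confirm that the interpreter matches the Python version requested above. This is a minimal sketch and not part of the repository; the helper name `is_expected_python` is hypothetical.

```python
import sys

# Major and minor version taken from `conda create -n avt python=3.9.20`.
EXPECTED = (3, 9)


def is_expected_python(version_info=sys.version_info, expected=EXPECTED):
    """Return True if the interpreter's major.minor version matches `expected`."""
    return tuple(version_info[:2]) == tuple(expected)


if __name__ == "__main__":
    status = "OK" if is_expected_python() else "unexpected version"
    print(f"Python {sys.version.split()[0]}: {status}")
```

Only the major and minor numbers are compared, so a different patch release of Python 3.9 still passes.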

Data preparation

1. Download the datasets

2. Dataset location configuration

  • Modify path root variables in utils/config_m3.py and utils/config_s4.py.
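
Before training, it may help to confirm that the directories you configured actually exist. The sketch below assumes you copy the root paths you set in utils/config_s4.py and utils/config_m3.py into a list by hand; the helper `missing_dirs` and the placeholder paths are hypothetical and do not reflect the variable names used in those files.

```python
from pathlib import Path


def missing_dirs(paths):
    """Return the entries of `paths` that are not existing directories."""
    return [str(p) for p in paths if not Path(p).is_dir()]


if __name__ == "__main__":
    # Hypothetical placeholders: substitute the roots you configured.
    dataset_roots = ["/path/to/s4", "/path/to/m3"]
    for path in missing_dirs(dataset_roots):
        print(f"Missing dataset directory: {path}")
```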

Pre-trained backbones

  • Place all weights from here in the pretrained directory.

Training

S4

python train_avs.py --evf_version evf_sam2 --projector_type mul --use_adapter --dataset s4 --batch_size 8

M3

python train_avs.py --evf_version evf_sam2 --projector_type mul --use_adapter --dataset m3 --batch_size 8

Testing

Replace the values of --name and --weight_path with your run name and the path to your trained weights.

S4

python test_avs.py --dataset s4 --name name --evf_version evf_sam2 --projector_type mul --use_adapter --adapter_type mul --weight_path weight

M3

python test_avs.py --dataset m3 --name name --evf_version evf_sam2 --projector_type mul --use_adapter --adapter_type mul --weight_path weight
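
To avoid editing the two test commands by hand, the substitution of --name and --weight_path can be scripted. This is a minimal sketch that reproduces only the flags shown above; the helper `build_test_command`, the run name, and the checkpoint path are hypothetical placeholders.

```python
import shlex


def build_test_command(dataset, name, weight_path):
    """Assemble the test_avs.py command shown above for one dataset."""
    if dataset not in ("s4", "m3"):
        raise ValueError(f"unknown dataset: {dataset!r}")
    return [
        "python", "test_avs.py",
        "--dataset", dataset,
        "--name", name,
        "--evf_version", "evf_sam2",
        "--projector_type", "mul",
        "--use_adapter",
        "--adapter_type", "mul",
        "--weight_path", weight_path,
    ]


if __name__ == "__main__":
    # Hypothetical run name and checkpoint path; replace with your own.
    for dataset in ("s4", "m3"):
        cmd = build_test_command(dataset, "my_run", "checkpoints/my_run.pth")
        print(shlex.join(cmd))  # dry run: print the command instead of running it
```

The script only prints the commands. Passing the returned list to `subprocess.run` from the repository root would execute them.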
