Audio Visual Segmentation Through Text Embeddings

This is the official repository for the paper "Audio Visual Segmentation Through Text Embeddings".

Requirements

Environment

conda create -n avt python=3.9.20
conda activate avt

pip install -r requirements.txt

Data preparation

1. Download the datasets

Follow the instructions at https://github.com/OpenNLPLab/AVSBench to download the AVSBench dataset.

2. Dataset location configuration

Set the dataset root path variables in utils/config_m3.py and utils/config_s4.py to point to where you stored the downloaded data.
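Before training, it can help to confirm that the root paths you entered actually exist. The following is a minimal sketch, not part of this repository: the helper name `missing_entries`, the example path, and the idea of listing required sub-entries are all assumptions for illustration. The actual variable names in the config files are not shown here, so pass in whatever values you set.

```python
from pathlib import Path


def missing_entries(root, required=()):
    """Return the paths under `root` that do not exist.

    If `root` itself is not a directory, it is reported as the only
    missing entry. `required` is an optional list of file or folder
    names expected directly under `root` (hypothetical; adapt it to
    the layout of your download).
    """
    root = Path(root)
    if not root.is_dir():
        return [str(root)]
    return [str(root / name) for name in required if not (root / name).exists()]


if __name__ == "__main__":
    # Hypothetical location; use the value you wrote into utils/config_s4.py.
    s4_root = "/path/to/AVSBench/s4"
    for path in missing_entries(s4_root):
        print(f"missing: {path}")
```

An empty result means every checked path was found. Run the same check with the root you configured in utils/config_m3.py.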

Pre-trained backbones

Place all weights from here in the pretrained directory.

Training

S4

python train_avs.py --evf_version evf_sam2 --projector_type mul --use_adapter --dataset s4 --batch_size 8

M3

python train_avs.py --evf_version evf_sam2 --projector_type mul --use_adapter --dataset m3 --batch_size 8
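To train on both subsets one after the other, the two commands above can be wrapped in a small launcher. This is only a sketch under assumptions: the helper names `build_train_command` and `train_all` are hypothetical and do not exist in this repository, and the script is assumed to run from the repository root. The flags are copied verbatim from the commands above.

```python
import subprocess
import sys

# Flags shared by the S4 and M3 training commands shown above.
COMMON_FLAGS = [
    "--evf_version", "evf_sam2",
    "--projector_type", "mul",
    "--use_adapter",
]


def build_train_command(dataset, batch_size=8):
    """Build the argument list for one training run (hypothetical helper)."""
    if dataset not in ("s4", "m3"):
        raise ValueError(f"unknown dataset: {dataset!r}")
    return [
        sys.executable, "train_avs.py",
        *COMMON_FLAGS,
        "--dataset", dataset,
        "--batch_size", str(batch_size),
    ]


def train_all(datasets=("s4", "m3"), dry_run=True):
    """Print each command, and run it unless dry_run is set."""
    for dataset in datasets:
        cmd = build_train_command(dataset)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop if a run exits with an error.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    train_all(dry_run=True)
```

With `dry_run=True` the launcher only prints the commands, which makes it easy to compare them against the ones listed above before starting a long run.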

Testing

Replace the values of --name and --weight_path with the appropriate run name and weight path.

S4

python test_avs.py --dataset s4 --name name --evf_version evf_sam2 --projector_type mul --use_adapter --adapter_type mul --weight_path weight

M3

python test_avs.py --dataset m3 --name name --evf_version evf_sam2 --projector_type mul --use_adapter --adapter_type mul --weight_path weight
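Because the placeholder values `name` and `weight` must be replaced by hand, a small wrapper can catch a wrong weight path before the evaluation starts. This is a sketch under assumptions, not the repository's tooling: `build_test_command` is a hypothetical helper, and whether `--weight_path` expects a file or a directory is not stated here, so the check only tests that the path exists.

```python
import sys
from pathlib import Path


def build_test_command(dataset, name, weight_path):
    """Build the test_avs.py argument list (hypothetical helper).

    Raises FileNotFoundError if `weight_path` does not exist, so a typo
    is caught before the model is loaded.
    """
    if dataset not in ("s4", "m3"):
        raise ValueError(f"unknown dataset: {dataset!r}")
    if not Path(weight_path).exists():
        raise FileNotFoundError(f"weight path not found: {weight_path}")
    # Flags copied from the test commands above.
    return [
        sys.executable, "test_avs.py",
        "--dataset", dataset,
        "--name", name,
        "--evf_version", "evf_sam2",
        "--projector_type", "mul",
        "--use_adapter",
        "--adapter_type", "mul",
        "--weight_path", str(weight_path),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.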
