AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model

The official implementation of our ICCV 2025 paper.

1. Environment Settings

1.1 Create Environment

We follow the environment settings of PTQ4SAM; please refer to environment.sh in the root directory.

  1. Install PyTorch
conda create -n ahcptq python=3.7 -y
pip install torch torchvision
  2. Install MMCV
pip install -U openmim
mim install "mmcv-full<2.0.0"
  3. Install other requirements
pip install -r requirements.txt
  4. Compile CUDA operators
cd projects/instance_segment_anything/ops
python setup.py build install
cd ../../..
  5. Install mmdet
cd mmdetection/
python3 setup.py build develop
cd ..
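
Once the steps above complete, a quick sanity check along the following lines (a minimal sketch, assuming the installations succeeded) can confirm that PyTorch, MMCV, and mmdet are importable and that CUDA is visible:

import torch
import mmcv
import mmdet

# Confirm the core packages import and report their versions and CUDA visibility.
print('torch:', torch.__version__, '| CUDA available:', torch.cuda.is_available())
print('mmcv:', mmcv.__version__)
print('mmdet:', mmdet.__version__)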

1.2 Prepare Dataset

Download the COCO dataset, organize it into the following structure, and update the corresponding data root directory in the code (a config sketch follows the tree):

├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
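
As a rough sketch of what this typically looks like in an mmdetection-style config (the exact file and keys depend on the project configs you use, so treat the names below as assumptions):

# Illustrative dataset paths in an mmdetection-style config file;
# adjust to the actual config files shipped in this repository.
data_root = 'data/coco/'
data = dict(
    train=dict(
        ann_file=data_root + 'annotations/instances_train2017.json',
        img_prefix=data_root + 'train2017/'),
    val=dict(
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/'),
    test=dict(
        ann_file=data_root + 'annotations/image_info_test-dev2017.json',
        img_prefix=data_root + 'test2017/'))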

1.3 Download Model Weights

Download the model weights of SAM and the detectors, and save them in the ckpt/ folder:

Model Download
SAM-B Link
SAM-L Link
SAM-H Link
Faster-RCNN Link
YOLOX Link
HDETR Link
DINO Link

2. Run Experiments

Please use the following command to perform AHCPTQ quantization:

python ahcptq/solver/test_quant.py \
--config ./projects/configs/<DETECTOR>/<MODEL.py> \
--q_config ./exp/<QCONFIG>.yaml \
--quant-encoder

Here, <DETECTOR> is the folder name of the prompt detector, <MODEL.py> is the configuration file of the corresponding SAM model, and <QCONFIG>.yaml is the quantization configuration file.

For example, to perform W4A4 quantization on SAM-B with the YOLOX detector, use the following command:

python ahcptq/solver/test_quant.py \
--config ./projects/configs/yolox/yolo_l-sam-vit-b.py \
--q_config ./exp/config44.yaml \
--quant-encoder

We use an A6000 GPU with 48 GB of memory to run these experiments. However, we find that this is not sufficient to complete the experiments on HDETR and DINO, since the number of prompt boxes is large. Therefore, we offload data from GPU memory to CPU DRAM and process the quantization sequentially.

If you run into the same problem, set keep_gpu: False in the <QCONFIG>.yaml file, comment out lines 218 to 239 in ./ahcptq/solver/recon.py, and unindent lines 240 to 279. We hope this helps address the issue.

3. Abstract

Paper Link

The Segment Anything Model (SAM) has demonstrated strong versatility across various visual tasks. However, its large storage requirements and high computational cost pose challenges for practical deployment. Post-training quantization (PTQ) has emerged as an effective strategy for efficient deployment, but we identify two key challenges in SAM that hinder the effectiveness of existing PTQ methods: the heavy-tailed and skewed distribution of post-GELU activations, and significant inter-channel variation in linear projection activations. To address these challenges, we propose AHCPTQ, an accurate and hardware-efficient PTQ method for SAM. AHCPTQ introduces hardware-compatible Hybrid Log-Uniform Quantization (HLUQ) to manage post-GELU activations, employing log2 quantization for dense small values and uniform quantization for sparse large values to enhance quantization resolution. Additionally, AHCPTQ incorporates Channel-Aware Grouping (CAG) to mitigate inter-channel variation by progressively clustering activation channels with similar distributions, enabling them to share quantization parameters and improving hardware efficiency. The combination of HLUQ and CAG not only enhances quantization effectiveness but also ensures compatibility with efficient hardware execution. For instance, under the W4A4 configuration on the SAM-L model, AHCPTQ achieves 36.6% mAP on instance segmentation with the DINO detector, while achieving a $7.89\times$ speedup and $8.64\times$ energy efficiency over its floating-point counterpart in FPGA implementation.
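
To make the HLUQ idea concrete, below is a minimal conceptual sketch in PyTorch (not the paper's exact formulation: the breakpoint, code allocation, and handling of the negative GELU tail are simplified assumptions). Dense small values are rounded on a log2 grid, while sparse large values above the breakpoint are rounded on a uniform grid:

import torch

def hluq_fake_quant(x, pivot, n_bits=4):
    # Conceptual hybrid log-uniform fake quantization for post-GELU-like
    # activations (negative tail clamped for brevity). Half of the codes go
    # to a log2 region below `pivot`, the rest to a uniform region above it.
    n_levels = 2 ** n_bits
    log_levels = n_levels // 2
    uni_levels = n_levels - log_levels
    eps = 1e-8

    x = x.clamp(min=0)
    small = x.clamp(max=pivot)
    large = (x - pivot).clamp(min=0)

    # Log2 region: round the exponent of x / pivot to the nearest integer.
    exp = torch.log2(small / pivot + eps).clamp(min=-float(log_levels)).round()
    small_q = pivot * torch.pow(2.0, exp)
    small_q = torch.where(small > eps, small_q, torch.zeros_like(small_q))

    # Uniform region: round (x - pivot) to a uniform step above the pivot.
    step = ((x.max().clamp(min=pivot) - pivot) / uni_levels).clamp(min=eps)
    large_q = (large / step).round().clamp(max=uni_levels) * step

    return torch.where(x <= pivot, small_q, pivot + large_q)

# Example: fake-quantize post-GELU activations with a hand-picked breakpoint.
acts = torch.nn.functional.gelu(torch.randn(4, 256))
acts_q = hluq_fake_quant(acts, pivot=0.3)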

4. Bug in Original PTQ4SAM Framework

We observe that in PTQ4SAM, the drop probability (drop_prob) is not reverted to 1.0 during the evaluation phase. As a result, half of the activation values remain in floating point (not quantized), leading to the significantly overestimated mAP reported for PTQ4SAM and QDrop in their paper. To address this issue, we add the following code at line 360 of ./ahcptq/solver/test_quant.py to ensure the drop mechanism is properly disabled during evaluation.

# Reset drop_prob to 1.0 for every quantizer so evaluation runs fully quantized.
for n, m in model.named_modules():
    if hasattr(m, 'drop_prob'):
        m.drop_prob = 1.0

Update on July 9th

We believe the last statement in recon.py (shown below) only resets the drop probability of the 'post_act_fake_quantize' quantizers, which causes this issue. We encourage follow-up research to fix this error in their experiments as well.

if isinstance(layer, QuantizeBase) and 'post_act_fake_quantize' in name:
    layer.drop_prob = 1.0

Citation

If you find this repo useful, please cite our paper. Thanks.

@inproceedings{zhang2025ahcptq,
  title={AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model},
  author={Zhang, Wenlun and Zhong, Yunshan and Ando, Shimpei and Yoshioka, Kentaro},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={22383--22392},
  year={2025}
}

Acknowledgments

Our work is built upon PTQ4SAM. We thank the authors for their pioneering work and for creating a strong baseline for SAM quantization.
