The pruning and quantization script is provided as `src/tracking_quantizer_and_pruner.py`.
Two PyTorch model files are already present under `data/tracking/logs/` (note: these are old results):
- Quantized model: quantized using a fixed-point representation (1 sign bit, 7 integer bits, 8 fractional bits); accuracy on tracking-600 is ~87%.
- Pre-trained model: non-quantized version with ~89% accuracy on tracking-600, `regions = 1`.
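For illustration, the fixed-point format above (1 sign bit, 7 integer bits, 8 fractional bits, i.e. a 16-bit Q7.8 value) behaves roughly as sketched below. This is a generic fixed-point round-trip, not the script's actual quantizer:

```python
# Generic sketch of 16-bit Q7.8 fixed point (1 sign + 7 integer + 8 fractional
# bits): representable values lie in [-128, 128) with a step of 1/256.
FRAC_BITS = 8
SCALE = 1 << FRAC_BITS                    # 256 steps per unit
QMIN, QMAX = -(1 << 15), (1 << 15) - 1    # signed 16-bit integer range

def to_fixed(x: float) -> int:
    """Round to the nearest representable value and saturate on overflow."""
    return max(QMIN, min(QMAX, round(x * SCALE)))

def from_fixed(q: int) -> float:
    """Recover the float value encoded by a fixed-point integer."""
    return q / SCALE

print(from_fixed(to_fixed(3.14159)))  # 3.140625, the nearest multiple of 1/256
print(from_fixed(to_fixed(200.0)))    # saturates at 127.99609375
```

Quantizing weights to this format is what trades the ~2% accuracy between the two checkpoints above for a much cheaper integer representation.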
The quantization script has only been tested with the tracking-600 dataset. Copy the tracking-600 dataset to `data/tracking/processed/`.
We use torch 2.3.1 and PyG 2.5.3 with Python 3.10.14 and CUDA 12.1. Install the required packages with:

```shell
conda env create -f pquant_hept_env.yaml
pip install torch_geometric==2.5.3
pip install torch_scatter==2.1.2 torch_cluster==1.6.3 -f https://data.pyg.org/whl/torch-2.3.0+cu121.html
pip install --no-deps git+https://github.com/calad0i/HGQ2.git
```

Then download PQuant manually:

```shell
git clone https://github.com/cern-nextgen/PQuantML
```

In `src/pquant/core/torch/layers.py`, comment out the line `model(torch.rand(input_shape).to("cuda"))`. Then install PQuant with `pip install <path to PQuant>`.
Before running the code, set the `PATH` variable in `tracking_quantizer_and_pruner.py` to the folder where you want the logs to go. Then run the pruning/quantization script:

```shell
python tracking_quantizer_and_pruner.py
```

Configurations are loaded from the `./configs/` directory.
- Put more details in the README.
- Add support for FlashAttn.
- Add support for efficient processing of batched input.
- Add an example of HEPT with minimal code.
- 2024.06: HEPT has been accepted to ICML 2024 and is selected as an oral presentation (144/9473, 1.5%)!
- 2024.04: HEPT now supports efficient processing of batched input by this commit. This is implemented by integrating batch indices into the computation of AND hash codes, which is more efficient than naive padding, especially for batches with imbalanced point cloud sizes. Note:
  - Only the code in `./example` is updated to support batched input; the original implementation in `./src` is not updated.
  - The current implementation for batched input is not yet fully tested. Please feel free to open an issue if you encounter any problems.
- 2024.04: An example of HEPT with minimal code is added in `./example` by this commit. It is a good starting point for users who want to use HEPT in their own projects. There are minor differences between the example and the original implementation in `./src/models/attention/hept.py`, but they should not affect the performance of the model.
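The batching trick from the 2024.04 note above (folding batch indices into the AND hash codes) can be sketched as follows; this is a simplified illustration, not the repo's code:

```python
# Simplified illustration of batched hashing: offsetting each point's bucket
# id by its batch index gives every sample in the batch a disjoint bucket
# range, so points from different point clouds can never be grouped together
# and no padding is needed, even when cloud sizes are very imbalanced.
def combined_bucket(hash_code: int, batch_idx: int, n_buckets: int) -> int:
    """Shift a sample's raw bucket id into that sample's own id range."""
    return batch_idx * n_buckets + hash_code

# Two points with identical raw hash codes but from different samples:
b0 = combined_bucket(hash_code=5, batch_idx=0, n_buckets=16)
b1 = combined_bucket(hash_code=5, batch_idx=1, n_buckets=16)
print(b0, b1)  # 5 21: same raw code, disjoint buckets
```

Compared to padding every cloud to the largest size in the batch, this only adds one integer offset per point.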
This study introduces a novel transformer model optimized for large-scale point cloud processing in scientific domains such as high-energy physics (HEP) and astrophysics. Addressing the limitations of graph neural networks and standard transformers, our model integrates local inductive bias and achieves near-linear complexity with hardware-friendly regular operations. One contribution of this work is the quantitative analysis of the error-complexity tradeoff of various sparsification techniques for building efficient transformers. Our findings highlight the superiority of using locality-sensitive hashing (LSH), especially OR & AND-construction LSH, in kernel approximation for large-scale point cloud data with local inductive bias. Based on this finding, we propose LSH-based Efficient Point Transformer (HEPT), which combines E2LSH with OR & AND constructions and is built upon regular computations. HEPT demonstrates remarkable performance in two critical yet time-consuming HEP tasks, significantly outperforming existing GNNs and transformers in accuracy and computational speed, marking a significant advancement in geometric deep learning and large-scale scientific data processing.
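As a minimal sketch of the E2LSH family the paper builds on (illustrative only, not HEPT's implementation): each hash is h(x) = floor((a · x + b) / r) with a drawn from a standard Gaussian and b uniform in [0, r), and the AND-construction concatenates k such hashes into one code, so a collision requires all k to match.

```python
# E2LSH sketch: random projection, random shift, then bucketing with width r.
# The AND-construction makes collisions stricter (all k hashes must agree),
# which sharpens locality at the cost of more misses; OR-construction over
# several independent tables then recovers recall.
import math
import random

def e2lsh_hash(x, a, b, r):
    """One E2LSH hash: h(x) = floor((a . x + b) / r)."""
    return math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / r)

def and_code(x, params, r):
    """AND-construction: a tuple of k hashes; match requires all k equal."""
    return tuple(e2lsh_hash(x, a, b, r) for a, b in params)

random.seed(0)
dim, k, r = 3, 4, 1.0
params = [([random.gauss(0.0, 1.0) for _ in range(dim)], random.uniform(0.0, r))
          for _ in range(k)]

p = [0.10, 0.20, 0.30]
q = [0.10, 0.20, 0.31]  # a nearby point: likely (not guaranteed) to collide
print(and_code(p, params, r), and_code(q, params, r))
```

With an OR-construction on top, several independent tables of such AND codes are kept, and two points become attention candidates if their codes match in any table.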
Figure 1. Pipeline of HEPT.
## FAQ

There are three key hyperparameters in HEPT:

- `block_size`: block size for attention computation
- `n_hashes`: the number of hash tables, i.e., OR LSH
- `num_regions`: the number of regions HEPT will randomly divide the input space into (Sec. 4.3 in the paper)
We suggest first determining `block_size` and `n_hashes` according to the computational budget; generally, `n_hashes` should be greater than 1. `num_regions` should be tuned according to the local inductive bias of the dataset.
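To make the role of `n_hashes` concrete, here is a toy sketch of an OR-construction over several hash tables (simplified 1-D integer buckets, not HEPT's actual bucketing):

```python
# Toy OR-construction: two points are candidates for attending to each other
# if they share a bucket in ANY of the n_hashes tables. Each table here is a
# 1-D grid of width bucket_width, shifted by a per-table offset.
import math

def or_lsh_tables(points, n_hashes, bucket_width):
    """Map each (table, bucket) key to the set of point ids it contains."""
    tables = {}
    for t in range(n_hashes):
        offset = t * bucket_width / n_hashes  # shift the grid per table
        for i, x in enumerate(points):
            key = (t, math.floor((x + offset) / bucket_width))
            tables.setdefault(key, set()).add(i)
    return tables

points = [0.1, 0.9, 5.0]
tables = or_lsh_tables(points, n_hashes=2, bucket_width=1.0)
# Points 0 and 1 share a bucket in at least one table; point 2 stays isolated.
print(any(len(ids) > 1 for ids in tables.values()))  # True
```

Raising `n_hashes` adds more tables (more chances for nearby points to meet), while the bucket granularity plays the role that `block_size` and `num_regions` play in HEPT's actual attention grouping.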
```bibtex
@inproceedings{miao2024locality,
  title     = {Locality-Sensitive Hashing-Based Efficient Point Transformer with Applications in High-Energy Physics},
  author    = {Miao, Siqi and Lu, Zhiyuan and Liu, Mia and Duarte, Javier and Li, Pan},
  booktitle = {International Conference on Machine Learning},
  year      = {2024}
}
```