
# Hardware-Aware ViT Accelerator

## Table of Contents

- [2023](#2023)
- [2022](#2022)
- [Surveys](#surveys)
- [Notes](#notes)

## 2023

| Title | Venue | Code | Notes |
| --- | --- | --- | --- |
| HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers | HPCA | - | HeatViT |
| An Integer-Only and Group-Vector Systolic Accelerator for Efficiently Mapping Vision Transformer on Edge | TCAS | - | - |

## 2022

| Title | Venue | Code | Notes |
| --- | --- | --- | --- |
| Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization | FPL | - | Auto-ViT-Acc |
| ViA: A Novel Vision-Transformer Accelerator Based on FPGA | TCAD | - | ViA |


## Surveys


## Notes

### Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization

This paper presents the first FPGA-based ViT acceleration framework. On the software side, it employs mixed-scheme quantization, combining power-of-two (PoT) quantization (mapped mainly onto LUTs) with fixed-point quantization (mapped mainly onto DSPs) to balance FPGA resource utilization. On the hardware side, the accelerator is built with HLS. To handle the layer-wise multi-precision this introduces, the framework aligns the outputs of the different quantization schemes and uses the same fixed-point-to-PoT ratio for every head. Parameter tuning proceeds by first fixing a target frames-per-second (FPS), then determining the precision and scheme combination, and finally adjusting the remaining parameters based on resource-utilization estimates.
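The two quantization schemes behind the LUT/DSP split can be illustrated with a toy NumPy sketch (not the paper's implementation): PoT quantization rounds each weight magnitude to the nearest power of two, so multiplications reduce to bit shifts (cheap in LUTs), while fixed-point quantization uses uniform steps and needs a true multiplier (naturally mapped to DSPs). The function names and bit-width choices below are illustrative assumptions.

```python
import numpy as np

def quantize_fixed_point(w, bits=6):
    """Uniform (fixed-point) quantization: equal step sizes.
    Needs a real multiplier at inference time, so it maps well to DSPs."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    q = np.clip(np.round(w / scale), -(qmax + 1), qmax)
    return q * scale  # dequantized values for comparison

def quantize_pot(w, bits=4):
    """Power-of-two quantization: each magnitude becomes 2**e for an
    integer exponent e, so multiplies become shifts (LUT-friendly)."""
    sign = np.sign(w)
    mag = np.abs(w)
    e_max = np.floor(np.log2(mag.max()))
    e_min = e_max - (2 ** (bits - 1) - 1)   # representable exponent range
    # clamp tiny magnitudes before log2 to avoid -inf
    exp = np.clip(np.round(np.log2(np.maximum(mag, 2.0 ** e_min))), e_min, e_max)
    return sign * 2.0 ** exp
```

A mixed-scheme layer would quantize one fraction of its channels with `quantize_pot` and the rest with `quantize_fixed_point`, with the ratio chosen to balance LUT and DSP consumption.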

### ViA: A Novel Vision-Transformer Accelerator Based on FPGA

This paper introduces a Vision Transformer accelerator designed to address two challenges in ViT and its variants. The first is the path dependency introduced by the shortcut mechanism (the residual add). The solution is a half-layer mapping technique built on two hardware design patterns: the accelerator uses two reusable engines (MSA and MLP) with a streaming pattern inside each engine, and the residual add is relocated from the end of the current engine to the start of the next one. The paper also analyzes data locality along the different input dimensions to facilitate parallel computation. The design was implemented with HLS and evaluated on an FPGA; notably, the data precision used is float16.
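The residual-add relocation can be sketched in a few lines. In the toy model below (an assumption for illustration, not ViA's HLS code), each "engine" is reduced to a single linear layer; the point is only the dataflow: each engine forwards its shortcut and partial result downstream instead of closing the residual itself, so no engine has to wait at its tail for the add.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8
W_attn = rng.standard_normal((D, D)) * 0.1  # stand-in for the MSA engine
W_mlp = rng.standard_normal((D, D)) * 0.1   # stand-in for the MLP engine

def msa_engine(x):
    """MSA engine: computes its result but DEFERS the residual add,
    streaming (shortcut, partial result) to the next engine."""
    return x, x @ W_attn

def mlp_engine(shortcut, partial):
    """MLP engine: the residual add is relocated from the previous
    engine's tail to this engine's head, breaking the path dependency
    at the engine boundary."""
    y = shortcut + partial
    return y, y @ W_mlp

def block(x):
    """One ViT block under the relocated-residual dataflow."""
    shortcut, partial = mlp_engine(*msa_engine(x))
    return shortcut + partial  # final add closes the block
```

The output is numerically identical to the conventional "add at the end of each sublayer" formulation; only where the add happens in the pipeline changes.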

### HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers

Pruned ViT