A flexible and efficient implementation of Flash Attention 2.0 for JAX, supporting multiple backends (GPU/TPU/CPU) and platforms (Triton/Pallas/JAX).
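For orientation, a minimal naive-attention reference in JAX shows what any Flash Attention implementation must reproduce; `reference_attention` is an illustrative name, not this repository's API. Flash Attention computes the same result without ever materializing the full score matrix.

```python
import jax
import jax.numpy as jnp

def reference_attention(q, k, v):
    # Naive attention materializes the full (len_q, len_k) score matrix;
    # FlashAttention produces the same output with a tiled, online softmax
    # so memory stays linear in sequence length.
    scores = jnp.einsum("...qd,...kd->...qk", q, k) * q.shape[-1] ** -0.5
    return jnp.einsum("...qk,...kd->...qd", jax.nn.softmax(scores, axis=-1), v)

kq, kk, kv = jax.random.split(jax.random.PRNGKey(0), 3)
q, k, v = (jax.random.normal(r, (2, 128, 64)) for r in (kq, kk, kv))
out = reference_attention(q, k, v)  # shape (2, 128, 64)
```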
A FlashAttention backward-over-backward pass ⚡🔙🔙
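"Backward-over-backward" means differentiating through attention's gradient, as needed for second-order methods. A small sketch of the computation such a fused kernel implements, using plain `jax.grad` twice (names here are illustrative, not this repo's interface):

```python
import jax
import jax.numpy as jnp

def attn_loss(q, k, v):
    # Scalar loss through plain attention so jax.grad applies directly.
    w = jax.nn.softmax(q @ k.T * q.shape[-1] ** -0.5, axis=-1)
    return jnp.sum((w @ v) ** 2)

kq, kk, kv = jax.random.split(jax.random.PRNGKey(0), 3)
q = jax.random.normal(kq, (16, 8))
k = jax.random.normal(kk, (16, 8))
v = jax.random.normal(kv, (16, 8))

dq = jax.grad(attn_loss)(q, k, v)  # first backward pass
# Backward-over-backward: differentiate a scalar function of the gradient
# itself; this nested differentiation is what a fused double-backward
# attention kernel would compute in one pass.
ddq = jax.grad(lambda q: jnp.sum(jax.grad(attn_loss)(q, k, v) ** 2))(q)
```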
Calculate the hash of any input with ZK-friendly hash functions (MiMC & Poseidon) over a variety of elliptic curves.
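As a sketch of the MiMC family: the MiMC7 permutation over the BN254 scalar field, which keeps the whole hash inside field arithmetic so it is cheap to prove in a circuit. Round constants below are illustrative; real instantiations derive them from a seed hash and fix the round count per field.

```python
# BN254 scalar field modulus, common in ZK systems.
P = 21888242871839275222246405745257275088548364400416034343698204186575808495617

def mimc7(x: int, key: int, rounds: int = 91) -> int:
    # Each round is one field addition and one x^7 power map; the
    # exponent 7 is invertible mod P - 1, making the round a permutation.
    constants = [pow(i, 7, P) for i in range(rounds)]  # illustrative only
    for c in constants:
        x = pow((x + key + c) % P, 7, P)
    return (x + key) % P

digest = mimc7(x=12345, key=0)
```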
Variance-stable routing for 2-bit quantized MoE models. Features dynamic phase correction (Armen Guard), a syntactic stabilization layer, and recursive residual quantization for efficient inference.
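"Armen Guard" and the stabilization layer are specific to this project, but the recursive residual quantization idea can be sketched generically: each stage quantizes the error left by the previous one, so reconstruction error shrinks with every stage. All names below are illustrative (NumPy, non-constant input assumed).

```python
import numpy as np

def two_bit_quantize(x):
    # Uniform 2-bit quantizer: 4 levels spanning the tensor's range.
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 3.0
    codes = np.round((x - lo) / scale).astype(np.uint8)  # codes in {0,1,2,3}
    return codes, lo, scale

def dequantize(codes, lo, scale):
    return codes.astype(np.float32) * scale + lo

def residual_quantize(x, stages=2):
    # Recursive residual quantization: stage i encodes the residual that
    # stages 0..i-1 failed to capture.
    params, residual = [], x.astype(np.float32)
    for _ in range(stages):
        codes, lo, scale = two_bit_quantize(residual)
        params.append((codes, lo, scale))
        residual = residual - dequantize(codes, lo, scale)
    return params

stages = residual_quantize(np.random.randn(1024).astype(np.float32))
```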
Core components for building a Pallas Systems website.
Benchmarking the JAX Pallas implementation of a custom RNN against alternatives
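For readers new to Pallas, a minimal elementwise kernel shows the entry point such benchmarks build on. This uses the standard `jax.experimental.pallas` API, not this repo's RNN kernel; `interpret=True` runs it on any backend.

```python
import jax
import jax.numpy as jnp
from jax.experimental import pallas as pl

def double_kernel(x_ref, o_ref):
    # Refs are mutable views of the kernel's input and output blocks.
    o_ref[...] = x_ref[...] * 2.0

x = jnp.arange(8.0)
y = pl.pallas_call(
    double_kernel,
    out_shape=jax.ShapeDtypeStruct(x.shape, x.dtype),
    interpret=True,  # interpreter mode: works without a GPU/TPU
)(x)
```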
SuperNova (Pasta) proof generator & verifier with CI and frozen fixtures
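SuperNova instantiated over the Pasta cycle uses the Pallas curve y² = x³ + 5 (hence this topic; Vesta is its cycle partner with the two fields swapped). A tiny sanity check of the curve equation, using the published base-field modulus:

```python
# Pallas base field modulus (from the Zcash Pasta curves specification).
P = 0x40000000000000000000000000000000224698FC094CF91B992D30ED00000001

def on_pallas_curve(x: int, y: int) -> bool:
    # Pallas: y^2 = x^3 + 5 over F_P.
    return (y * y - (x * x * x + 5)) % P == 0

# The point (-1, 2) lies on the curve: (-1)^3 + 5 = 4 = 2^2.
assert on_pallas_curve(P - 1, 2)
```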