Code-Studio-AI-Research-Lab/Zero-Emission-Transformer

Towards Zero-Emission AI: Ultra-Lightweight Transformer Architectures using Dynamic Sparsification for Edge Devices

License: MIT | Framework: PyTorch | Domain: Green AI

This repository contains the official implementation and research framework for Ultra-Lightweight Transformer Architectures. The project aims to reduce the carbon footprint, memory usage, and energy consumption of Large Language Models (LLMs) through Dynamic Sparsification, so that they can be deployed on resource-constrained edge devices (e.g., smartphones, IoT nodes) while preserving downstream task performance.


📌 Research Abstract & Core Concept

Modern Transformer-based architectures deliver state-of-the-art results but come with a massive computational and environmental cost. This project introduces a novel neural network pruning and structural optimization framework:

  • Dynamic Sparsification: Scores token and attention-head importance at inference time and skips the matrix computations for tokens and heads judged redundant.
  • Hardware-Aware Optimization: Tailors the sparse computational graph specifically to compile efficiently on mobile CPUs and Edge NPUs (Neural Processing Units).
  • Green AI Evaluation: Measures success not only by Accuracy/F1-Score but also by Energy Consumption (Joules) and Carbon Footprint ($\mathrm{CO_2}$ emissions).
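To make the Dynamic Sparsification idea concrete, here is a minimal sketch for a single attention row. It is written in plain Python so the arithmetic stays visible; it is not the repository's implementation, and a real layer would use batched PyTorch tensor operations. The function names and the `threshold` value are hypothetical. The sketch treats a key's softmax weight as its importance, drops keys below the threshold, renormalizes the rest, and skips the pruned keys when computing the weighted sum.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def prune_attention(scores, threshold=0.05):
    """Zero out attention weights below `threshold`, then renormalize.

    `threshold` is a hypothetical hyperparameter, not a value taken
    from this repository. Returns the pruned weights and the indices
    of the keys that were kept.
    """
    weights = softmax(scores)
    kept = [i for i, w in enumerate(weights) if w >= threshold]
    if not kept:
        # Always keep the strongest key so the row still sums to 1.
        kept = [max(range(len(weights)), key=weights.__getitem__)]
    mass = sum(weights[i] for i in kept)
    pruned = [w / mass if i in kept else 0.0 for i, w in enumerate(weights)]
    return pruned, kept


def attend(scores, values, threshold=0.05):
    """Weighted sum of value vectors that skips pruned keys entirely."""
    pruned, kept = prune_attention(scores, threshold)
    dim = len(values[0])
    out = [0.0] * dim
    for i in kept:  # pruned keys contribute no multiply-adds
        for d in range(dim):
            out[d] += pruned[i] * values[i][d]
    return out, kept


scores = [4.0, 3.5, -2.0, -3.0]
values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [9.0, 9.0]]
output, kept = attend(scores, values)
print(kept)  # [0, 1]: the two low-weight keys are skipped
```

In this toy row, half of the value vectors are never touched. Whether that saving carries over to real hardware depends on how well the resulting sparse pattern maps to the target CPU or NPU, which is the concern of the Hardware-Aware Optimization point above.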

🛠️ Key Features & Methodology

  1. Dynamic Attention Pruning: A custom attention layer that zeroes out low-weight attention scores on the fly during inference.
  2. Weight Quantization: Mixed-precision training followed by conversion of FP16 weights to INT8/INT4, optimized for edge deployment.
  3. Comprehensive Benchmarking: Direct comparison against standard baselines (e.g., BERT-mini, MobileBERT, TinyLlama) on the GLUE and SuperGLUE benchmarks.
  4. Telemetry Tools: Built-in integration with CodeCarbon to track energy consumption and estimated $\mathrm{CO_2}$ emissions during training and inference.
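The energy and carbon metrics tracked by the telemetry are related by simple arithmetic, sketched below in plain Python. This is an illustration only and not the repository's telemetry code: the function names are hypothetical, and the power, latency, and grid carbon intensity figures are placeholders rather than measurements. The only fixed fact used is that 1 kWh equals 3.6 million Joules.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 MJ


def energy_per_sample(avg_power_watts, latency_seconds):
    """Energy in Joules for one inference: power (W) times time (s)."""
    return avg_power_watts * latency_seconds


def joules_to_kwh(energy_joules):
    """Convert Joules to kilowatt-hours."""
    return energy_joules / JOULES_PER_KWH


def emissions_kg(energy_joules, grid_intensity_kg_per_kwh):
    """Estimated kg of CO2-equivalent for a given energy use.

    `grid_intensity_kg_per_kwh` depends on the local electricity mix
    and must be supplied by the caller.
    """
    return joules_to_kwh(energy_joules) * grid_intensity_kg_per_kwh


# Placeholder figures, chosen only to show the unit conversions.
per_sample_j = energy_per_sample(avg_power_watts=2.0, latency_seconds=0.05)
total_j = per_sample_j * 1_000_000          # one million inferences
total_kwh = joules_to_kwh(total_j)
total_co2 = emissions_kg(total_j, grid_intensity_kg_per_kwh=0.4)

print(f"{per_sample_j:.3f} J per sample")
print(f"{total_kwh:.4f} kWh for 1M samples")
print(f"{total_co2:.4f} kg CO2eq for 1M samples")
```

Reporting energy per sample alongside accuracy makes models comparable independently of where they run, while the emissions figure additionally depends on the grid intensity assumed.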

📂 Repository Structure

├── src/
│   ├── models/             # Custom Dynamic Sparse Transformer architectures
│   ├── training/           # Pruning-aware training and fine-tuning pipelines
│   ├── quantization/       # Quantization scripts for edge deployment (ONNX/TFLite)
│   └── utils/              # Telemetry, carbon tracking, and data loaders
├── data/                   # GLUE benchmark processing scripts
├── benchmarks/             # Scripted evaluations for latency, memory, and energy
├── notebooks/              # Exploratory analysis and structural pruning visualization
├── Literature_Review/      # Research matrix and BibTeX files of reference papers
└── README.md
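As a rough illustration of what the scripts under `quantization/` have to do numerically, the sketch below quantizes a list of weights to INT8 and restores them. It assumes a symmetric, per-tensor scheme, which is a choice made here for simplicity; the repository may use per-channel scales, INT4, or a framework quantizer instead. The function names are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: w is approximated by scale * q.

    q is an integer in [-127, 127]; scale maps the largest absolute
    weight onto 127.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map INT8 codes back to approximate floating-point weights."""
    return [scale * v for v in q]


weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(q)  # [50, -127, 0, 100]
print(max_error <= scale / 2 + 1e-12)  # True: error is at most half a step
```

The small weight `0.003` collapses to zero, which shows why quantization interacts with pruning: values already close to zero carry little information after rounding.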

About

Research on green, sustainable AI: Architecting energy-efficient Transformers via real-time pruning and dynamic sparsification for low-power edge computing.
