Complete GPT-2 mixed-precision quantization training implementation #2

Draft
Tanayshri123 with Copilot wants to merge 7 commits into main from copilot/continue-feature-development
Copilot AI commented Dec 25, 2025

Implements an end-to-end training pipeline for GPT-2 with dynamic per-layer quantization and LoRA adapters. This enables parameter-efficient mixed-precision training in which each layer's bit-width (2, 4, 8, or 32 bits) is randomly reassigned on every batch.

Core Implementation

train.py - Training orchestration:

  • Injects SwitchableLinear layers into GPT-2 architecture
  • Applies QuantizationController to randomize layer bit-widths each batch
  • Trains LoRA adapters while keeping base weights frozen
  • Leaves only ~1% of parameters trainable, compared with full fine-tuning
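
The per-batch randomization described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `QuantizationController`: the class name `QuantizationControllerSketch`, the `randomize` method, the `seed` argument, and the `DummyLayer` stand-in are all hypothetical. The only thing taken from the PR is that each layer exposes a `_current_bit_width` attribute and that bit-widths are drawn from 2/4/8/32.

```python
import random


class QuantizationControllerSketch:
    """Hypothetical stand-in for the PR's QuantizationController."""

    def __init__(self, layers, choices=(2, 4, 8, 32), seed=None):
        self.layers = list(layers)      # layers exposing `_current_bit_width`
        self.choices = tuple(choices)   # candidate bit-widths
        self.rng = random.Random(seed)  # private RNG so runs are reproducible

    def randomize(self):
        """Sample a bit-width independently for every layer and store it."""
        assignment = []
        for layer in self.layers:
            bits = self.rng.choice(self.choices)
            layer._current_bit_width = bits
            assignment.append(bits)
        return assignment


class DummyLayer:
    """Placeholder for a SwitchableLinear; only the attribute matters here."""

    def __init__(self):
        self._current_bit_width = 32
```

In a training loop, `controller.randomize()` would be called once per batch, just before the model's forward pass, so that every batch sees a fresh precision configuration.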

models/layers.py - Enhanced SwitchableLinear:

class SwitchableLinear(nn.Module):
    # A tuple default avoids sharing one mutable list across instances.
    def __init__(self, base_linear_layer, supported_bits=(2, 4, 8), lora_rank=4):
        # ...
        self._current_bit_width = 32  # Stored state for seamless forward pass

    def forward(self, x, bit_width=None):
        if bit_width is None:
            bit_width = self._current_bit_width  # Fall back to the stored value
        # Quantize and apply LoRA adapter for current bit-width

The controller sets _current_bit_width on each layer before every forward pass, which removes the need for monkey-patching or signature changes in GPT-2's call stack.
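
To make the stored-state fallback concrete, here is a hedged sketch of how such a layer might look end to end. It is not the code in models/layers.py: the name `SwitchableLinearSketch`, the `fake_quantize` helper, the symmetric per-tensor quantization scheme, the LoRA initialization, and the assumption that the wrapped layer is an `nn.Linear` are all illustrative choices. Only the constructor signature and the `_current_bit_width` fallback follow the PR description.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def fake_quantize(weight: torch.Tensor, bits: int) -> torch.Tensor:
    """Symmetric per-tensor uniform quantize-then-dequantize (assumed scheme)."""
    if bits >= 32:
        return weight  # treat 32 bits as full precision
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for 4-bit, 1 for 2-bit
    scale = weight.abs().max().clamp(min=1e-8) / qmax
    return torch.round(weight / scale).clamp(-qmax - 1, qmax) * scale


class SwitchableLinearSketch(nn.Module):
    """Hypothetical layer: frozen base weights plus one LoRA pair per bit-width."""

    def __init__(self, base: nn.Linear, supported_bits=(2, 4, 8), lora_rank=4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # base weights stay frozen
        # A is small random, B is zero, so each adapter starts as a no-op.
        self.lora_A = nn.ParameterDict({
            str(b): nn.Parameter(torch.randn(lora_rank, base.in_features) * 0.01)
            for b in supported_bits})
        self.lora_B = nn.ParameterDict({
            str(b): nn.Parameter(torch.zeros(base.out_features, lora_rank))
            for b in supported_bits})
        self._current_bit_width = 32  # overwritten externally by the controller

    def forward(self, x, bit_width=None):
        if bit_width is None:
            bit_width = self._current_bit_width  # fall back to stored value
        w = fake_quantize(self.base.weight, bit_width)
        out = F.linear(x, w, self.base.bias)
        key = str(bit_width)
        if key in self.lora_A:  # no adapter is applied at 32 bits
            out = out + F.linear(F.linear(x, self.lora_A[key]), self.lora_B[key])
        return out
```

Because the bit-width lives on the module, callers higher in the stack can keep invoking `layer(x)` with the original single-argument signature, while an explicit `bit_width` argument still overrides the stored value when one is passed.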

Infrastructure

  • requirements.txt - Dependencies with CVE fixes (torch 2.0→2.6, transformers 4.30→4.48)
  • .gitignore - Standard ML project exclusions
  • README.md - Architecture overview, usage, configuration

Security

  • Upgraded torch past the heap buffer overflow affecting PyTorch <2.2.0
  • Upgraded transformers past the deserialization RCE affecting transformers <4.48.0
  • CodeQL: 0 alerts
Original prompt

continue

Created from VS Code via the GitHub Pull Request extension.



Copilot AI and others added 6 commits December 25, 2025 21:31
Co-authored-by: Tanayshri123 <43706966+Tanayshri123@users.noreply.github.com>
…ining loop

Copilot AI changed the title from "[WIP] Continue feature development workflow" to "Complete GPT-2 mixed-precision quantization training implementation" Dec 25, 2025
Copilot AI requested a review from Tanayshri123 December 25, 2025 21:42

2 participants