"Adaptive Hybrid Quantization Framework for deploying 7B+ LLMs on low-VRAM devices (e.g., GTX 1050). Features surgical block alignment and Numba-accelerated inference.
One-click Windows installer for Z-Image Turbo AI image generation. Optimized for low-VRAM GPUs (4GB+). Features Gradio web UI, automatic setup, and GGUF model support.
Lightweight 6GB VRAM Gradio web app with auto-installer for running AuraFlow locally — no cloud, no clutter.
A ComfyUI workflow for low-VRAM users.
Notebooks and workflows configured to run Wan 2.2 Animate inference smoothly with ComfyUI on Kaggle T4 GPUs.
Technical Showcase: 22B True-MoE Engine running on 6GB VRAM (GTX 1060). Demonstrates "Surgical" NF4 quantization, dynamic expert swapping, and the custom "Grace Hopper" pipeline.
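For reference, standard NF4 loading through bitsandbytes looks roughly like the sketch below; the repo's "surgical" per-layer variant and dynamic expert swapping go beyond this minimal example, and the model id is a placeholder:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # NormalFloat4 data type
    bnb_4bit_compute_dtype=torch.float16,  # matmuls run in fp16
    bnb_4bit_use_double_quant=True,        # also quantize the quantization constants
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",           # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",                     # spill layers to CPU when VRAM runs out
)
```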
A privacy-first Generative AI pipeline for prototyping 3D-style game assets on consumer hardware. Optimized for low-VRAM (4GB) GPUs using PyTorch, Diffusers, and Streamlit.
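The usual Diffusers memory levers for a 4GB card look roughly like this sketch; the checkpoint and prompt are placeholders, not this project's actual pipeline:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder checkpoint
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()   # needs accelerate; moves submodules to GPU only when used
pipe.enable_attention_slicing()   # trades speed for a lower peak VRAM footprint

image = pipe("isometric 3D game asset, treasure chest",
             num_inference_steps=20).images[0]
image.save("asset.png")
```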
Audit local LLM function calling and agentic reliability. Visual tool-use benchmarking for quantized models on YOUR hardware.
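A minimal version of such a tool-use check against a local OpenAI-compatible endpoint might look like this; the endpoint URL, model name, tool schema, and pass criterion are illustrative assumptions:

```python
import json
from openai import OpenAI

# Any local server exposing the OpenAI API shape works here (URL is assumed).
client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
resp = client.chat.completions.create(
    model="local",  # placeholder; the server decides which quantized model is loaded
    messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
    tools=tools,
)

# Pass if the model emitted a well-formed call to the expected tool.
calls = resp.choices[0].message.tool_calls
ok = bool(calls) and calls[0].function.name == "get_weather" \
     and "city" in json.loads(calls[0].function.arguments)
print("tool call emitted correctly:", ok)
```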
🚀 Run modern 7B LLMs on legacy 4GB GPUs without crashes, breaking the VRAM barrier for developers facing GPU limitations.
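One common recipe for fitting a 7B model into ~4GB is a 4-bit GGUF file with partial GPU offload via llama-cpp-python; the file name, layer count, and context size below are assumptions, not this project's defaults:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="models/mistral-7b-instruct.Q4_K_M.gguf",  # any 4-bit GGUF file
    n_gpu_layers=20,  # partial offload: remaining layers stay in system RAM
    n_ctx=2048,       # a smaller context window keeps the KV cache small
)
out = llm("Q: What is VRAM?\nA:", max_tokens=64)
print(out["choices"][0]["text"])
```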