jajmangold

Popular repositories

  1. vllm-sm70 Public archive

    vLLM for NVIDIA Volta (sm_70)

    Dockerfile · 8 stars · 3 forks

  2. llama.cpp Public

    Forked from ggml-org/llama.cpp

    LLM inference in C/C++

    C++

  3. TurboQuant-Vulkan Public

    Forked from tsuyu122/TurboQuant-Vulkan

    TurboQuant Vulkan: 3-bit KV cache quantization for llama.cpp using Lloyd-Max Gaussian codebooks. 4.57x compression, Vulkan GPU support (AMD/Intel/NVIDIA). Hobby project.

    C++

  4. test-task-tracker Public

    Task Tracker CLI with Neo4j storage

    Python

  5. vllm-omni Public

    Forked from vllm-project/vllm-omni

    A framework for efficient model inference with omni-modality models

    Python

  6. 1Cat-vLLM Public

    Forked from 1CatAI/1Cat-vLLM

    vLLM fork for Tesla V100 (SM70) with AWQ 4-bit support, CUDA 12.8 build flow, and validated Qwen3.5 27B/35B deployment on multi-GPU V100.

    Python