Henrylau127/llama-cpp-sycl-builder

An automated build pipeline for running quantised LLMs on a home NAS, started after ipex-llm discontinued active updates. It attempts GPU-accelerated inference via SYCL on an Intel Pentium Gold 8505 (48-EU UHD iGPU) as an alternative to Vulkan.
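The SYCL build step can be sketched as below. This is a minimal sketch, not the repo's actual script: it assumes the Intel oneAPI toolkit is available (e.g. via the `intel/oneapi-basekit` base image) and uses the `GGML_SYCL` flag and `icx`/`icpx` compilers from llama.cpp's documented SYCL build instructions.

```shell
# Put oneAPI's icx/icpx compilers and SYCL runtime on PATH
# (path assumes a standard oneAPI install).
source /opt/intel/oneapi/setvars.sh

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Select the SYCL backend and Intel's oneAPI compilers.
cmake -B build \
  -DGGML_SYCL=ON \
  -DCMAKE_C_COMPILER=icx \
  -DCMAKE_CXX_COMPILER=icpx \
  -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j"$(nproc)"
```

Running this inside a container build keeps the oneAPI toolchain out of the NAS host environment.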

After benchmarking, CPU inference via Ollama and the Vulkan path both outperformed the SYCL and IPEX-LLM paths on this hardware: the iGPU's shared memory bandwidth and low EU count couldn't overcome the kernel-dispatch overhead, while llama.cpp's AVX2 CPU path was already well optimised for this workload.
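A comparison like this is typically run with `llama-bench`, the benchmarking tool bundled with llama.cpp. A sketch, where the model file and per-backend build directories are placeholders:

```shell
# Hypothetical quantised model file on the NAS.
MODEL=models/model-q4_k_m.gguf

# SYCL build: offload all layers to the iGPU (-ngl = GPU layers).
./build-sycl/bin/llama-bench -m "$MODEL" -ngl 99

# CPU-only build for comparison: no offloaded layers.
./build-cpu/bin/llama-bench -m "$MODEL" -ngl 0
```

llama-bench reports prompt-processing (pp) and token-generation (tg) throughput in tokens/s, which makes the backend comparison direct.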

Also includes an LLMster Docker image builder: a headless LM Studio server container for serving local inference endpoints on a NAS without a display environment.
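Deploying the headless container might look like the following sketch. The image name, tag, and model mount path are assumptions for illustration; port 1234 is LM Studio's default local-server API port.

```shell
# Run the headless LM Studio server image (name/paths hypothetical).
docker run -d \
  --name llmster \
  -p 1234:1234 \
  -v /volume1/models:/models \
  llmster:latest

# The local server exposes an OpenAI-compatible API, e.g.:
curl http://localhost:1234/v1/models
```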

Key learnings: GPU backend selection for LLM inference (SYCL vs Vulkan vs CPU), llama.cpp inference stack internals, containerised headless ML serving.

About

Docker Image Build Script for SYCL llama.cpp and llmster (Headless LM Studio daemon)
