Use llama.cpp instead of ollama

Ollama are just llama.cpp wrapper which adds more complexity that benefits
1. Ollama doesn't have webui, llama.cpp have (don't need separate things like AnythingLLM)
2. Inference are faster with llama.cpp
3. You don't need to use Modelfile bullshit to load ggufs, just load them

Upd: also you may look at llamafile project, just one binary which runs everywhere