Silo

private on-device ai chat for ios — llama.cpp, metal gpu, gguf models, no cloud

📱 Demo

🚀 Quick Start

git clone https://github.com/stevederico/silo.git
cd silo
cd /tmp && git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp && ./build-xcframework.sh
cp -R build-apple/llama.xcframework <path-to-silo>/
open Silo.xcodeproj

Build and run on an iPhone or simulator running iOS 18.2+.

✨ Features

🔒 Privacy

Zero network requests — inference runs entirely on device
No accounts — no login, no email, no signup
No tracking — no analytics, no telemetry, no data collection
Works offline — airplane mode, subway, off-grid

🧠 Models

Gemma 4 E2B from Google (Q4 default, Q8 optional)
Ministral 3B from Mistral — balanced instruct model
LFM 2.5 from Liquid AI — fast 1.2B instruct model
Bring your own GGUF from any Hugging Face URL

💬 Chat

Streaming markdown with bold, headers, code blocks, nested lists
Conversation history stored locally, never uploaded
Uncensored — the model answers, the app does not filter
System prompt editor for custom behavior

⚡ Performance

Metal GPU acceleration via llama.cpp
BF16 compute on supported hardware
Actor-isolated inference — thread-safe, single model loaded at a time
Download resume and corrupt file detection

🧩 Architecture

Swift actor wraps llama.cpp for thread-safe inference. One model loaded at a time. Backend ref counting prevents double-init. Model-specific chat templates fall back to llama_chat_apply_template() then ChatML. Streaming output passes through SpecialTokenFilter and ThinkTagStripper before rendering in a memoized block-level markdown view.

Silo/
├── Inference/     # LibLlama actor, engine protocol, format detector
├── UI/            # ContentView, LlamaState, ConversationManager
│   └── Components # MessageBubble, MarkdownText, DrawerView, HeaderView
└── SiloApp.swift

🛠️ Tech Stack

Technology	Version	Purpose
Swift / SwiftUI	5.9+	UI and app lifecycle
llama.cpp	latest	GGUF inference engine
Metal	—	GPU acceleration
iOS	18.2+	Minimum deployment target

🧪 Build

xcodebuild -project Silo.xcodeproj -scheme Silo \
  -destination 'platform=iOS Simulator,name=iPhone 17 Pro' build

llama.xcframework is gitignored. Rebuild from source when adding support for new model architectures.

🤝 Contributing

git clone https://github.com/stevederico/silo.git
cd silo
open Silo.xcodeproj

Open an issue before large changes. Keep PRs scoped. Set your own DEVELOPMENT_TEAM and PRODUCT_BUNDLE_IDENTIFIER before building on a device.

🌍 Community

X: @stevederico
Issues: github.com/stevederico/silo/issues

🙏 Acknowledgements

llama.cpp — inference engine
Unsloth — quantized model releases
Hugging Face — model hosting

📄 License

MIT License

Built with Swift, llama.cpp, and Metal.
⭐ Star this repo if Silo keeps your conversations private.

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
Silo.xcodeproj		Silo.xcodeproj
Silo		Silo
llama.xcframework		llama.xcframework
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
demo.png		demo.png
example.gif		example.gif

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Silo

private on-device ai chat for ios — llama.cpp, metal gpu, gguf models, no cloud

📱 Demo

🚀 Quick Start

✨ Features

🔒 Privacy

🧠 Models

💬 Chat

⚡ Performance

🧩 Architecture

🛠️ Tech Stack

🧪 Build

🤝 Contributing

🌍 Community

🙏 Acknowledgements

📄 License

About

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Silo

private on-device ai chat for ios — llama.cpp, metal gpu, gguf models, no cloud

📱 Demo

🚀 Quick Start

✨ Features

🔒 Privacy

🧠 Models

💬 Chat

⚡ Performance

🧩 Architecture

🛠️ Tech Stack

🧪 Build

🤝 Contributing

🌍 Community

🙏 Acknowledgements

📄 License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Contributors

Uh oh!

Languages