Private AI Cloud

Reproducible infrastructure deployment on Ubuntu Server 24.04, starting from a prepared host with SSH access.

Project Goals

Single-command deployment from a configured base host to a fully operational AI hosting platform.

Single-command reproducibility — sudo bash scripts/deploy.sh from base system to running cluster
Deterministic infrastructure — same input, same output; rebuild > mutate
LXD + kubeadm on a single host — VMs as Kubernetes nodes, no external dependencies
Self-hosted AI services — Ollama, Qdrant, and Open WebUI for LAN users
Documentation-first engineering — ADRs, runbooks, and phase documentation

Getting Started

Prepare your host — Ubuntu Server 24.04 with SSH access (see docs/runbooks/prerequisites.md)
Run preflight checks — sudo bash scripts/preflight.sh
Deploy — sudo bash scripts/deploy.sh
Configure DNS — sudo bash scripts/setup-hosts.sh (see docs/runbooks/prerequisites.md)
Validate — bash tests/smoke/run.sh (see tests/smoke/README.md)
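The steps above can be chained so that a failure stops the sequence before later steps run. The following is a minimal sketch, not part of the repository: `run_step`, `bring_up`, and the `DRY_RUN` variable are hypothetical names introduced here, and only the script paths come from the list above. It assumes the commands are run from the repository root.

```shell
# Sketch only: run_step, bring_up and DRY_RUN are hypothetical helpers,
# not files or variables provided by this repository.

# Print a command, then run it unless DRY_RUN=1.
run_step() {
  echo "==> $*"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    return 0          # dry run: show the command, execute nothing
  fi
  "$@"                # propagate the command's exit status
}

# Run the Getting Started steps in order; stop at the first failure.
bring_up() {
  run_step sudo bash scripts/preflight.sh   || return 1
  run_step sudo bash scripts/deploy.sh      || return 1
  run_step sudo bash scripts/setup-hosts.sh || return 1
  run_step bash tests/smoke/run.sh          || return 1
}
```

Setting `DRY_RUN=1` before calling `bring_up` prints the four commands without executing them, which is a cheap way to review the sequence before running it with sudo.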

Architecture Overview

Bare Metal → Ubuntu Server → LXD → VMs → Kubernetes → Platform Services → AI Services

Detailed documentation: docs/architecture/system-overview.md

After Deployment

sudo bash scripts/setup-hosts.sh   # configure *.ai.local DNS on the infra host
bash tests/smoke/run.sh            # validate the deployment

Smoke tests verify the system as a user would experience it. See tests/smoke/README.md for details.
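Before running the smoke tests, it can help to confirm that a service name is actually resolvable. The sketch below assumes that `scripts/setup-hosts.sh` writes entries to a hosts-format file such as `/etc/hosts`; this README does not confirm that mechanism. The function `hosts_has_name` and the hostname in the usage note are hypothetical.

```shell
# Sketch only: hosts_has_name is a hypothetical helper. It assumes a
# hosts-format file ("ADDRESS NAME [NAME...]" per line, "#" starts a comment).

# Return 0 if NAME ($2) is listed as a hostname in FILE ($1), else 1.
hosts_has_name() {
  awk -v name="$2" '
    { sub(/#.*/, "") }                                       # strip comments
    { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }  # skip the address field
    END { exit !found }
  ' "$1"
}
```

For example, `hosts_has_name /etc/hosts grafana.ai.local && echo ok` would print `ok` only if that (assumed) name has an entry.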

Default credentials:

Service	Username	Password
Grafana	admin	admin
Prometheus	(no auth)	—
Open WebUI	(register on first visit)	—

Recovery

If a phase fails or you need to reset:

sudo bash scripts/cleanup.sh all       # full reset
sudo bash scripts/cleanup.sh phase-3   # reset through Phase 3; re-run Phase 3 afterwards

See docs/runbooks/rebuild-from-scratch.md for details.
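Because cleanup is destructive, a small guard around the target argument can catch typos before anything is removed. This is a sketch under stated assumptions: only `all` and `phase-3` appear in this README, so accepting any `phase-<number>` target is a guess about how `scripts/cleanup.sh` behaves, and `valid_cleanup_target` and `reset_command` are hypothetical names.

```shell
# Sketch only: both functions are hypothetical helpers. The phase-<number>
# pattern is an assumption; this README shows only "all" and "phase-3".

# Return 0 for "all" or "phase-<one or two digits>", otherwise 1.
valid_cleanup_target() {
  case "$1" in
    all) return 0 ;;
    phase-[0-9]|phase-[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the cleanup command for a valid target; return 2 and print
# nothing on stdout for an invalid one.
reset_command() {
  if ! valid_cleanup_target "$1"; then
    echo "unknown cleanup target: $1" >&2
    return 2
  fi
  echo "sudo bash scripts/cleanup.sh $1"
}
```

The function prints the command rather than running it, so the reset can be reviewed and then executed deliberately.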

Project Status

Current: Deployment pipeline complete. Run smoke tests after deployment to validate.

Planned:

VM resource right-sizing — assign role-appropriate CPU/memory per node
Self-healing infrastructure — golden image snapshots and automated node recovery
Horizontal node scaling — launch additional worker nodes from golden images

See docs/development-plan.md for implementation details.

Repository Structure

Directory	Purpose
`scripts/`	Deployment scripts (orchestrator, per-phase init, cleanup)
`config/`	Configuration (`defaults.yaml`)
`docs/`	Phase documentation, ADRs, runbooks
`tests/smoke/`	Post-deployment validation tests

Architectural Decisions

Key design decisions are documented in docs/decisions/:

ADR	Topic
0001	Project scope and objectives
0002	Directory storage + local-path-provisioner
0003	Single-node first architecture
0004	Routed/NAT mode for LXD bridge
0005	Deterministic infrastructure deployment
0006	VM-based Kubernetes node compilation
0007	Kubeadm init strategy
0008	Phase 5/6 service infrastructure

Intended Audience

Platform engineers
DevOps / SRE practitioners
Infrastructure architects
Homelab enthusiasts
Researchers exploring self-hosted AI systems

License

MIT License
