Ampere LLM Orchestrator

Overview

The Ampere LLM Orchestrator is a high-performance, high-density platform designed for orchestrating and monitoring multiple Large Language Model (LLM) instances on AmpereOne® M CPUs.

By leveraging the high core count of Ampere processors, the application runs multiple independent llama.cpp servers concurrently, each isolated and pinned to its own 32-core segment for deterministic performance.

Key Features

High-Density Orchestration: Deploy and manage 4 independent llama.cpp nodes simultaneously, each with its own dedicated 32-core resources.
Domain-Specific AI Experts:
- Legal & Compliance: Regulatory and corporate law specialist.
- Cybersecurity: Threat intelligence and network security architect.
- Fintech & Finance: Algorithmic trading and financial regulation expert.
- Supply Chain & Ops: Global logistics and operational resilience specialist.
Modern Startup UI: Sleek dark-mode dashboard with real-time performance visualization.
Real-Time Performance Metrics:
- Cluster Throughput (TPS): Aggregate tokens per second across all 4 active nodes.
- Peak Throughput: Tracking session-high performance levels.
- CPU Load Monitoring: Real-time utilization stats per container, normalized to the assigned 32-core cpuset.
Deterministic Compute: Uses cpuset pinning to ensure no resource contention between AI domains.
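
The real-time metrics listed above could be derived along these lines. This is a minimal sketch, not the repository's implementation: the `NodeSample` type and the function names are hypothetical, and it assumes CPU usage arrives in the Docker convention where 100% means one fully busy core.

```typescript
// Hypothetical shape of one sample per inference node (not from the repository).
interface NodeSample {
  name: string;
  tokensPerSecond: number; // generation rate reported for this node
  cpuPercent: number;      // assumed Docker-style value: 100 = one fully busy core
}

const CORES_PER_NODE = 32; // matches the 32-core cpuset described above

// Cluster throughput: sum of per-node tokens per second.
function clusterTps(samples: NodeSample[]): number {
  return samples.reduce((sum, s) => sum + s.tokensPerSecond, 0);
}

// Peak throughput: running maximum over the session.
function updatePeak(previousPeak: number, currentTps: number): number {
  return Math.max(previousPeak, currentTps);
}

// CPU load normalized to the assigned cpuset, clamped to the range 0-100.
function normalizedCpu(cpuPercent: number, cores: number = CORES_PER_NODE): number {
  return Math.min(100, Math.max(0, cpuPercent / cores));
}
```

Under these assumptions, a container reporting 1600% on a 32-core cpuset would be displayed as 50% load.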

Architecture

The system is a full-stack application with three layers, optimized for high-density inference:

Frontend: React (Vite) + Tailwind CSS + Lucide Icons featuring a glassmorphism "AI Startup" aesthetic.
Backend: Node.js (Express) serving as an intelligent API proxy and performance monitor.
Inference Nodes: 4x llama.cpp containers running in Docker, each serving specialized GGUF models.
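
One way the backend proxy could map a domain expert to its inference node is sketched below. The domain keys, host names, and ports are assumptions for illustration only; the real values are whatever `docker-compose.yml` and `server.ts` define. The `/v1/chat/completions` path is the OpenAI-compatible endpoint exposed by the llama.cpp server.

```typescript
// Hypothetical domain identifiers, one per expert listed under Key Features.
type Domain = "legal" | "cybersecurity" | "fintech" | "supply-chain";

// Hypothetical upstream addresses; actual service names and ports may differ.
const UPSTREAMS: Record<Domain, string> = {
  "legal": "http://llama-1:8080",
  "cybersecurity": "http://llama-2:8080",
  "fintech": "http://llama-3:8080",
  "supply-chain": "http://llama-4:8080",
};

// Type guard: true only for keys present in the upstream table.
function isDomain(value: string): value is Domain {
  return Object.prototype.hasOwnProperty.call(UPSTREAMS, value);
}

// Resolve the chat completions URL for a requested domain,
// rejecting anything that is not a known expert.
function chatCompletionsUrl(domain: string): string {
  if (!isDomain(domain)) {
    throw new Error(`Unknown domain: ${domain}`);
  }
  return `${UPSTREAMS[domain]}/v1/chat/completions`;
}
```

Keeping the table in one place means the proxy never forwards a request to an address derived from user input.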

Getting Started

Prerequisites

Docker and Docker Compose
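Ampere-based cloud instance or server (OCI A1, GCP T2A, etc.) with enough cores for the configured cpusets (4 x 32 = 128 cores by default). AWS M6g/C6g/R6g instances are also Arm64, but they use AWS Graviton rather than Ampere CPUs.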
Node.js 20+

Deployment with Docker

Clone the repository.
Configure your environment: Copy .env.example to .env and adjust the model parameters.
```
cp .env.example .env
```
Launch the Orchestrator:
```
docker-compose up -d
```
Access the Dashboard: Navigate to http://localhost:3000 to start the cluster orchestration.

Configuration

Most settings are controlled through environment variables in `.env`; core pinning is set per service in `docker-compose.yml`:

| Variable | Description |
| --- | --- |
| `HF_REPO_X` | Hugging Face repository for Domain Expert X (1-4). |
| `HF_FILE_X` | Specific GGUF model file for Domain Expert X (1-4). |
| `LLAMA_ARG_THREADS` | Number of CPU threads (default 32) per instance. |
| `cpuset` | Core pinning (e.g., 0-31, 32-63) defined in `docker-compose.yml`. |
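
The `cpuset` ranges follow a simple pattern when nodes are laid out contiguously from core 0, as the examples 0-31 and 32-63 suggest. The helper below is a hypothetical sketch for checking or generating those ranges; `cpusetFor` is not part of the repository.

```typescript
// Returns the cpuset string for the node at zero-based `index`,
// assuming contiguous, non-overlapping ranges starting at core 0.
function cpusetFor(index: number, coresPerNode: number = 32): string {
  if (Math.floor(index) !== index || index < 0) {
    throw new RangeError("index must be a non-negative integer");
  }
  if (Math.floor(coresPerNode) !== coresPerNode || coresPerNode < 1) {
    throw new RangeError("coresPerNode must be a positive integer");
  }
  const first = index * coresPerNode;
  const last = first + coresPerNode - 1;
  return `${first}-${last}`;
}

// Ranges for the four nodes: "0-31", "32-63", "64-95", "96-127"
const cpusets: string[] = [0, 1, 2, 3].map((i) => cpusetFor(i));
```

If `LLAMA_ARG_THREADS` is changed from its default, the cpuset width would normally be changed to match so that each thread has its own core.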

Why Ampere?

AmpereOne® M CPUs provide a deterministic, high-core-count environment well suited to dense LLM serving. Each core runs a single hardware thread, so instances pinned to separate core ranges do not contend for a core's execution resources. This orchestrator demonstrates how a single Ampere server can stand in for expensive GPU-based infrastructure for high-throughput, low-latency inference workloads.

Built for High-Efficiency AI on Ampere Computing.
