
AFMBridge

Apple Foundation Models Bridge: an OpenAI- and Anthropic-compatible REST API for Apple's FoundationModels framework.


Overview

AFMBridge is a standalone Swift/Vapor REST API server that wraps Apple's FoundationModels framework (macOS 26.0+) with industry-standard LLM APIs, enabling seamless integration with existing OpenAI and Anthropic client libraries.

Status: ✅ Phase 5 - Production Ready (Complete)

Features

Phase 1 (Complete)

  • ✅ OpenAI Chat Completions API compatibility (/v1/chat/completions)
  • ✅ Non-streaming responses
  • ✅ System message support
  • ✅ Environment-based configuration
  • ✅ Comprehensive test coverage (49 tests, 100% passing)
  • ✅ Integration tests with Vapor
  • ✅ Health check endpoint

Phase 2 (Complete)

  • ✅ Server-Sent Events (SSE) streaming for real-time responses
  • ✅ True token-by-token streaming via FoundationModels AsyncSequence
  • ✅ OpenAI-compatible streaming format with delta chunks
  • ✅ Lower time-to-first-token for better UX

Phase 3 (Complete)

  • ✅ OpenAI-compatible tool calling with AFM's native Tool protocol
  • ✅ Tool definition schema using JSON Schema
  • ✅ Multi-turn conversation with client-side tool execution
  • ✅ Streaming DTOs for tool calls (automatic fallback to non-streaming)
  • ✅ Complete test coverage (100 tests, 100% passing)

Phase 4 (Complete)

  • ✅ Anthropic Messages API compatibility (/v1/messages)
  • ✅ Non-streaming message responses
  • ✅ Server-Sent Events (SSE) streaming with Anthropic format
  • ✅ System parameter support
  • ✅ Content blocks support
  • ✅ Anthropic-compatible tool calling

Phase 5 (Complete)

  • ✅ API key authentication (Bearer token)
  • ✅ Error middleware with formatted error responses
  • ✅ Request logging and metrics (MetricsMiddleware)
  • ✅ 81.61% code coverage, exceeding the 80% target (239 tests passing)
  • ✅ Production documentation and deployment guide

Infrastructure

  • ✅ Reproducible builds with Nix flakes
  • ✅ Structured logging
  • ✅ Automated CI/CD with GitHub Actions
  • ✅ Code-signed and notarized macOS releases
  • ✅ Automated binary distribution via GitHub Releases

Planned

  • No additional features planned - project is feature-complete!

Installation

Pre-built Binary (Recommended)

Download the latest signed and notarized macOS binary from GitHub Releases:

# Download and extract (replace VERSION with latest from releases page)
VERSION=v1.0.0  # Check releases page for latest version
curl -L -o afmbridge.tar.gz \
  https://github.com/kolohelios/afmbridge/releases/download/${VERSION}/afmbridge-macos-${VERSION}.tar.gz
tar -xzf afmbridge.tar.gz

# Run the server
./AFMBridge serve

The binary is code-signed with a Developer ID Application certificate and notarized by Apple, so macOS will trust it without security warnings.

Build from Source

See Quick Start below for building with Nix.

Deployment

For production deployment, code signing setup, and release process, see DEPLOYMENT.md.

Requirements

  • macOS 26.0+ (for FoundationModels framework)
  • Apple Silicon (M-series chips)
  • Swift 6.0+ (for source builds)
  • Nix with flakes enabled (optional, for reproducible builds)

About Apple FoundationModels

This project wraps Apple's FoundationModels framework, which provides on-device LLM inference for macOS 26.0+ (Tahoe) with Apple Intelligence.

Key capabilities:

  • On-device inference with privacy protection (data never leaves your Mac)
  • Works offline once models are downloaded
  • Native Swift API with async/await support
  • Streaming responses via AsyncSequence
  • Free inference (no API costs)

API documentation: see API.md for the complete endpoint reference.

Quick Start

Using Nix (Recommended)

Nix provides consistent development tool versions while using system Swift for compilation:

# Clone the repository
git clone https://github.com/kolohelios/afmbridge.git
cd afmbridge

# Install Swift development tools via Homebrew
brew install swift-format swiftlint

# Enter Nix development environment
nix develop

# Run the server
just run

# Or run with custom config
HOST=0.0.0.0 PORT=8080 just run

How it works:

  • Nix provides: just, markdownlint, Python SDKs (openai, anthropic)
  • Homebrew provides: swift-format, swiftlint (avoid SDK conflicts)
  • System provides: Swift 6.2.3 compiler (required for FoundationModels)

Without Nix

# Clone the repository
git clone https://github.com/kolohelios/afmbridge.git
cd afmbridge

# Install all development tools via Homebrew
brew install just swift-format swiftlint

# Build and run
just build
just run

Development

Prerequisites

With Nix (Recommended)

  • Nix with flakes enabled (install)
  • Homebrew for Swift tools (swift-format, swiftlint)
  • Swift 6.0+ - Included with Xcode on macOS 26+
  • direnv - Automatic environment loading (optional)

Nix provides consistent versions of just, markdownlint, and Python SDKs. Homebrew provides Swift-specific tools to avoid SDK conflicts.

Without Nix

  • Swift 6.0+ - Included with Xcode on macOS 26+
  • Homebrew for all dev tools: brew install just swift-format swiftlint

Setup

# Allow direnv (if using)
direnv allow

# Or manually enter development shell
nix develop

# Verify setup
just --list

Development Commands

All development tasks are managed through just:

just format          # Auto-format Swift code and markdown docs
just lint            # Run SwiftLint and markdownlint
just test            # Run all tests with coverage
just build           # Build the project
just validate        # Run all quality checks (format + lint + test + build)
just clean           # Clean build artifacts

Code Quality Standards

This project maintains high code quality through:

  • SwiftLint - Swift code linting (120 char line length, max 40 line functions)
  • swift-format - Consistent Swift code formatting (100 char wrapping, 4 space indent)
  • markdownlint - Documentation linting (120 char line length)
  • Test Coverage - Minimum 80% code coverage requirement
  • Conventional Commits - All commits follow conventional commit format (max 70 chars)
  • Atomic Commits - Each commit is self-contained and passes validation

Workflow

  1. Make changes in your working directory
  2. Run just validate to ensure all quality checks pass
  3. Commit with conventional commit message
  4. Push and create pull request

See CONTRIBUTING.md for detailed guidelines and AGENTS.md for AI agent collaboration standards.

Configuration

Configure the server with environment variables:

HOST=0.0.0.0              # Bind address (default: 127.0.0.1)
PORT=8080                 # Port number (default: 8080)
MAX_TOKENS=1024           # Max tokens per request (default: 1024)
LOG_LEVEL=info            # Log level: trace, debug, info, warning, error (default: info)
API_KEY=your-secret-key   # Optional: Enable Bearer token authentication (default: disabled)

Authentication

API key authentication is disabled by default. To enable it, set the API_KEY environment variable:

API_KEY=your-secret-key just run

When enabled, all API requests must include a Bearer token in the Authorization header:

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer your-secret-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'

If authentication fails, the server returns a 401 Unauthorized error with the appropriate error format (OpenAI or Anthropic depending on the endpoint).

API Usage

Health Check

curl http://localhost:8080/health
# Returns: OK

OpenAI Compatible Endpoint

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

With system message:

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ]
  }'

Response format:

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "apple-afm-on-device",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?"
      },
      "finish_reason": "stop"
    }
  ]
}

Streaming Support (Phase 2 - Complete)

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Write a haiku"}],
    "stream": true
  }'

Returns Server-Sent Events with true token-by-token streaming using Apple's native LanguageModelSession.streamResponse() API.
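On the client side, those SSE chunks can be reassembled into the full completion. A minimal sketch over illustrative chunk data (real chunks carry more fields, and the stream arrives incrementally over HTTP):

```python
import json

# Illustrative SSE data lines in the OpenAI streaming format; a real
# stream ends with the literal sentinel "data: [DONE]".
sse_lines = [
    'data: {"choices":[{"delta":{"role":"assistant"},"index":0}]}',
    'data: {"choices":[{"delta":{"content":"Hel"},"index":0}]}',
    'data: {"choices":[{"delta":{"content":"lo!"},"index":0}]}',
    "data: [DONE]",
]

def assemble(lines):
    """Concatenate the content deltas from a sequence of SSE data lines."""
    parts = []
    for line in lines:
        payload = line.removeprefix("data: ")
        if payload == "[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        parts.append(delta.get("content", ""))  # first chunk has only a role
    return "".join(parts)

print(assemble(sse_lines))  # Hello!
```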

Tool Calling Support (Phase 3 - Complete)

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the weather in Boston?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string"}
          },
          "required": ["location"]
        }
      }
    }]
  }'

Returns tool calls with finish_reason: "tool_calls". Client executes tools and submits results in a follow-up request. See API.md for complete tool calling documentation.

Anthropic Compatible Endpoint (Phase 4 - Complete)

Basic message:

curl -X POST http://localhost:8080/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-5-20251101",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

With system parameter:

curl -X POST http://localhost:8080/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-5-20251101",
    "max_tokens": 1024,
    "system": "You are a helpful assistant.",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Response format:

{
  "id": "msg-...",
  "type": "message",
  "role": "assistant",
  "model": "apple-afm-on-device",
  "content": [
    {
      "type": "text",
      "text": "Hello! How can I assist you today?"
    }
  ],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 10,
    "output_tokens": 12
  }
}

Streaming support:

curl -X POST http://localhost:8080/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-5-20251101",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Write a haiku"}],
    "stream": true
  }'

Returns Server-Sent Events with Anthropic's 6-event streaming format:

  1. message_start - Message metadata with input token count
  2. content_block_start - Start of text content block
  3. content_block_delta - Streaming text deltas (multiple events)
  4. content_block_stop - End of content block
  5. message_delta - Final message metadata with stop reason
  6. message_stop - Stream completion

Tool calling support:

curl -X POST http://localhost:8080/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-5-20251101",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "What is the weather in Boston?"}],
    "tools": [{
      "name": "get_weather",
      "description": "Get weather for a location",
      "input_schema": {
        "type": "object",
        "properties": {
          "location": {"type": "string"}
        },
        "required": ["location"]
      }
    }]
  }'

Returns tool_use content blocks with stop_reason: "tool_use". Client executes tools and submits results as tool_result blocks in a follow-up request.

Architecture

Built with:

  • Swift 6.0 - Modern, safe, and fast
  • Vapor 4.x - Web framework for server implementation
  • Nix Flakes - Reproducible development and deployment
  • Just - Command runner for development tasks
  • Jujutsu (jj) - Version control with native PR stacking

See PLAN.md for the complete implementation roadmap.

Project Structure

afmbridge/
├── Sources/
│   ├── App/              # Application entry point
│   ├── Controllers/      # HTTP request handlers
│   ├── DTOs/             # Data Transfer Objects (OpenAI & Anthropic)
│   ├── Services/         # Business logic and LLM integration
│   ├── Models/           # Domain models and errors
│   ├── Middleware/       # Request/response processing
│   └── Configuration/    # Server configuration
├── Tests/                # Unit and integration tests
├── .github/workflows/    # CI/CD pipelines
├── Package.swift         # Swift package manifest
├── flake.nix             # Nix flake for reproducible builds
└── Justfile              # Development task runner

Roadmap

  • Phase 0: Project Foundation (Complete)
    • Nix build system
    • Development tooling (just, SwiftLint, swift-format)
    • CI/CD pipelines
    • Documentation and standards
  • Phase 1: MVP - Non-streaming OpenAI API (Complete)
    • OpenAI DTOs (request/response)
    • FoundationModelService (AFM wrapper)
    • MessageTranslationService (OpenAI to AFM)
    • OpenAIController (HTTP endpoint)
    • ServerConfig (environment variables)
    • Integration tests and documentation
  • Phase 2: Streaming Support (Complete)
    • Server-Sent Events (SSE) implementation
    • Streaming DTOs and chunked responses
    • True token-by-token streaming via AFM AsyncSequence
    • Streaming integration tests
  • Phase 3: Tool Calling Support (Complete)
    • OpenAI-compatible tool calling DTOs
    • Tool definition schema with JSON Schema
    • Multi-turn conversation with tool results
    • Client-side tool execution pattern
    • Comprehensive tool calling tests (100 total tests)
  • Phase 4: Anthropic API Support (Complete)
    • Anthropic Messages API DTOs
    • Non-streaming message support
    • Server-Sent Events streaming with Anthropic format
    • System parameter and content blocks
    • Integration tests for Anthropic API
    • Error middleware with formatted responses
    • Anthropic-compatible tool calling
  • Phase 5: Production Ready (Complete)
    • API key authentication (Bearer token)
    • Error middleware with formatted error responses
    • Request logging and metrics (MetricsMiddleware)
    • 81.61% code coverage, exceeding the 80% target (239 tests passing)
    • Production documentation and deployment guide

See PLAN.md for detailed phase breakdown.

Contributing

Contributions are welcome! Please read CONTRIBUTING.md for guidelines.

For AI agents working on this project, see AGENTS.md for collaboration standards.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Built with Vapor web framework
  • Reproducible builds powered by Nix
  • Task automation with just
  • Version control with jujutsu
