
πŸš€ Mac Mini + Windows PC LLM Server Setup

Turn your Mac Mini into a lean coding machine powered by a Windows PC LLM server

Perfect setup for developers who want AI assistance without slowing down their Mac.


πŸ“‹ Overview

This repository contains everything you need to set up a dual-machine AI coding assistant:

  • Mac Mini (8GB): Fast coding interface, stays responsive
  • Windows PC (32GB): Heavy LLM processing, runs large models

    Mac Mini (Client)              Windows PC (Server)
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”       β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  πŸ’» VS Code/Cursor   β”‚ ◄────►│  🧠 LM Studio       β”‚
β”‚  ⌨️  Terminal        β”‚  LAN  β”‚  πŸ€– 32B+ Models     β”‚
β”‚  πŸ“ Code Helper      β”‚       β”‚  ⚑ 32GB RAM        β”‚
β”‚  Fast & Responsive   β”‚       β”‚  Heavy Processing   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜       β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Benefits:

  • βœ… Mac Mini never slows down (no models loaded locally)
  • βœ… Run huge models (32B, 70B) on the Windows PC
  • βœ… Better AI quality (larger models)
  • βœ… All 8GB of Mac RAM free for coding
  • βœ… Simple scripts for easy use

⚑ Quick Start

1. Windows PC Setup (5 minutes)

On your Windows PC, follow: windows-server/WINDOWS_SERVER_SETUP.md

TL;DR:

# Download LM Studio from https://lmstudio.ai
# Open LM Studio β†’ Local Server β†’ Start Server
# Note your IP address: ipconfig
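
To sanity-check before leaving the Windows PC, the server should already answer locally (LM Studio's default port is 1234; Ollama's is 11434):

curl http://localhost:1234/v1/models    # should return a JSON list of loaded models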

2. Mac Mini Setup (2 minutes)

# Clone this repo (or download scripts)
git clone https://github.com/YOUR_USERNAME/mac-mini-llm-setup.git
cd mac-mini-llm-setup

# Make scripts executable
chmod +x scripts/*.sh

# Copy scripts to home directory
cp scripts/*.sh ~/
cp configs/ollama_gpu_config.example ~/.ollama_gpu_config

# Connect to Windows server
~/connect-to-llm-server.sh
# Enter Windows PC IP when prompted
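
For reference, here is a sketch of what the generated ~/.llm_server_config might contain. The exact variable names depend on the script; these are illustrative only:

# ~/.llm_server_config (illustrative example, not the script's exact output)
export LLM_SERVER_IP="192.168.1.50"                    # your Windows PC's LAN IP
export LLM_SERVER_URL="http://$LLM_SERVER_IP:1234/v1"  # LM Studio's OpenAI-compatible endpoint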

3. Start Coding!

# Interactive chat
~/chat-with-server.sh

# Code assistance
~/code-helper.sh

# VS Code integration
# Install "Continue" extension
# Configure with docs/SETUP_GUIDE.md

πŸ“ Repository Structure

mac-mini-llm-setup/
β”œβ”€β”€ README.md                       # This file
β”œβ”€β”€ docs/                           # Documentation
β”‚   β”œβ”€β”€ SETUP_GUIDE.md              # Detailed setup instructions
β”‚   β”œβ”€β”€ LLM_SERVER_ARCHITECTURE.md  # Architecture overview
β”‚   β”œβ”€β”€ GPU_OPTIMIZATION_GUIDE.md   # Local GPU optimization (optional)
β”‚   β”œβ”€β”€ GPU_TEST_RESULTS.md         # Performance benchmarks
β”‚   └── FINAL_SUMMARY.md            # Complete summary
β”œβ”€β”€ scripts/                        # Mac Mini client scripts
β”‚   β”œβ”€β”€ connect-to-llm-server.sh    # Connect to Windows server
β”‚   β”œβ”€β”€ chat-with-server.sh         # Interactive AI chat
β”‚   β”œβ”€β”€ code-helper.sh              # Code assistance tool
β”‚   β”œβ”€β”€ optimize_gpu.sh             # Local GPU optimization
β”‚   └── gpu_monitor.sh              # GPU monitoring tool
β”œβ”€β”€ windows-server/                 # Windows PC server setup
β”‚   └── WINDOWS_SERVER_SETUP.md     # Complete Windows setup guide
└── configs/                        # Configuration examples
    └── ollama_gpu_config.example   # GPU config template

🎯 Use Cases

1. Code Assistance While Coding

# In terminal:
~/code-helper.sh

# Choose:
# 1) Explain code
# 2) Review code
# 3) Generate code
# 4) Fix bugs
# 5) Optimize code
# 6) Generate commit message
# 7) Custom question
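
Mode 6 (commit messages), for instance, amounts to sending your staged diff to the server. A hand-rolled equivalent using the OpenAI-compatible API (assumes jq is installed and a model named qwen2.5-coder:32b is loaded; the script itself may work differently):

git diff --staged \
  | jq -Rs '{model: "qwen2.5-coder:32b", messages: [{role: "user", content: ("Write a concise git commit message for this diff:\n" + .)}]}' \
  | curl -s http://YOUR_WINDOWS_IP:1234/v1/chat/completions -H "Content-Type: application/json" -d @- \
  | jq -r '.choices[0].message.content'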

2. Interactive AI Chat

~/chat-with-server.sh

# Chat with an AI backed by a 32B model on the Windows PC
# Ask questions, debug issues, brainstorm ideas
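
The chat script talks to the server's OpenAI-compatible endpoint, which you can also query directly (model name assumed; adjust to whatever is loaded):

curl http://YOUR_WINDOWS_IP:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen2.5-coder:32b", "messages": [{"role": "user", "content": "Explain Bash parameter expansion in two sentences."}]}'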

3. VS Code Integration

Install the "Continue" extension and configure it with:

{
  "models": [{
    "title": "Windows Server - Qwen 32B",
    "provider": "openai",
    "model": "qwen2.5-coder:32b",
    "apiBase": "http://YOUR_WINDOWS_IP:1234/v1",
    "apiKey": "sk-dummy"
  }]
}

Then use Cmd+L in VS Code for instant AI assistance!
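
Continue typically reads this from ~/.continue/config.json; assuming the code CLI is installed, the quickest way to paste in the block above is:

code ~/.continue/config.json    # or: open -e ~/.continue/config.json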


πŸ”§ Features

Mac Mini Scripts:

  • βœ… connect-to-llm-server.sh - Auto-configure connection to Windows PC
  • βœ… chat-with-server.sh - Interactive chat interface
  • βœ… code-helper.sh - 7 code assistance modes
  • βœ… optimize_gpu.sh - Local GPU optimization (optional)
  • βœ… gpu_monitor.sh - Real-time GPU monitoring

Windows Server:

  • βœ… LM Studio support - Easy GUI setup
  • βœ… Ollama support - Command-line option
  • βœ… 32B+ model support - Run huge models
  • βœ… Auto-start configuration - Start on boot
  • βœ… Firewall configuration - One-command setup

πŸ“Š Performance

Before (Mac Mini Local):

  • Model size: Max 7B
  • Speed: 20-30 tok/sec
  • RAM usage: High (6-7GB)
  • Mac responsiveness: Slows down

After (Windows Server):

  • Model size: Up to 70B+
  • Speed: 40-60 tok/sec (32B model)
  • Mac RAM usage: Low (2-3GB)
  • Mac responsiveness: Always fast ✨

πŸ“š Documentation

Document                     Description
SETUP_GUIDE.md               Step-by-step setup instructions
WINDOWS_SERVER_SETUP.md      Complete Windows PC setup
LLM_SERVER_ARCHITECTURE.md   Architecture & use cases
GPU_OPTIMIZATION_GUIDE.md    Local GPU optimization
FINAL_SUMMARY.md             Complete project summary

πŸ› οΈ Requirements

Mac Mini:

  • macOS (any recent version)
  • 8GB+ RAM
  • Network connection to Windows PC

Windows PC:

  • Windows 10/11
  • 16GB+ RAM (32GB recommended)
  • LM Studio or Ollama installed

Network:

  • Both on same local network
  • Firewall allows port 1234 (LM Studio) or 11434 (Ollama)

πŸš€ Quick Commands

# Mac Mini:
~/connect-to-llm-server.sh     # First time setup
~/chat-with-server.sh          # Interactive chat
~/code-helper.sh               # Code assistance
source ~/.llm_server_config    # Load server settings

# Windows PC (PowerShell):
ipconfig                       # Get IP address
ollama serve                   # Start Ollama server
ollama list                    # List available models
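
# Note: by default Ollama binds to localhost only, so the Mac can't
# reach it over the LAN. Per Ollama's documented OLLAMA_HOST variable,
# bind to all interfaces before starting the server:
$env:OLLAMA_HOST = "0.0.0.0"
ollama serve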

πŸ› Troubleshooting

Can't connect to server?

On Windows PC:

# Check if server is running
curl http://localhost:1234/v1/models    # LM Studio
curl http://localhost:11434/api/tags    # Ollama

# Check firewall
New-NetFirewallRule -DisplayName "LM Studio" -Direction Inbound -LocalPort 1234 -Protocol TCP -Action Allow
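
# Running Ollama instead? Open its port (11434) the same way:
New-NetFirewallRule -DisplayName "Ollama" -Direction Inbound -LocalPort 11434 -Protocol TCP -Action Allow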

On Mac Mini:

# Test connection
ping YOUR_WINDOWS_IP
curl http://YOUR_WINDOWS_IP:1234/v1/models

# Reconfigure
~/connect-to-llm-server.sh

πŸ“¦ Recommended Models

For 32GB Windows PC:

Code Generation (Best):

  • qwen2.5-coder:32b - Best code quality
  • codellama:34b - Meta's coding model
  • deepseek-coder:33b - Strong reasoning

General Purpose:

  • llama3.1:70b - Huge context window
  • qwen2.5:72b - Top-tier reasoning
  • mixtral:8x22b - Fast multi-expert

Fast & Lightweight:

  • qwen2.5-coder:14b - Good balance
  • llama3.1:8b - Quick responses
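
If you serve with Ollama, any of these can be pulled by the tags above, for example:

ollama pull qwen2.5-coder:32b    # large download; expect tens of GB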

🎊 What This Gives You

Daily Workflow:

  1. Morning: Windows PC auto-starts LM Studio/Ollama
  2. Code: Mac Mini stays fast, all processing on Windows
  3. AI Help: Cmd+L in VS Code or ~/code-helper.sh in terminal
  4. Results: Better quality (32B models) + Faster Mac

Perfect For:

  • βœ… Full-stack developers
  • βœ… Anyone with limited Mac RAM
  • βœ… Teams sharing an LLM server
  • βœ… People who want the best AI quality

🀝 Contributing

Found a bug or have a suggestion? Open an issue or PR!


πŸ“„ License

MIT License - Use freely!


πŸ™ Credits

Built for developers who want the best of both worlds:

  • Fast, responsive Mac for coding
  • Powerful Windows PC for AI processing

Ready to supercharge your coding workflow? Start with SETUP_GUIDE.md! πŸš€
