AI Performance Benchmark Tool

A professional-grade benchmarking suite designed to validate the full AI capabilities of Apple Silicon (M1/M2/M3/M4). It unifies CPU, GPU (Metal), and NPU (Neural Engine) testing into a single tool.

This script implements advanced engineering strategies like Cache-Resident Inference and INT8 Weight Quantization to measure the true potential of the Apple Neural Engine (ANE).

🚀 Key Features

Full Spectrum Analysis:
- CPU (NumPy): Measures raw floating-point performance (GFLOPS).
- GPU (Metal MPS): Tests Compute (FP32) and Inference (FP16) throughput.
- NPU (CoreML): Utilizes coremltools to bypass Python bottlenecks and access the Neural Engine directly.
Advanced NPU Strategies:
- Cache-Resident Model: Uses deep layers with small tensors (32x32) to prevent RAM bottlenecks and saturate the NPU's internal SRAM.
- INT8 Quantization: Applies linear weight quantization to unlock the Neural Engine's acceleration logic.
Hardware Detection: Auto-detects Physical/Logical cores, GPU Core count, and NPU driver status.

Prerequisites

Python 3.10 or 3.11 (Required for TensorFlow/PyTorch compatibility on macOS ARM64).
Architecture: ARM64 (Apple Silicon) or x86_64.

Installation & Setup

Create a clean virtual environment:

# Verify you are using Python 3.10 or 3.11
python3.10 -m venv venv

# Activate the environment
source venv/bin/activate

Install dependencies: Note: We strictly require numpy<2 to prevent conflicts with TensorFlow.
```
pip install --upgrade pip setuptools wheel
pip install -r requirements.txt
```

Usage

Run the script directly from your terminal:

python benchmark-ai.py

Understanding the Output

CPU Baseline (GFLOPS): Standard floating-point performance on the processor.

FP32 (TFLOPS): Raw GPU Compute power. High precision, used for training or scientific calc.

FP16 (TOPS): AI Inference power. Lower precision, faster speed. This metric aligns closer with NPU/Neural Engine marketing specs.

📝 Example Result (Apple M4 Pro)

================================================================================
🚀  AI BENCHMARK PRO
================================================================================
OS: Darwin 24.6.0 | RAM: 24.0 GB
CPU: arm (12 Physical / 12 Logical)
GPU: MPS (16 Cores) | NPU: Enabled
--------------------------------------------------------------------------------
[1] CPU BASELINE (FP32)... 340.12 GFLOPS
[2] GPU METAL (FP16)...... 7.83 TOPS
[3] NPU NEURAL (FP16)..... 14.34 TOPS
[4] NPU NEURAL (INT8)..... 18.23 TOPS

================================================================================
🏆  INFORME TÉCNICO DE RENDIMIENTO
================================================================================
• CPU (General Processing):      340.12 GFLOPS
• GPU (Graphics / Basic AI):     7.83 TOPS
• NPU (High Precision AI):       14.34 TOPS
• NPU (Quantized AI W8A16):      18.23 TOPS
--------------------------------------------------------------------------------
NOTE: The ~18.23 TOPS result represents ~50% of the theoretical peak.
This is the maximum speed achievable without a calibration dataset (W8A16 mode).
To reach full TOPS (W8A8), a real-world trained model with activation
quantization is required.
================================================================================

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
README_ES.md		README_ES.md
benchmark-ai.py		benchmark-ai.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

AI Performance Benchmark Tool

🚀 Key Features

Prerequisites

Installation & Setup

Usage

Understanding the Output

📝 Example Result (Apple M4 Pro)

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

License

CodeGeekR/benchmark-ai-tops

Folders and files

Latest commit

History

Repository files navigation

AI Performance Benchmark Tool

🚀 Key Features

Prerequisites

Installation & Setup

Usage

Understanding the Output

📝 Example Result (Apple M4 Pro)

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages