[Batch 6] Performance benchmark harness + regression detection #393

@MichaelFisher1997

Summary

Create an automated benchmark harness that loads a world, follows a fixed camera path, and records frame time metrics across all presets. Integrate into CI for performance regression detection.

Depends on: #385 (all presets must be defined before benchmarking)

Requirements

Benchmark Tool

  • New executable target: zig build benchmark
  • Loads world at seed 12345 (deterministic terrain)
  • Flies camera along a fixed path (waypoints)
  • Records per frame: GPU time, CPU time, draw calls, vertex count, chunk count
  • Runs for N seconds (configurable, default 60s)
  • Outputs: JSON results file

Fixed Camera Path

```zig
const BENCH_PATH = [_]Waypoint{
    .{ .pos = Vec3.init(8, 100, 8), .look = Vec3.init(1, 0, 0), .duration = 5.0 },
    .{ .pos = Vec3.init(200, 150, 200), .look = Vec3.init(0, -0.3, 1), .duration = 10.0 },
    .{ .pos = Vec3.init(-500, 80, 300), .look = Vec3.init(1, 0, -1), .duration = 10.0 },
    // ... covers multiple biomes, directions, elevations
};
```

  • Interpolate between waypoints (smooth camera movement)
  • Covers: plains, mountains, ocean, forest, desert biomes
  • Covers: looking up, looking down, spinning
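
One way the waypoint interpolation could look, as a minimal sketch only. The `Vec3` here is a hypothetical stand-in for the engine's own type, `CameraPose`, `smoothstep`, and `sampleCameraPath` are illustrative names, and the sketch assumes `duration` is the travel time in seconds from one waypoint to the next (the issue does not specify this). It also assumes Zig 0.11 or later.

```zig
const std = @import("std");

// Hypothetical stand-in for the engine's Vec3; the real type may differ.
const Vec3 = struct {
    x: f32,
    y: f32,
    z: f32,

    fn init(x: f32, y: f32, z: f32) Vec3 {
        return .{ .x = x, .y = y, .z = z };
    }

    // Component-wise linear interpolation between a and b.
    fn lerp(a: Vec3, b: Vec3, t: f32) Vec3 {
        return Vec3.init(
            a.x + (b.x - a.x) * t,
            a.y + (b.y - a.y) * t,
            a.z + (b.z - a.z) * t,
        );
    }
};

const Waypoint = struct { pos: Vec3, look: Vec3, duration: f32 };

const CameraPose = struct { pos: Vec3, look: Vec3 };

// Smoothstep easing: zero velocity at both ends of a segment.
fn smoothstep(t: f32) f32 {
    return t * t * (3.0 - 2.0 * t);
}

/// Returns the camera pose `elapsed` seconds into the path.
/// Assumption: `duration` is the travel time from a waypoint to the next one,
/// so the last waypoint's duration is unused and the camera holds there.
/// The interpolated look vector is not normalized; callers should do that.
fn sampleCameraPath(path: []const Waypoint, elapsed: f32) CameraPose {
    std.debug.assert(path.len > 0);
    var remaining: f32 = @max(elapsed, 0.0);
    var i: usize = 0;
    while (i + 1 < path.len) : (i += 1) {
        const seg = path[i];
        if (remaining < seg.duration) {
            const t = smoothstep(remaining / seg.duration);
            return .{
                .pos = Vec3.lerp(seg.pos, path[i + 1].pos, t),
                .look = Vec3.lerp(seg.look, path[i + 1].look, t),
            };
        }
        remaining -= seg.duration;
    }
    const last = path[path.len - 1];
    return .{ .pos = last.pos, .look = last.look };
}
```

Because the pose depends only on elapsed time, sampling it from a fixed timestep rather than wall-clock time would make every run visit the same camera positions, which should help reproducibility.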

Metrics Collection

```json
{
  "preset": "high",
  "render_distance": 100,
  "frames": 3600,
  "duration_s": 60.0,
  "fps": {
    "min": 42,
    "avg": 67,
    "max": 120,
    "p1": 45,
    "p5": 51,
    "p50": 68,
    "p95": 89,
    "p99": 102
  },
  "gpu_ms": {
    "shadow_avg": 2.1,
    "opaque_avg": 4.3,
    "total_avg": 12.8
  },
  "draw_calls_avg": 2340,
  "vertices_avg": 2100000,
  "chunks_rendered_avg": 890
}
```

CI Integration

  • New workflow: .github/workflows/benchmark.yml
  • Triggers on: push to dev, manual dispatch
  • Runs benchmark at LOW, MEDIUM, HIGH presets
  • Compares against baseline (stored in repo or artifact)
  • Posts results as PR comment or commit status
  • Regression thresholds: an FPS drop of 10% or more relative to baseline is a warning; a drop of 20% or more is a failure
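
The thresholds could be applied with a small classifier like the sketch below. `Verdict` and `classifyFpsChange` are hypothetical names, and treating the thresholds as inclusive (a drop of exactly 10% already warns) is an assumption; the issue does not say which FPS statistic (average, p1, and so on) is compared.

```zig
const std = @import("std");

const Verdict = enum { pass, warning, failure };

// Thresholds from the issue, expressed as fractions of the baseline FPS.
const WARN_DROP: f64 = 0.10;
const FAIL_DROP: f64 = 0.20;

// Classifies a new FPS value against its baseline.
// Assumption: thresholds are inclusive. Improvements always pass.
fn classifyFpsChange(baseline_fps: f64, new_fps: f64) Verdict {
    std.debug.assert(baseline_fps > 0);
    const drop = (baseline_fps - new_fps) / baseline_fps;
    if (drop >= FAIL_DROP) return .failure;
    if (drop >= WARN_DROP) return .warning;
    return .pass;
}
```

For example, against a baseline of 100 FPS, a result of 85 FPS would be a warning and 75 FPS a failure, while 110 FPS passes.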

Baseline Management

  • Store baseline JSON in docs/benchmarks/baseline.json
  • Update baseline when intentional performance changes land
  • Compare: scripts/compare_benchmarks.sh baseline.json new.json

Implementation Plan

Step 1: Benchmark executable

  • src/benchmark.zig: main entry, world setup, camera path
  • Reuse existing -Dsmoke-test infrastructure for headless world loading
  • Add waypoint system with smooth interpolation
  • Run fixed duration, collect metrics

Step 2: Metrics collector

  • Wrap existing GPU timing infrastructure (rhi_timing.zig)
  • Accumulate frame times in arrays
  • Compute percentiles and averages at end
  • Write JSON output
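
The percentile and average step might be sketched as follows. This is not tied to rhi_timing.zig; `Stats`, `percentile`, and `summarize` are hypothetical names, and the nearest-rank percentile definition is an assumption (other definitions interpolate between samples). The sketch assumes Zig 0.11 or later for `std.mem.sort`.

```zig
const std = @import("std");

// Field names mirror the "fps" object in the example JSON.
const Stats = struct {
    min: f64,
    avg: f64,
    max: f64,
    p1: f64,
    p5: f64,
    p50: f64,
    p95: f64,
    p99: f64,
};

// Nearest-rank percentile on an ascending-sorted slice; `p` is in [0, 100].
fn percentile(sorted: []const f64, p: f64) f64 {
    std.debug.assert(sorted.len > 0);
    const n: f64 = @floatFromInt(sorted.len);
    const rank = @ceil(p / 100.0 * n);
    const idx: usize = if (rank < 1.0) 0 else @as(usize, @intFromFloat(rank)) - 1;
    return sorted[@min(idx, sorted.len - 1)];
}

// Sorts `samples` in place, then summarizes them.
fn summarize(samples: []f64) Stats {
    std.debug.assert(samples.len > 0);
    std.mem.sort(f64, samples, {}, std.sort.asc(f64));
    var sum: f64 = 0;
    for (samples) |s| sum += s;
    const n: f64 = @floatFromInt(samples.len);
    return .{
        .min = samples[0],
        .avg = sum / n,
        .max = samples[samples.len - 1],
        .p1 = percentile(samples, 1),
        .p5 = percentile(samples, 5),
        .p50 = percentile(samples, 50),
        .p95 = percentile(samples, 95),
        .p99 = percentile(samples, 99),
    };
}
```

Sorting once at the end keeps the per-frame cost to a single array append, so the collector itself should add little overhead to the measured frames.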

Step 3: Script runner

  • scripts/run_benchmark.sh: runs benchmark across presets, collects results
  • Configurable: duration, seed, presets to test
  • Output: one JSON per preset, summary report

Step 4: Comparison tool

  • scripts/compare_benchmarks.sh: diffs two benchmark JSON files
  • Prints regression/improvement per metric
  • Exit code: 0 if no regression, 1 if any regression exceeds the failure threshold
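
The comparison logic, independent of how the script parses the JSON, could be sketched as below. The issue plans this tool as a shell script, so the Zig here only illustrates the logic. `Direction`, `Metric`, `improvement`, and `exitCode` are hypothetical names, and applying one threshold to every metric is an assumption, since the issue only defines thresholds for FPS. The point of the sketch is that the sign of a change depends on the metric: higher FPS is better, while higher `gpu_ms` is worse.

```zig
const std = @import("std");

const Direction = enum { higher_is_better, lower_is_better };

const Metric = struct {
    name: []const u8,
    direction: Direction,
    baseline: f64,
    current: f64,
};

// Signed change as a fraction of baseline; positive means improvement.
fn improvement(m: Metric) f64 {
    std.debug.assert(m.baseline > 0);
    const delta = (m.current - m.baseline) / m.baseline;
    return switch (m.direction) {
        .higher_is_better => delta,
        .lower_is_better => -delta,
    };
}

// Returns 0 if no metric regresses by more than `threshold`, else 1.
fn exitCode(metrics: []const Metric, threshold: f64) u8 {
    for (metrics) |m| {
        if (improvement(m) < -threshold) return 1;
    }
    return 0;
}
```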

Step 5: CI workflow

  • Runs on ubuntu-latest with a Vulkan software renderer (or skips GPU tests)
  • Alternatively, runs on a self-hosted GPU runner
  • Uploads results as an artifact

Files to Create

  • src/benchmark.zig — benchmark executable
  • scripts/run_benchmark.sh — runner script
  • scripts/compare_benchmarks.sh — comparison tool
  • docs/benchmarks/baseline.json — initial baseline

Files to Modify

  • build.zig — add benchmark build target
  • .github/workflows/benchmark.yml — CI workflow (optional, can be follow-up)

Testing

  • Benchmark runs to completion at each preset
  • JSON output is valid and contains all metrics
  • Comparison tool detects regressions correctly
  • Results are reproducible (within ±5% across runs on the same hardware)
  • CI workflow runs without crashing

Roadmap: docs/PERFORMANCE_ROADMAP.md — Batch 6, Polish

Labels: batch-6, bug, build, ci, documentation, enhancement, hotfix