
Commit 93d580f

docs: add synthetic benchmarking plan for library scanning performance
- Create task-164 for implementing benchmark suite with targets:
  - Initial import ~41k tracks: <5 min (stretch <60s)
  - No-op rescan: <10s
  - Incremental rescan: proportional to changes
- Update task-012 to depend on task-164 (benchmark before optimize)
- Add comprehensive docs/benchmark.md covering:
  - Current architecture analysis and bottlenecks
  - Required 2-phase scanning with fingerprint storage
  - Synthetic dataset strategies (shape-only, clone-based, pathological)
  - Benchmark scenarios and tooling design
  - Taskfile integration plan
  - Safety guarantees and result interpretation
1 parent ed98805 commit 93d580f

3 files changed

Lines changed: 626 additions & 4 deletions

File tree

backlog/tasks/task-012 - Implement-performance-optimizations.md

Lines changed: 12 additions & 4 deletions
@@ -4,16 +4,24 @@ title: Implement performance optimizations
 status: In Progress
 assignee: []
 created_date: '2025-09-17 04:10'
-updated_date: '2026-01-16 22:22'
+updated_date: '2026-01-17 10:30'
 labels: []
-dependencies: []
-ordinal: 27500
+dependencies:
+  - task-164
+ordinal: 12250
 ---
 
 ## Description
 
 <!-- SECTION:DESCRIPTION:BEGIN -->
-Optimize directory traversal, database operations, and network caching for better performance
+Optimize directory traversal, database operations, and network caching for better performance.
+
+**IMPORTANT**: Before implementing optimizations, complete task-164 (synthetic benchmarking) to establish baselines and validate that proposed changes actually improve performance. Premature optimization without measurement is risky for a 267GB / 41k track library.
+
+Performance targets (from benchmarking):
+- Initial import of ~41k tracks: < 5 minutes (stretch: < 60s)
+- No-op rescan (unchanged library): < 10s
+- Incremental rescan (1% delta): proportional to changes
 <!-- SECTION:DESCRIPTION:END -->
 
 ## Acceptance Criteria
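The benchmark metrics this task depends on (per-phase times, counts, throughput, peak RSS, JSON output) can be sketched with only the standard library. This is a minimal illustration, not the project's actual `bench_scan.py`; `timed_phase` and `peak_rss_mb` are hypothetical helper names:

```python
import json
import resource
import sys
import time


def timed_phase(name: str, fn, results: dict) -> None:
    """Run one scan phase, recording wall time, item count, and throughput."""
    start = time.perf_counter()
    count = fn()
    elapsed = time.perf_counter() - start
    results[name] = {
        "seconds": round(elapsed, 4),
        "items": count,
        "items_per_sec": round(count / elapsed, 1) if elapsed > 0 else None,
    }


def peak_rss_mb() -> float:
    """Peak resident set size of this process, in MiB."""
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS, kilobytes on Linux.
    return rss / (1024 * 1024) if sys.platform == "darwin" else rss / 1024


results: dict = {}
timed_phase("walk", lambda: 41000, results)  # placeholder for a real walk phase
results["peak_rss_mb"] = round(peak_rss_mb(), 1)
print(json.dumps(results, indent=2))
```

Emitting one JSON object per run makes it easy to diff results before and after each optimization, which is exactly the "benchmark before optimize" ordering this dependency enforces.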
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+---
+id: task-164
+title: Implement synthetic library benchmarking for scan performance
+status: In Progress
+assignee: []
+created_date: '2026-01-17 10:29'
+updated_date: '2026-01-17 10:32'
+labels:
+  - performance
+  - testing
+  - scanning
+dependencies: []
+priority: high
+ordinal: 6125
+---
+
+## Description
+
+<!-- SECTION:DESCRIPTION:BEGIN -->
+Create a benchmarking suite to measure and validate library scanning performance before optimizing. The benchmark must prove the architecture can meet targets:
+- Initial import of ~41k tracks: < 5 minutes (stretch: < 60s)
+- No-op rescan (unchanged library): < 10s
+- Incremental rescan (1% delta): proportional to changes
+
+Key architectural requirement: scanning must use a 2-phase approach:
+1. Phase 1 (inventory): walk + stat (mtime_ns, size) + DB diff — no tag parsing
+2. Phase 2 (parse delta): mutagen only for added/changed files
+
+This requires storing fingerprints (file_mtime_ns, file_size) in the library table.
+
+Benchmarking approach:
+- Dataset A (shape-only): 41k tiny files for traversal/DB stress testing
+- Dataset B (clone-based): APFS clones of ~400 real seed files to 41k paths for realistic mutagen timing
+- Dataset C (pathological): edge cases (2k+ files in one dir, deep nesting, corrupt files, unicode)
+
+Scenarios to benchmark:
+1. Initial import (fresh DB)
+2. No-op rescan (same DB, no changes)
+3. Delta rescan (add 200, touch 200, delete 10)
+
+See docs/benchmark.md for comprehensive planning details.
+<!-- SECTION:DESCRIPTION:END -->
+
+## Acceptance Criteria
+<!-- AC:BEGIN -->
+- [ ] #1 tests/bench/ directory created with benchmark scripts
+- [ ] #2 make_synth_library.py generates Dataset A (shape) and Dataset B (clone) libraries
+- [ ] #3 bench_scan.py measures walk, stat, DB diff, parse, and DB write phases separately
+- [ ] #4 Taskfile tasks added: bench:make:shape, bench:make:clone, bench:scan:initial, bench:scan:noop, bench:scan:delta, bench:scan:full
+- [ ] #5 All benchmarks use isolated DB path (/tmp/mt-bench/mt.db) - never touch production DB
+- [ ] #6 Benchmark outputs JSON + human-readable metrics: times, counts, throughput, peak RSS
+- [ ] #7 library table schema updated with file_mtime_ns column for fingerprint storage
+- [ ] #8 Optional: bench:zig:walk task to compare Zig traversal ceiling vs Python
+<!-- AC:END -->
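The Taskfile wiring named in AC #4 might look roughly like the following go-task fragment. The task names and script paths come from the acceptance criteria; the CLI flags (`--mode`, `--out`, `--library`, `--db`, `--scenario`) are assumptions about the eventual scripts, not an existing interface:

```yaml
version: '3'

tasks:
  bench:make:shape:
    desc: Generate Dataset A (41k tiny shape-only files)
    cmds:
      - python tests/bench/make_synth_library.py --mode shape --out /tmp/mt-bench/shape

  bench:scan:noop:
    desc: Rescan an unchanged library (target < 10s)
    cmds:
      - python tests/bench/bench_scan.py --library /tmp/mt-bench/shape --db /tmp/mt-bench/mt.db --scenario noop
```

Keeping every task pointed at /tmp/mt-bench/ gives the isolation guarantee of AC #5 for free: no task ever receives the production DB path.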
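The 2-phase approach described in the task can be sketched in plain stdlib Python. The table and column names (`library`, `file_mtime_ns`, `file_size`) come from the task itself; `phase1_inventory` is a hypothetical function name, and Phase 2 (mutagen parsing of the delta) is only indicated by a comment:

```python
import os
import sqlite3
import tempfile


def phase1_inventory(root: str, db: sqlite3.Connection):
    """Phase 1 (inventory): walk + stat + DB diff, no tag parsing.

    Classifies files as added/changed/deleted purely from (mtime_ns, size)
    fingerprints, so a no-op rescan never opens a single audio file."""
    known = {
        path: (mtime_ns, size)
        for path, mtime_ns, size in db.execute(
            "SELECT path, file_mtime_ns, file_size FROM library"
        )
    }
    added, changed, seen = [], [], set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            seen.add(path)
            if path not in known:
                added.append(path)
            elif known[path] != (st.st_mtime_ns, st.st_size):
                changed.append(path)
    deleted = sorted(p for p in known if p not in seen)
    return added, changed, deleted


# Self-check against a throwaway one-file "library" and an empty DB:
with tempfile.TemporaryDirectory() as root:
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE library ("
        "path TEXT PRIMARY KEY, file_mtime_ns INTEGER, file_size INTEGER)"
    )
    with open(os.path.join(root, "a.mp3"), "wb") as f:
        f.write(b"\x00" * 64)
    added, changed, deleted = phase1_inventory(root, db)
    # Phase 2 would run mutagen over added + changed only.
```

With fingerprints stored, the no-op rescan cost is one walk, one stat per file, and one indexed lookup per path, which is what makes the < 10s target plausible at 41k tracks.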
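Dataset A from the benchmarking approach above (shape-only: many tiny files to stress traversal and DB code, never parsed) could be generated along these lines. The Artist/Album/Track layout and `make_shape_dataset` name are illustrative, not the real `make_synth_library.py`:

```python
import os
import tempfile


def make_shape_dataset(root: str, n_tracks: int = 41_000, per_dir: int = 15) -> int:
    """Dataset A (shape-only): tiny placeholder files in an Artist/Album/Track
    layout. Contents are junk bytes, since Phase 1 never opens the files."""
    written, artist, album = 0, 0, 0
    while written < n_tracks:
        d = os.path.join(root, f"Artist {artist:04d}", f"Album {album:02d}")
        os.makedirs(d, exist_ok=True)
        for track in range(min(per_dir, n_tracks - written)):
            with open(os.path.join(d, f"{track:02d} Track.mp3"), "wb") as f:
                f.write(b"\x00" * 32)  # tiny payload; shape is all that matters
            written += 1
        album += 1
        if album == 10:  # assume ~10 albums per artist, then move on
            album, artist = 0, artist + 1
    return written


# Generate a scaled-down library and verify the file count:
with tempfile.TemporaryDirectory() as root:
    made = make_shape_dataset(root, n_tracks=100)
    found = sum(len(files) for _, _, files in os.walk(root))
```

Dataset B would instead clone ~400 real seed files out to 41k paths; on macOS, `cp -c src dst` performs an APFS clonefile copy, so each of the 41k copies is near-free in both space and time while still giving mutagen real tags to parse.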
