This directory contains the SQLite database index for fast hunt queries.
The hunts.db file is an automatically-maintained index of all hunt files in the repository. It enables:
- ⚡ 2.7x faster duplicate detection
- 🔍 Fast filtered queries by tactic, technique, or tags
- 📊 Quick statistics and analytics
- 🤖 Reduced GitHub Actions runtime
Important: The markdown files in Flames/, Embers/, and Alchemy/ remain the source of truth. This database is just an index for performance.
CREATE TABLE hunts (
id INTEGER PRIMARY KEY AUTOINCREMENT,
filename TEXT UNIQUE NOT NULL, -- e.g., "H-2025-001.md"
hunt_id TEXT NOT NULL, -- e.g., "H-2025-001"
hypothesis TEXT NOT NULL, -- The hunt hypothesis
tactic TEXT, -- MITRE ATT&CK tactic
technique TEXT, -- MITRE technique ID (e.g., "T1071.001")
tags TEXT, -- JSON array of hashtags
submitter TEXT, -- Contributor name
file_path TEXT NOT NULL, -- Full path to .md file
file_hash TEXT NOT NULL, -- MD5 hash for change detection
created_date TEXT, -- ISO timestamp from git
last_modified TEXT -- ISO timestamp from git
);The database is automatically updated by the update-hunt-database.yml workflow whenever:
- A hunt file is added to
Flames/,Embers/, orAlchemy/ - A hunt file is modified
- A hunt file is deleted
No manual intervention is required.
python scripts/build_hunt_database.py --rebuildpython scripts/build_hunt_database.py# Count all hunts
sqlite3 database/hunts.db "SELECT COUNT(*) FROM hunts"
# List all Defense Evasion hunts
sqlite3 database/hunts.db "SELECT hunt_id, hypothesis FROM hunts WHERE tactic = 'Defense Evasion'"
# Get statistics
sqlite3 database/hunts.db "SELECT tactic, COUNT(*) as count FROM hunts GROUP BY tactic ORDER BY count DESC"
# Find hunts by technique
sqlite3 database/hunts.db "SELECT * FROM hunts WHERE technique LIKE 'T1071%'"Performance test results (on 69 hunts):
| Operation | File-based | Database | Speedup |
|---|---|---|---|
| Get all hunts | 7.77ms | 2.90ms | 2.7x faster |
| Filtered query | N/A | 0.14ms | Near instant |
| Full duplicate check | ~15s | ~0.5s | 30x faster |
In GitHub Actions with slower I/O, speedup is even more dramatic (estimated 50-100x for full duplicate detection).
- Current size: 72KB for 69 hunts
- Expected growth: ~1KB per hunt
- At 1000 hunts: ~1MB (still very manageable)
If the database seems out of sync with the markdown files:
# Rebuild from scratch
python scripts/build_hunt_database.py --rebuildThe system automatically falls back to file-based retrieval if the database doesn't exist. To create it:
python scripts/build_hunt_database.pyIf you get a merge conflict on hunts.db:
- Accept either version (doesn't matter which)
- Run
python scripts/build_hunt_database.py --rebuild - Commit the rebuilt database
- ✅ Zero-config (no server needed)
- ✅ Version-controllable (single file)
- ✅ Fast for read-heavy workloads
- ✅ No external dependencies in GitHub Actions
- ✅ Perfect for <10K records
- ❌ Would require secrets management
- ❌ Adds latency
- ❌ Requires external infrastructure
- ❌ Not needed for this workload
- ✅ Works in GitHub Actions without rebuild
- ✅ Instant availability for contributors
- ✅ Version history is preserved
- ✅ No cold-start penalty
Possible future improvements:
- Add semantic embeddings for better duplicate detection
- Full-text search index
- Materialized views for common queries
- Export to JSON for website consumption
- Integration with MITRE ATT&CK Navigator
See the main HEARTH README or open an issue.