A high-performance, production-grade traffic simulation infrastructure that replays historical load patterns from production environments with 93% precision for backend testing.
This tool automates reading large volumes of historical raw logs, staging them efficiently, parsing them into normalized entities, and running controlled traffic simulation workloads against production-like cloud databases.
- Logs Source: Ingests raw query data files (CSV, JSON, LOG).
- File Reader: Parses and streams raw records sequentially.
- MongoDB Storage: Stores unstructured raw records for fast staging.
- Parser Pipeline: Normalizes staging entities into predefined structural formats (`ParsedQuery`).
- Traffic Simulator: Controls precise load timing to replay traffic patterns realistically.
- Formatter & Runner: Builds and streams finalized data blocks directly into Google BigQuery.
- Consul Integration: Manages active system configurations and process checkpoint states.
The system includes built-in metric collection pipelines connected directly to Datadog for full end-to-end operational visibility.
Metrics monitored include:
- Real-time processing latency and query median duration (`loadtool.simulated.realtime.median`).
- Pipeline ingest success rates (`loadtool.log.success`).
- Total record throughput successfully delivered to target endpoints (`loadtool.records.sent`).
- Language: Go (Golang)
- Databases: MongoDB, Google BigQuery
- Infrastructure & DevOps: Docker, Kubernetes, Consul
- Monitoring: Datadog