The benchmark pipeline (asap-tools/execution-utilities/benchmark/) currently uses Arroyo's single_file_custom connector to replay local dataset files (ClickBench, H2O) through the sketch pipeline. Since precompute_engine is intended to replace Arroyo + asap-summary-ingest, the benchmark needs an equivalent way to feed file data into it.
precompute_engine currently accepts data only via two HTTP endpoints:
POST /api/v1/write — Prometheus remote write (Snappy + Protobuf)
POST /api/v1/import — VictoriaMetrics remote write (Zstd + Protobuf)
There is no way to point the engine at a local JSON or CSV file and have it ingest the data directly. Running the benchmark against precompute_engine therefore requires an out-of-process HTTP sender that reads the file, re-encodes each row into one of the wire formats above, and POSTs it to the engine. That adds external dependencies and an HTTP round trip that the Arroyo file connector did not need.
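To make the cost of that workaround concrete, here is a minimal sketch of the row re-encoding step an external sender would have to perform. It is illustrative only: the function names, the column names, and the sample tuple shape are hypothetical and not taken from precompute_engine or the benchmark code. The actual Protobuf serialization, Snappy/Zstd compression, and HTTP POST are deliberately left out, since they require third-party libraries.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical in-memory sample shape: (labels, timestamp_ms, value).
Sample = Tuple[Dict[str, str], int, float]


def rows_to_samples(
    lines: Iterable[str],
    metric_name: str,
    timestamp_col: str,
    value_col: str,
    label_cols: List[str],
) -> Iterator[Sample]:
    """Convert CSV rows into remote-write-style samples.

    All column names are supplied by the caller; nothing here reflects
    the real ClickBench or H2O schemas.
    """
    for row in csv.DictReader(lines):
        # Prometheus convention: the metric name is the "__name__" label.
        labels = {"__name__": metric_name}
        for col in label_cols:
            labels[col] = row[col]
        yield labels, int(row[timestamp_col]), float(row[value_col])


def batches(samples: Iterable[Sample], size: int) -> Iterator[List[Sample]]:
    """Group samples into lists of at most `size`, one list per HTTP request."""
    batch: List[Sample] = []
    for sample in samples:
        batch.append(sample)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


if __name__ == "__main__":
    # Tiny inline dataset standing in for a local benchmark file.
    data = io.StringIO("ts,region,latency\n1000,us,1.5\n2000,eu,2.5\n")
    samples = rows_to_samples(data, "latency", "ts", "latency", ["region"])
    for batch in batches(samples, 1000):
        # A real sender would serialize, compress, and POST each batch here.
        print(len(batch), batch[0])
```

Every batch produced this way would still need to be serialized, compressed, sent over HTTP, and then decoded again by the engine, which is the overhead a built-in file source would avoid.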
Prerequisite: #300