Process bare metal job results by epompeii · Pull Request #680 · bencherdev/bencher

epompeii · 2026-02-26T06:51:14Z

This changeset add processing of bare metal job results.
The Report is created initially. Then once the job is completed, results are connected to the Report.

github-actions · 2026-02-26T06:53:06Z

🤖 Claude Code Review

PR: #680
Base: devel
Head: u/ep/job-results
Commit: 02352378efab7e76de5c5247b078228b02199c3c

Pull Request Review: Job Processing & Multi-Iteration Support

Overview

This PR (5 commits, 85 files, +4941/-1563) adds:

Multi-iteration job execution support (Iteration type, JsonIterationOutput)
New Processed job status (between Completed and Failed)
Shared WebSocket wire protocol types (RunnerMessage, ServerMessage, CloseReason)
Consolidated auth helpers (bearer_header, strip_bearer_token)
Comprehensive test expansion across all test suites

High Severity

1. Missing `#[cfg(feature = "plus")]` — will break `cargo check --no-default-features`

lib/bencher_schema/src/model/runner/job.rs

Three functions reference plus-only APIs (context.licensor, context.oci_storage(), PublicUser) but lack the feature gate:

QueryJob::process_results
reprocess_completed_jobs
reprocess_single_completed_job

The nearby mark_orphaned_completed_as_failed correctly has #[cfg(feature = "plus")], showing the pattern exists but was inconsistently applied. CLAUDE.md requires cargo check --no-default-features to pass.

2. `JobStatus` discriminant shift — existing DB rows will be misinterpreted without migration

lib/bencher_json/src/runner/job_status.rs

The integer mappings changed: Failed moved from 4→5, Canceled from 5→6, and Processed=4 was inserted. The migration 2026-02-27-120000_job_processed/up.sql correctly remaps values (UPDATE job SET status = 6 WHERE status = 5; UPDATE job SET status = 5 WHERE status = 4). Verify this migration runs before the new code is deployed — without it, existing Failed (4) rows would be read as Processed.

3. `JsonJobOutput` backwards compatibility broken

lib/bencher_json/src/runner/job.rs

The old flat format ({"exit_code":0,"stdout":"hello","stderr":"world","output":{...}}) was replaced with {"results":[...]}. The old backwards_compat_completed_json test now tests the new format, not the old one. If any old-format blobs exist in OCI storage, they will fail to deserialize. Either add a #[serde(untagged)] fallback deserializer or verify no old-format data exists in production.

Medium Severity

4. `allow_failure` with all iterations erroring silently succeeds

plus/bencher_runner/src/up/job.rs:108-119

When allow_failure=true and every iteration fails, the results vector is empty and the job is sent as RunnerMessage::Completed { results: [] }. This produces zero metrics rather than a failure. Consider at minimum documenting this behavior, or treating all-iterations-failed as an error even with allow_failure.

5. `handle_running` returns `Ack` for unexpected non-canceled states

plus/api_runners/src/jobs/websocket.rs:376-378

When the Running transition fails and the job is not canceled, the handler logs a warning and returns Ok(None) (which sends Ack). The runner will think the transition succeeded and proceed to execute. Consider returning an error response or ServerMessage::Failed instead.

6. `Iteration(0)` is allowed but semantically meaningless

lib/bencher_json/src/runner/job.rs, lib/bencher_json/src/project/report.rs

Iteration::from_str("0") succeeds. Callers guard with job.config.iter.map_or(1, ...) but the type itself permits zero. Consider validating iter >= 1 at the type/deserialization layer per CLAUDE.md guidance on strong types.

Low Severity

7. `write_conn!` used for read-only query

lib/bencher_schema/src/model/runner/job.rs:918 — reprocess_completed_jobs uses write_conn!(context) for a SELECT query. Should use auth_conn! to avoid unnecessarily acquiring the exclusive write lock.

8. Test doc comment mismatch

plus/api_runners/tests/jobs/websocket.rs:950 — Comment says "Output message with stdout only" but the test is channel_completed_with_stderr_only. Minor copy-paste error.

9. `last_exit_code` defaults to 0

plus/bencher_runner/src/up/job.rs:78 — If no iterations execute (e.g., immediate cancel), last_exit_code remains 0, suggesting success. Mitigated by the cancel path returning JobOutcome::Canceled.

10. Missing `#[typeshare]` / `JsonSchema` on WebSocket types

lib/bencher_json/src/runner/websocket.rs — RunnerMessage, ServerMessage, and CloseReason lack #[typeshare::typeshare] and JsonSchema derives. Fine if these are purely server/runner-side, but worth confirming they don't need TypeScript or OpenAPI representation.

Positive Observations

Excellent test coverage — The WebSocket test suite is comprehensive: auth, happy path, cancellation, race conditions (concurrent cancel during Running, idempotent duplicates), timeouts, token rotation, and multi-iteration variants.
Clean UpdateJob helper refactoring — terminate, set_status, heartbeat, start, claim, execute_if_status, execute_if_either_status reduce duplication and encapsulate TOCTOU-safe conditional UPDATEs.
Proper wire protocol consolidation — Moving RunnerMessage/ServerMessage/CloseReason into bencher_json eliminates client/server type drift.
Security improvements — strip_bearer_token uses case-insensitive comparison per RFC 7235; BLOCKED_ENV_VARS prevents LD_PRELOAD injection; WebSocket message size is bounded.
Good concurrency model — Conditional UPDATEs with status filters prevent TOCTOU races; heartbeat timeout safety nets catch disconnected runners.
CLAUDE.md compliance — thiserror throughout (no anyhow), #[expect] over #[allow], strong types (JobId, JobUuid, Iteration, DateTime), BTreeMap for deterministic serialization, camino paths, proper destructuring in TryFrom impls, deterministic time in tests via Clock::Custom and tokio::time::pause().
Migration quality — Correct reverse-order remapping to avoid collisions; report_head_index well-documented.

Action Items

Priority	Item
Must fix	Add `#[cfg(feature = "plus")]` to `process_results`, `reprocess_completed_jobs`, `reprocess_single_completed_job` (#1)
Must verify	Migration `2026-02-27-120000_job_processed` runs before deploy (#2)
Must verify	No old-format `JsonJobOutput` blobs exist in production OCI storage (#3)
Should fix	Handle unexpected state in `handle_running` more explicitly (#5)
Should fix	Fix test doc comment (#8)
Consider	Validate `Iteration >= 1` at type level (#6)
Consider	Document `allow_failure` + all-fail behavior (#4)
Consider	Use `auth_conn!` for read query (#7)

_{Model: claude-opus-4-6}

github-actions · 2026-02-26T13:13:15Z

Bencher Report

Branch	u/ep/job-results
Testbed	ubuntu-22.04

Click to view all benchmark results

Benchmark	Latency	Benchmark Result microseconds (µs) (Result Δ%)	Upper Boundary microseconds (µs) (Limit %)
Adapter::Json	📈 view plot 🚷 view threshold	3.75 µs (+14.94%) Baseline: 3.26 µs	4.42 µs (84.92%)
Adapter::Magic (JSON)	📈 view plot 🚷 view threshold	3.72 µs (+13.58%) Baseline: 3.27 µs	4.38 µs (85.00%)
Adapter::Magic (Rust)	📈 view plot 🚷 view threshold	25.75 µs (+2.14%) Baseline: 25.21 µs	31.28 µs (82.34%)
Adapter::Rust	📈 view plot 🚷 view threshold	2.99 µs (+6.96%) Baseline: 2.80 µs	3.49 µs (85.76%)
Adapter::RustBench	📈 view plot 🚷 view threshold	2.98 µs (+6.77%) Baseline: 2.80 µs	3.47 µs (85.95%)
head_version_insert/batch/10	📈 view plot 🚷 view threshold	106.63 µs (+11.21%) Baseline: 95.88 µs	128.79 µs (82.79%)
head_version_insert/batch/100	📈 view plot 🚷 view threshold	245.60 µs (+6.02%) Baseline: 231.66 µs	270.06 µs (90.94%)
head_version_insert/batch/255	📈 view plot 🚷 view threshold	469.51 µs (+2.89%) Baseline: 456.33 µs	500.96 µs (93.72%)
head_version_insert/batch/50	📈 view plot 🚷 view threshold	164.54 µs (+5.92%) Baseline: 155.35 µs	186.77 µs (88.10%)
threshold_query/join/10	📈 view plot 🚷 view threshold	154.21 µs (+11.25%) Baseline: 138.62 µs	174.67 µs (88.28%)
threshold_query/join/20	📈 view plot 🚷 view threshold	167.97 µs (+10.09%) Baseline: 152.57 µs	188.71 µs (89.01%)
threshold_query/join/5	📈 view plot 🚷 view threshold	143.25 µs (+9.22%) Baseline: 131.16 µs	163.95 µs (87.38%)
threshold_query/join/50	📈 view plot 🚷 view threshold	215.68 µs (+11.64%) Baseline: 193.19 µs	232.94 µs (92.59%)

🐰 View full continuous benchmarking report in Bencher

epompeii self-assigned this Feb 26, 2026

epompeii added claude deploy labels Feb 26, 2026

epompeii force-pushed the u/ep/job-results branch from a0485e0 to 252973f Compare February 27, 2026 05:32

epompeii added 2 commits February 27, 2026 18:35

job_results

2374439

job_processed

f99da57

epompeii force-pushed the u/ep/job-results branch from 252973f to f99da57 Compare February 28, 2026 04:46

epompeii marked this pull request as ready for review February 28, 2026 04:47

self_review

4436910

epompeii force-pushed the u/ep/job-results branch from 1ddb2a9 to 4436910 Compare February 28, 2026 05:54

claude_1

12a13bc

epompeii force-pushed the u/ep/job-results branch from 2d42d33 to 12a13bc Compare February 28, 2026 07:10

claude_2

0235237

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Process bare metal job results#680

Process bare metal job results#680
epompeii wants to merge 5 commits intodevelfrom
u/ep/job-results

epompeii commented Feb 26, 2026

Uh oh!

github-actions bot commented Feb 26, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Feb 26, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

epompeii commented Feb 26, 2026

Uh oh!

github-actions bot commented Feb 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🤖 Claude Code Review

Pull Request Review: Job Processing & Multi-Iteration Support

Overview

High Severity

1. Missing #[cfg(feature = "plus")] — will break cargo check --no-default-features

2. JobStatus discriminant shift — existing DB rows will be misinterpreted without migration

3. JsonJobOutput backwards compatibility broken

Medium Severity

4. allow_failure with all iterations erroring silently succeeds

5. handle_running returns Ack for unexpected non-canceled states

6. Iteration(0) is allowed but semantically meaningless

Low Severity

7. write_conn! used for read-only query

8. Test doc comment mismatch

9. last_exit_code defaults to 0

10. Missing #[typeshare] / JsonSchema on WebSocket types

Positive Observations

Action Items

Uh oh!

github-actions bot commented Feb 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Bencher Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions bot commented Feb 26, 2026 •

edited

Loading

1. Missing `#[cfg(feature = "plus")]` — will break `cargo check --no-default-features`

2. `JobStatus` discriminant shift — existing DB rows will be misinterpreted without migration

3. `JsonJobOutput` backwards compatibility broken

4. `allow_failure` with all iterations erroring silently succeeds

5. `handle_running` returns `Ack` for unexpected non-canceled states

6. `Iteration(0)` is allowed but semantically meaningless

7. `write_conn!` used for read-only query

9. `last_exit_code` defaults to 0

10. Missing `#[typeshare]` / `JsonSchema` on WebSocket types

github-actions bot commented Feb 26, 2026 •

edited

Loading