Releases · blooop/bencher · GitHub

27 Apr 17:17

v1.98.0 Latest

Added

aggregate, agg_fn, and repeats parameters for optimize(), matching the plot_sweep() API. Aggregated dimensions are looped inside the Optuna objective so the optimizer sees robust metrics (e.g. mean loss across seeds or repeated boolean outcomes).
AGG_FN_MAP in bencher/utils.py — NaN-safe numpy aggregation functions for objective-level aggregation.
Example example_optimize_aggregate.py demonstrating sweep-then-optimize with dimension aggregation and repeats.
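As a rough illustration of objective-level aggregation, the sketch below loops over an aggregated dimension (seeds) and repeats inside a single objective call and reduces the samples with a NaN-safe function. The names `AGG_FNS`, `noisy_loss`, and `aggregated_objective` are hypothetical stand-ins; they are not bencher's `AGG_FN_MAP` or the internals of `optimize()`.

```python
import numpy as np

# Hypothetical name -> NaN-safe reducer table (an assumption, not AGG_FN_MAP itself).
AGG_FNS = {
    "mean": np.nanmean,
    "sum": np.nansum,
    "min": np.nanmin,
    "max": np.nanmax,
}


def noisy_loss(x: float, seed: int, repeat: int) -> float:
    """Toy benchmark: quadratic loss plus small noise; seed 2 simulates a failed sample."""
    if seed == 2:
        return float("nan")
    rng = np.random.default_rng([seed, repeat])
    return (x - 1.0) ** 2 + 0.01 * rng.standard_normal()


def aggregated_objective(x: float, seeds=(0, 1, 2, 3), agg_fn: str = "mean", repeats: int = 2) -> float:
    """Evaluate every seed and repeat, then reduce to the one value the optimizer sees."""
    samples = [noisy_loss(x, s, r) for s in seeds for r in range(repeats)]
    return float(AGG_FNS[agg_fn](samples))


# The failed seed does not poison the aggregate, unlike a plain np.mean.
robust = aggregated_objective(1.0)
naive = float(np.mean([noisy_loss(1.0, s, 0) for s in (0, 1, 2, 3)]))
```

Here `robust` is a finite number while `naive` is NaN, which is the difference the `np.mean` → `np.nanmean` fix in this release addresses.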

Fixed

Missing skipna=True on REDUCE and MINMAX repeat aggregation in bench_result_base.py.
np.mean → np.nanmean in optuna_result.py aggregation to match xarray's NaN-safe behavior.

27 Apr 15:37

v1.97.0

Fixed

aggregate=True no longer duplicates pane-type results (rerun, image, video). Pane results store file paths that cannot be numerically aggregated, so they now only render in the non-aggregated view.
Line plotter crash when aggregating: plt_cnt_cfg still referenced collapsed dimensions, causing holoviews DataError on missing dimension names. Swapped to post-aggregation config during map_plot_panes calls.
remove_plots no longer raises ValueError when combined with numeric_only.

Changed

Renamed VideoResult to PaneResult to reflect that it handles all pane types (rerun, image, video), not just video.

Added

Image and video aggregate examples (example_result_image_aggregate, example_result_video_aggregate) to exercise and demonstrate pane-result aggregation.
omega_n sweep added to ControlSystemSweep for multi-input rerun testing.

26 Apr 19:26

v1.96.0

Fixed

Rerun viewer panes now work in saved HTML reports (show="html" / ShowMode.HTML). Previously the viewer failed because browsers block fetch() from file:// origins. The .rrd data is now base64-encoded directly into the viewer HTML page and loaded via the rerun open_channel() / send_rrd() API, bypassing the fetch entirely.
Multi-tab reports with rerun panes: tab files in _tabs/ now correctly reference ../_rrd/ instead of _rrd/, fixing broken relative paths.
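The two fixes can be sketched roughly as follows. This is a minimal illustration, not bencher's code: `embed_rrd_as_base64` and `rrd_href` are hypothetical helper names, and the step that hands the decoded bytes to the rerun viewer is deliberately left out because its exact call shape is not given in these notes.

```python
import base64
import posixpath


def embed_rrd_as_base64(rrd_bytes: bytes) -> str:
    """Return a <script> snippet carrying the recording inline, so no fetch() is needed."""
    payload = base64.b64encode(rrd_bytes).decode("ascii")
    return (
        "<script>\n"
        f'const rrdBase64 = "{payload}";\n'
        "// Decode to bytes in the browser; this works from file:// origins.\n"
        "const rrdBytes = Uint8Array.from(atob(rrdBase64), c => c.charCodeAt(0));\n"
        "// Passing rrdBytes to the viewer is omitted in this sketch.\n"
        "</script>"
    )


def rrd_href(page_dir: str, rrd_dir: str, name: str) -> str:
    """Build a link to a recording relative to the directory of the page that references it."""
    return posixpath.join(posixpath.relpath(rrd_dir, page_dir), name)


snippet = embed_rrd_as_base64(b"\x00\x01example")
top_level = rrd_href("report", "report/_rrd", "run.rrd")      # "_rrd/run.rrd"
from_tab = rrd_href("report/_tabs", "report/_rrd", "run.rrd")  # "../_rrd/run.rrd"
```

Computing the link relative to the page's own directory is what makes a tab file under `_tabs/` resolve to `../_rrd/` rather than `_rrd/`.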

26 Apr 13:10

v1.95.0

Fixed

Rerun viewer panes now work in saved HTML reports (show="html" / ShowMode.HTML). Previously the viewer failed because browsers block fetch() from file:// origins. The .rrd data is now base64-encoded directly into the viewer HTML page and loaded via the rerun open_channel() / send_rrd() API, bypassing the fetch entirely.
Multi-tab reports with rerun panes: tab files in _tabs/ now correctly reference ../_rrd/ instead of _rrd/, fixing broken relative paths.

25 Apr 10:16

v1.94.0

Fixed

Rerun viewer panes now work in saved HTML reports (show="html" / ShowMode.HTML). Previously the viewer failed because browsers block fetch() from file:// origins. The .rrd data is now base64-encoded directly into the viewer HTML page and loaded via the rerun open_channel() / send_rrd() API, bypassing the fetch entirely.
Multi-tab reports with rerun panes: tab files in _tabs/ now correctly reference ../_rrd/ instead of _rrd/, fixing broken relative paths.

25 Apr 09:03

v1.93.0

Added

ShowMode StrEnum (live, html, published, none) exported from the top-level bencher package. bn.run(show=bn.ShowMode.HTML) gives autocomplete and typo detection while plain strings (show="html") and booleans (show=True) keep working. The old "static" spelling is accepted as an alias for ShowMode.HTML.

Changed

The show parameter on bn.run(), BenchRunner.run(), BenchRunner.show(), and BenchPlotSrvCfg now accepts ShowMode in addition to bool | str.
Renamed the "static" display mode to "html" ("static" remains supported via alias).
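To show how a string-valued enum can accept both plain strings and a legacy alias, here is a small stand-in. `ShowModeSketch` is hypothetical: it uses the `str` + `Enum` mixin so it also runs on Python versions without `enum.StrEnum`, the member names other than `HTML` are assumed, and the alias handling via `_missing_` is one possible approach rather than bencher's confirmed implementation.

```python
from enum import Enum


class ShowModeSketch(str, Enum):
    """Stand-in for a string-valued display-mode enum (member names partly assumed)."""

    LIVE = "live"
    HTML = "html"
    PUBLISHED = "published"
    NONE = "none"

    @classmethod
    def _missing_(cls, value):
        # Called when lookup by value fails; map the legacy spelling to HTML.
        if value == "static":
            return cls.HTML
        return None  # Enum then raises ValueError for unknown values


mode_from_string = ShowModeSketch("html")
mode_from_alias = ShowModeSketch("static")
```

Both lookups yield `ShowModeSketch.HTML`, members still compare equal to their plain-string values, and a typo such as `ShowModeSketch("htlm")` raises `ValueError` instead of passing silently.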

22 Apr 14:18

v1.92.0

Added

Public MethodCells dataclass and method_cells(result) helper in bencher.regression, re-exported from the top-level bencher package. Downstream report builders can now call method_cells(r) to get pre-rendered, method-aware display strings (change, baseline, threshold, summary lead) for a RegressionResult and embed them in a custom layout (custom columns, non-markdown output, CI comments with status decoration, etc.). This avoids reimplementing per-method dispatch, which would drift when new detection methods are added.

Removed

The private names _MethodCells / _method_cells are gone. Update callers to the public MethodCells / method_cells.

22 Apr 12:27

v1.91.0

Added

Regression report is now auto-embedded as a Markdown panel at the top of to_auto_plots() whenever regression_report.has_regressions is true. Previously only the per-variable overlay plots were injected, so absolute-method fires (which have no history/overlay) were silent in the report.

Changed

Regression report rendering (RegressionReport.summary() and to_markdown()) now dispatches per method so each row describes its actual gate:
- percentage: threshold shown as ±T%.
- adaptive: threshold shown as Tσ (change remains in percent).
- delta: Change column shows the raw Δ (not percent, since the gate is in absolute units); threshold rendered as ±T.
- absolute: Change and Baseline cells rendered as em-dash (no historical baseline); Threshold cell carries the direction-aware inequality (≤ L for OptDir.minimize, ≥ L for OptDir.maximize). Summary line phrased as current=X vs ceiling|floor=Y.
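The per-method threshold formatting listed above could be sketched like this. `threshold_cell` is a hypothetical helper written only from the rules in this list; it is not the function bencher uses, and the `minimize` flag stands in for `OptDir`.

```python
def threshold_cell(method: str, threshold: float, minimize: bool = True) -> str:
    """Format the Threshold cell according to the detection method."""
    if method == "percentage":
        return f"±{threshold:g}%"
    if method == "adaptive":
        return f"{threshold:g}σ"
    if method == "delta":
        return f"±{threshold:g}"
    if method == "absolute":
        # Direction-aware inequality: ceiling when minimizing, floor when maximizing.
        return f"{'≤' if minimize else '≥'} {threshold:g}"
    raise ValueError(f"unknown regression method: {method!r}")


cells = [
    threshold_cell("percentage", 5),
    threshold_cell("adaptive", 3),
    threshold_cell("delta", 0.25),
    threshold_cell("absolute", 100, minimize=True),
    threshold_cell("absolute", 0.9, minimize=False),
]
```

Dispatching on the method in one place keeps each row's threshold text consistent with the gate that actually produced it.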

Fixed

RegressionResult.summary() / RegressionReport.to_markdown() no longer render +nan% or mislabel the hard limit as Baseline for regression_method="absolute" results.

22 Apr 11:21

v1.90.0

Added

sampling_context parameter on bn.run(): an optional context manager that wraps only the sampling phase. Its __exit__ is guaranteed to run before the Panel/Bokeh server starts, so external resources (DB pools, GPU handles, simulators) are released while nothing blocks. save and publish still execute inside the context. Defaults to None (fully backward-compatible).
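The ordering guarantee can be illustrated with a toy runner. `run_sketch` and `simulator` below are hypothetical stand-ins that only record the order of events; they do not call bencher, and the relative order of save and publish inside the context is assumed for illustration.

```python
from contextlib import contextmanager, nullcontext

events = []


@contextmanager
def simulator():
    """Stand-in for an external resource such as a DB pool or simulator handle."""
    events.append("acquire")
    try:
        yield
    finally:
        events.append("release")


def run_sketch(sampling_context=None):
    """Toy runner: sampling, save, and publish happen inside the context; serving happens after."""
    ctx = sampling_context if sampling_context is not None else nullcontext()
    with ctx:
        events.append("sample")
        events.append("save")
        events.append("publish")
    # The context has exited here, so the resource is free before anything blocks.
    events.append("serve")


run_sketch(sampling_context=simulator())
```

After the call, `events` shows `release` before `serve`, mirroring the guarantee that `__exit__` runs before the server starts. Passing nothing falls back to a no-op context, which matches the backward-compatible `None` default.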

21 Apr 15:44

v1.89.0

Added

Two new values for BenchRunCfg.regression_method: "delta" and "absolute". Each selects a dedicated detector and its threshold comes from a new BenchRunCfg field:
- "delta" uses regression_delta: largest acceptable absolute-unit change of the current run's mean from the mean of all historical per-time means, respecting the result variable's OptDir. Useful when a percent threshold obscures sensitivity at tiny baselines or when CI wants a flat unit ceiling on drift.
- "absolute" uses regression_absolute: hard directional threshold (ceiling for OptDir.minimize, floor for OptDir.maximize) against the current run's mean. No history is required, so it fires on the very first recording.
detect_delta() and detect_absolute() public detectors in bencher.regression, mirroring the detect_percentage / detect_adaptive shape so they participate in the shared plot/report pipeline.
detect_regressions() now runs with a single over_time point when regression_method="absolute", so contractual limits can gate even the initial benchmark run.
Gallery examples example_regression_delta and example_regression_absolute demonstrating the new methods.
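A minimal sketch of the two gates, written only from the descriptions above. `delta_regressed` and `absolute_regressed` are hypothetical functions, not the signatures of `detect_delta()` / `detect_absolute()`. The `minimize` flag stands in for `OptDir`, and it is an assumption here that only changes in the worsening direction count and that the comparison is strict.

```python
from statistics import mean


def delta_regressed(history_means, current_mean, max_delta, minimize=True):
    """Flag when the current mean worsens by more than max_delta (absolute units)
    relative to the mean of all historical per-time means."""
    baseline = mean(history_means)
    change = current_mean - baseline
    worsening = change if minimize else -change
    return worsening > max_delta


def absolute_regressed(current_mean, limit, minimize=True):
    """Flag when the current mean crosses a hard limit: a ceiling when minimizing,
    a floor when maximizing. No history is needed."""
    return current_mean > limit if minimize else current_mean < limit


history = [10.0, 12.0, 11.0]  # per-time means from earlier runs; baseline is 11.0
drifted = delta_regressed(history, 12.0, max_delta=0.5)    # worsened by 1.0
stable = delta_regressed(history, 11.25, max_delta=0.5)    # worsened by 0.25
first_run_fails = absolute_regressed(120.0, limit=100.0)   # above the ceiling
```

The delta gate is useful when the baseline is tiny and a percentage would overreact, while the absolute gate can fail a benchmark on its very first recording.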

Changed

Regression diagnostic plot: when the adaptive detector produces both a MAD band and a percent band, they are now merged into a single combined acceptance band. The combined band is the union of both, matching the adaptive gate, which flags a regression only when both tests fail. Previously the plot layered two separately-coloured bands.
