Releases: blooop/bencher
v1.98.0
Added
- `aggregate`, `agg_fn`, and `repeats` parameters for `optimize()`, matching the `plot_sweep()` API. Aggregated dimensions are looped inside the Optuna objective so the optimizer sees robust metrics (e.g. mean loss across seeds or repeated boolean outcomes).
- `AGG_FN_MAP` in `bencher/utils.py`: NaN-safe numpy aggregation functions for objective-level aggregation.
- Example `example_optimize_aggregate.py` demonstrating sweep-then-optimize with dimension aggregation and repeats.
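The idea of looping an aggregated dimension inside the objective can be sketched without Optuna or bencher. The following is a minimal, dependency-free illustration, not the library's implementation: `nan_mean`, `noisy_loss`, and `aggregated_objective` are hypothetical names, and `nan_mean` only mimics the NaN-skipping behavior of `np.nanmean`.

```python
import math
from statistics import fmean


def nan_mean(values):
    """Mean of the non-NaN values (hypothetical stand-in for np.nanmean)."""
    kept = [v for v in values if not math.isnan(v)]
    # Return NaN when every sample was NaN instead of raising.
    return fmean(kept) if kept else math.nan


def noisy_loss(x, seed):
    """Hypothetical benchmark: a quadratic loss with a seed-dependent offset."""
    if seed == 2:
        return math.nan  # simulate one failed run
    return (x - 1.0) ** 2 + 0.1 * seed


def aggregated_objective(x, seeds=(0, 1, 2, 3)):
    """Evaluate every seed inside one objective call and return a single score."""
    return nan_mean([noisy_loss(x, s) for s in seeds])


score = aggregated_objective(1.0)
```

Because the seed loop lives inside the objective, an optimizer calling `aggregated_objective` would compare candidates on the aggregated score rather than on a single noisy sample.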
Fixed
- Missing `skipna=True` on `REDUCE` and `MINMAX` repeat aggregation in `bench_result_base.py`.
- `np.mean` → `np.nanmean` in `optuna_result.py` aggregation to match xarray's NaN-safe behavior.
v1.97.0
Fixed
- `aggregate=True` no longer duplicates pane-type results (rerun, image, video). Pane results store file paths that cannot be numerically aggregated, so they now only render in the non-aggregated view.
- Line plotter crash when aggregating: `plt_cnt_cfg` still referenced collapsed dimensions, causing holoviews `DataError` on missing dimension names. Swapped to post-aggregation config during `map_plot_panes` calls.
- `remove_plots` no longer raises `ValueError` when combined with `numeric_only`.
Changed
- Renamed `VideoResult` to `PaneResult` to reflect that it handles all pane types (rerun, image, video), not just video.
Added
- Image and video aggregate examples (`example_result_image_aggregate`, `example_result_video_aggregate`) to exercise and demonstrate pane-result aggregation.
- `omega_n` sweep added to `ControlSystemSweep` for multi-input rerun testing.
v1.96.0
Fixed
- Rerun viewer panes now work in saved HTML reports (`show="html"` / `ShowMode.HTML`). Previously the viewer failed because browsers block `fetch()` from `file://` origins. The `.rrd` data is now base64-encoded directly into the viewer HTML page and loaded via the rerun `open_channel()` / `send_rrd()` API, bypassing the fetch entirely.
- Multi-tab reports with rerun panes: tab files in `_tabs/` now correctly reference `../_rrd/` instead of `_rrd/`, fixing broken relative paths.
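The two fixes above rest on general techniques: inlining binary data as base64 so a page opened from `file://` needs no `fetch()`, and computing links relative to the page that contains them. A minimal sketch follows, assuming nothing about bencher's internals; `embed_payload` and `relative_link` are hypothetical helpers, and passing the decoded bytes to the rerun viewer is deliberately left out.

```python
import base64
import posixpath


def embed_payload(data: bytes) -> str:
    """Return an HTML page with the binary payload inlined as base64."""
    encoded = base64.b64encode(data).decode("ascii")
    return (
        "<!DOCTYPE html>\n"
        "<html><body>\n"
        f'<script id="payload" type="application/octet-stream">{encoded}</script>\n'
        "<script>\n"
        "  // Decode the inlined payload to bytes; no fetch() is needed.\n"
        "  const b64 = document.getElementById('payload').textContent;\n"
        "  const bytes = Uint8Array.from(atob(b64), c => c.charCodeAt(0));\n"
        "  // Handing `bytes` to a viewer is left out of this sketch.\n"
        "</script>\n"
        "</body></html>\n"
    )


def relative_link(target: str, page: str) -> str:
    """Path to `target` as seen from the directory that contains `page`."""
    return posixpath.relpath(target, start=posixpath.dirname(page) or ".")


html = embed_payload(b"\x00\x01example-bytes")
link_from_tab = relative_link("_rrd/run.rrd", "_tabs/tab0.html")
link_from_root = relative_link("_rrd/run.rrd", "index.html")
```

A page inside `_tabs/` has to climb one directory to reach `_rrd/`, which is why the same target yields a different link from a tab file than from the report root.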
v1.95.0
Fixed
- Rerun viewer panes now work in saved HTML reports (`show="html"` / `ShowMode.HTML`). Previously the viewer failed because browsers block `fetch()` from `file://` origins. The `.rrd` data is now base64-encoded directly into the viewer HTML page and loaded via the rerun `open_channel()` / `send_rrd()` API, bypassing the fetch entirely.
- Multi-tab reports with rerun panes: tab files in `_tabs/` now correctly reference `../_rrd/` instead of `_rrd/`, fixing broken relative paths.
v1.94.0
Fixed
- Rerun viewer panes now work in saved HTML reports (`show="html"` / `ShowMode.HTML`). Previously the viewer failed because browsers block `fetch()` from `file://` origins. The `.rrd` data is now base64-encoded directly into the viewer HTML page and loaded via the rerun `open_channel()` / `send_rrd()` API, bypassing the fetch entirely.
- Multi-tab reports with rerun panes: tab files in `_tabs/` now correctly reference `../_rrd/` instead of `_rrd/`, fixing broken relative paths.
v1.93.0
Added
- `ShowMode` StrEnum (`live`, `html`, `published`, `none`) exported from the top-level `bencher` package. `bn.run(show=bn.ShowMode.HTML)` gives autocomplete and typo detection while plain strings (`show="html"`) and booleans (`show=True`) keep working. The old `"static"` spelling is accepted as an alias for `ShowMode.HTML`.
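One common way to get a string-valued enum that also accepts a legacy spelling is the `_missing_` hook of Python's `enum` module. The sketch below is an illustration of that pattern only; `ShowModeSketch` is a hypothetical stand-in, and bencher's actual `ShowMode` may implement the alias differently.

```python
from enum import Enum


class ShowModeSketch(str, Enum):
    """Hypothetical stand-in for a string-valued display-mode enum."""

    LIVE = "live"
    HTML = "html"
    PUBLISHED = "published"
    NONE = "none"

    @classmethod
    def _missing_(cls, value):
        # Called only when `value` matches no member; map the legacy spelling.
        if value == "static":
            return cls.HTML
        return None  # let Enum raise ValueError for anything else


legacy = ShowModeSketch("static")  # resolves to the HTML member
plain = ShowModeSketch("html")     # plain strings look up by value
```

Because the members subclass `str`, they compare equal to their plain-string values, while a misspelled string fails at lookup time with a `ValueError`.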
Changed
- The `show` parameter on `bn.run()`, `BenchRunner.run()`, `BenchRunner.show()`, and `BenchPlotSrvCfg` now accepts `ShowMode` in addition to `bool | str`.
- Renamed the `"static"` display mode to `"html"` (`"static"` remains supported via alias).
v1.92.0
Added
- Public `MethodCells` dataclass and `method_cells(result)` helper in `bencher.regression`, re-exported from the top-level `bencher` package. Downstream report builders can now call `method_cells(r)` to get pre-rendered, method-aware display strings (change, baseline, threshold, summary lead) for a `RegressionResult` and embed them in a custom layout (custom columns, non-markdown output, CI comments with status decoration, etc.) without reimplementing per-method dispatch, which would drift when new detection methods are added.
Removed
- The private names `_MethodCells` / `_method_cells` are gone. Update callers to the public `MethodCells` / `method_cells`.
v1.91.0
Added
- Regression report is now auto-embedded as a Markdown panel at the top of `to_auto_plots()` whenever `regression_report.has_regressions` is true. Previously only the per-variable overlay plots were injected, so absolute-method fires (which have no history/overlay) were silent in the report.
Changed
- Regression report rendering (`RegressionReport.summary()` and `to_markdown()`) now dispatches per method so each row describes its actual gate:
  - `percentage`: threshold shown as `±T%`.
  - `adaptive`: threshold shown as `Tσ` (change remains in percent).
  - `delta`: Change column shows the raw Δ (not percent, since the gate is in absolute units); threshold rendered as `±T`.
  - `absolute`: Change and Baseline cells rendered as an em dash (no historical baseline); Threshold cell carries the direction-aware inequality (`≤ L` for `OptDir.minimize`, `≥ L` for `OptDir.maximize`). Summary line phrased as `current=X vs ceiling|floor=Y`.
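The per-method threshold formats listed above can be sketched as a small dispatch function. This is an illustration under stated assumptions, not bencher's rendering code: `threshold_cell` is a hypothetical name, the `minimize` flag stands in for `OptDir`, and the `:g` number formatting is a guess.

```python
def threshold_cell(method: str, threshold: float, minimize: bool = True) -> str:
    """Render the Threshold cell for one regression method (hypothetical helper)."""
    if method == "percentage":
        return f"±{threshold:g}%"
    if method == "adaptive":
        return f"{threshold:g}σ"
    if method == "delta":
        return f"±{threshold:g}"
    if method == "absolute":
        # Ceiling when lower is better, floor when higher is better.
        return f"≤ {threshold:g}" if minimize else f"≥ {threshold:g}"
    raise ValueError(f"unknown regression method: {method!r}")


cells = [
    threshold_cell("percentage", 5),
    threshold_cell("adaptive", 3),
    threshold_cell("delta", 0.25),
    threshold_cell("absolute", 100, minimize=True),
    threshold_cell("absolute", 0.9, minimize=False),
]
```

Keeping the dispatch in one place means a new detection method only needs one new branch, rather than edits in every report layout.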
Fixed
- `RegressionResult.summary()` / `RegressionReport.to_markdown()` no longer render `+nan%` or mislabel the hard limit as `Baseline` for `regression_method="absolute"` results.
v1.90.0
Added
- `sampling_context` parameter on `bn.run()`: an optional context manager that wraps only the sampling phase. Its `__exit__` is guaranteed to run before the Panel/Bokeh server starts, so external resources (DB pools, GPU handles, simulators) are released while nothing blocks. `save` and `publish` still execute inside the context. Defaults to `None` (fully backward-compatible).
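The ordering guarantee described above can be illustrated with a self-contained simulation. This sketch does not call bencher: `run_sketch`, `simulator_pool`, and the `sample`/`save`/`serve` callables are hypothetical stand-ins that only record the order in which each phase runs.

```python
from contextlib import contextmanager, nullcontext

events = []


@contextmanager
def simulator_pool():
    """Hypothetical external resource that must be released before serving."""
    events.append("acquire")
    try:
        yield
    finally:
        events.append("release")


def run_sketch(sample, save, serve, sampling_context=None):
    """Run sampling and saving inside the context, then serve outside it."""
    ctx = sampling_context if sampling_context is not None else nullcontext()
    with ctx:
        sample()
        save()
    # The context has exited here, so the blocking phase holds no resources.
    serve()


run_sketch(
    sample=lambda: events.append("sample"),
    save=lambda: events.append("save"),
    serve=lambda: events.append("serve"),
    sampling_context=simulator_pool(),
)
```

With `sampling_context=None` the same function falls back to `nullcontext()`, which mirrors the backward-compatible default described in the entry.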
v1.89.0
Added
- Two new values for `BenchRunCfg.regression_method`: `"delta"` and `"absolute"`. Each selects a dedicated detector and its threshold comes from a new `BenchRunCfg` field:
  - `"delta"` uses `regression_delta`: largest acceptable absolute-unit change of the current run's mean from the mean of all historical per-time means, respecting the result variable's `OptDir`. Useful when a percent threshold obscures sensitivity at tiny baselines or when CI wants a flat unit ceiling on drift.
  - `"absolute"` uses `regression_absolute`: hard directional threshold (ceiling for `OptDir.minimize`, floor for `OptDir.maximize`) against the current run's mean. No history required; it fires on the very first recording.
- `detect_delta()` and `detect_absolute()` public detectors in `bencher.regression`, mirroring the `detect_percentage` / `detect_adaptive` shape so they participate in the shared plot/report pipeline.
- `detect_regressions()` now runs with a single `over_time` point when `regression_method="absolute"`, so contractual limits can gate even the initial benchmark run.
- Gallery examples `example_regression_delta` and `example_regression_absolute` demonstrating the new methods.
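The two gates described above reduce to short comparisons. The sketch below shows one plausible reading, not the signatures of `detect_delta()` or `detect_absolute()`: the function names are hypothetical, a `minimize` flag stands in for `OptDir`, and the strict inequalities at the boundary are an assumption.

```python
from statistics import fmean


def delta_regressed(current_mean, historical_means, max_delta, minimize=True):
    """Flag when the current mean worsens by more than max_delta in raw units."""
    baseline = fmean(historical_means)  # mean of the historical per-time means
    change = current_mean - baseline
    # "Worse" means higher when minimizing and lower when maximizing.
    worsening = change if minimize else -change
    return worsening > max_delta


def absolute_regressed(current_mean, limit, minimize=True):
    """Flag against a hard limit: a ceiling when minimizing, a floor when maximizing."""
    return current_mean > limit if minimize else current_mean < limit


latency_drift = delta_regressed(12.0, [10.0, 10.0], max_delta=1.0)
latency_ok = delta_regressed(10.5, [10.0, 10.0], max_delta=1.0)
first_run_over_ceiling = absolute_regressed(105.0, limit=100.0)
accuracy_under_floor = absolute_regressed(0.8, limit=0.9, minimize=False)
```

Note that `absolute_regressed` takes no history at all, which is why this kind of gate can fire on the very first recorded run.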
Changed
- Regression diagnostic plot: when the adaptive detector produces both a MAD band and a percent band, they are now merged into a single combined acceptance band (the union of both — matching the adaptive gate, which flags a regression only when both tests fail). Previously the plot layered two separately-coloured bands.
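Why the union is the right band follows from the gate logic: if a regression is flagged only when both tests fail, a value is accepted when it lies inside either band. A minimal sketch, assuming each band is a `(low, high)` interval and that both bands contain the baseline so their union is a single interval; `combined_band` and `adaptive_flags` are hypothetical names.

```python
def combined_band(mad_band, pct_band):
    """Union of two overlapping (low, high) acceptance intervals."""
    return min(mad_band[0], pct_band[0]), max(mad_band[1], pct_band[1])


def adaptive_flags(value, mad_band, pct_band):
    """Flag only when the value falls outside both bands."""
    in_mad = mad_band[0] <= value <= mad_band[1]
    in_pct = pct_band[0] <= value <= pct_band[1]
    return not (in_mad or in_pct)


mad = (9.0, 11.0)   # hypothetical MAD band around a baseline of 10
pct = (9.5, 12.0)   # hypothetical percent band around the same baseline
band = combined_band(mad, pct)

# Being outside the merged band is the same as failing both tests.
checks = [
    adaptive_flags(v, mad, pct) == (not band[0] <= v <= band[1])
    for v in (8.5, 9.2, 10.0, 11.5, 12.5)
]
```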