mines-opt-ml/jfbr
JFBR

Configuration & Trials

Configuration lives under cfg/<mode>/.... The <mode> segment is only a folder namespace (e.g., sweep, experiment, sweep_1). It does not affect expansion semantics.

Directory pattern for a leaf config:

cfg/<mode>/<dataset>/<model>/<trainer>/cfg.yaml

Each leaf cfg.yaml is merged with its ancestor cfg.yaml files (walking upward until the cfg/ root) to form a base configuration. Merging is shallow: child keys overwrite parent keys.
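The merge rule can be sketched as follows (a minimal illustration of shallow merging; the function name and keys are hypothetical, not the project's actual API):

```python
def merge_cfgs(parent: dict, child: dict) -> dict:
    """Shallow merge: child keys overwrite parent keys.

    Nested dicts are replaced wholesale, not merged recursively.
    """
    merged = dict(parent)
    merged.update(child)
    return merged

# Walking from the cfg/ root down to the leaf, each level's cfg.yaml
# overrides its ancestors:
root = {"lr": 0.1, "epochs": 10}
leaf = {"lr": 0.01, "batch_size": 32}
base = merge_cfgs(root, leaf)
# base == {"lr": 0.01, "epochs": 10, "batch_size": 32}
```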

Expansion writes resolved per‑trial configurations to:

out/log/<mode>/<dataset>/<model>/<trainer>/trial_###/cfg.yaml

Trials are the Cartesian product of list‑valued hyperparameters, excluding logging lists (currently only batch_metrics). seed may be scalar or list; if a list, it is just another dimension.
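The expansion rule can be sketched like this (a simplified stand-in for the project's actual expansion code; names are illustrative):

```python
from itertools import product

LOGGING_KEYS = {"batch_metrics"}  # logging lists are not sweep axes

def expand_trials(cfg: dict) -> list[dict]:
    """Expand list-valued keys (except logging lists) into per-trial configs."""
    axes = {k: v for k, v in cfg.items()
            if isinstance(v, list) and k not in LOGGING_KEYS}
    fixed = {k: v for k, v in cfg.items() if k not in axes}
    if not axes:
        return [dict(fixed)]
    keys = list(axes)
    return [dict(fixed, **dict(zip(keys, combo)))
            for combo in product(*axes.values())]

cfg = {"lr": [0.1, 0.01], "seed": [0, 1], "batch_metrics": ["loss", "acc"]}
trials = expand_trials(cfg)
# 2 lrs x 2 seeds = 4 trials; batch_metrics is carried through unchanged
```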

Re‑running expansion is idempotent: resolved trial cfg.yaml files are regenerated (so parent edits propagate). A trial is considered complete when batch_log.csv exists in the same trial_### directory.

Summary:

  1. Hierarchical inheritance (leaf overrides parents).
  2. Lists -> sweep axes (except logging lists like batch_metrics).
  3. seed can be scalar or a list (a list becomes an axis like any other).
  4. mode is only a folder namespace and does not affect expansion behavior.
  5. Resolved configs live under out/log/... and drive all training & analysis.
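Putting these rules together, a leaf cfg.yaml might look like this (keys are illustrative, not the project's actual schema):

```yaml
# cfg/sweep/mnist/mlp/sgd/cfg.yaml  (hypothetical leaf)
lr: [0.1, 0.01, 0.001]      # sweep axis: 3 values
seed: [0, 1]                # sweep axis: 2 values -> 6 trials total
epochs: 20                  # scalar: shared by all trials
batch_metrics: [loss, acc]  # logging list: NOT a sweep axis
```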

Programmatic expansion + execution:

from src.run import run_trials
# All configs across all datasets
run_trials()
# Only configs under cfg/sweep for mnist
run_trials(datasets=["mnist"], modes=["sweep"])

During execution each trial directory accumulates logs (e.g. batch_log.csv). The dashboard and analysis scripts read directly from out/log.

Execution (uv)

This project is managed with uv. Always invoke Python modules and tests via uv run -m so the correct environment and dependency resolution are used.

Examples:

# Run sweep
uv run -m run.sweep

# Run experiment
uv run -m run.experiment

# Run full test suite
uv run -m pytest

Invoking python directly or executing files as scripts is discouraged; prefer the module form above so imports resolve consistently.

Dashboard

The dashboard provides fast, interactive plots of training metrics across trials. Launch it with:

uv run -m run.dashboard

It serves dashboard/index.html and reads logs directly from out/log/.../trial_###/.

  • Epoch aggregation: If epoch_log.csv is missing for a trial, the server synthesizes it on-demand from batch_log.csv with one row per (epoch, mode). Reductions:

    • loss, acc, and other numeric metrics: mean across batches (unweighted)
    • lr, iter_budget: last value within the epoch
    • grad_norm, error_min/med/max, iter_min/med/max: median within the epoch
    • time: end-of-epoch time (max of batch times)
    • n_batches is included for diagnostics
  • Caching & freshness: The generated epoch_log.csv is reused for subsequent views. If batch_log.csv is newer, the server regenerates epoch_log.csv automatically. Writes are atomic to avoid partial files.
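The reductions above can be sketched as follows (a simplified stand-in for the server's actual aggregation, covering one representative column per reduction; column names follow the list above):

```python
from statistics import mean, median

def aggregate_epoch(rows: list[dict]) -> dict:
    """Collapse one (epoch, mode) group of batch rows into a single epoch row."""
    return {
        "loss": mean(r["loss"] for r in rows),              # unweighted mean across batches
        "lr": rows[-1]["lr"],                               # last value within the epoch
        "grad_norm": median(r["grad_norm"] for r in rows),  # median within the epoch
        "time": max(r["time"] for r in rows),               # end-of-epoch time
        "n_batches": len(rows),                             # diagnostics
    }

batches = [
    {"loss": 1.0, "lr": 0.10, "grad_norm": 2.0, "time": 0.5},
    {"loss": 0.5, "lr": 0.05, "grad_norm": 4.0, "time": 1.0},
]
epoch = aggregate_epoch(batches)
# {'loss': 0.75, 'lr': 0.05, 'grad_norm': 3.0, 'time': 1.0, 'n_batches': 2}
```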

The UI prefers epoch_log.csv (small, fast) and falls back to batch_log.csv if needed. No manual preprocessing required.
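The atomic-write guarantee is the standard write-then-rename pattern; a sketch of how it can be done (not necessarily the server's exact implementation):

```python
import os
import tempfile

def atomic_write(path: str, text: str) -> None:
    """Write to a temp file in the same directory, then rename into place.

    os.replace is an atomic rename on POSIX, so a concurrent reader sees
    either the old file or the new one, never a partially written file.
    """
    d = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=d, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
        os.replace(tmp, path)  # atomic swap into place
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)  # clean up the temp file on failure
        raise
```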
