Minimal working examples of ML experiment configuration using Hydra.

| Directory | Description | Quick start |
|---|---|---|
| `0_sklearn/` | Scikit-learn classifiers (RandomForest, MLP, SVM) on toy datasets | `uv run 0_sklearn/script.py` |
| `1_pytorch/` | PyTorch models (MLP, CNN, ViT) on image datasets (MNIST, SVHN) | `uv run 1_pytorch/script.py` |
| `2_optuna/` | Bayesian HPO with Optuna sweeper plugin | `uv run 2_optuna/script.py --multirun` |

The defaults list in main.yaml composes a final config from modular YAML files in subdirectories:

```yaml
# 0_sklearn/config/main.yaml
defaults:
  - model: randomforest
  - dataset: blobs
  - _self_

seed: 42
test_size: 0.3
```

Selecting `model=mlp` at the command line swaps in `config/model/mlp.yaml` instead of `randomforest.yaml`, with no code changes needed.
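
To make the composition step concrete, here is a minimal standard-library sketch of how a defaults list could be walked to build one config. It is not Hydra's implementation: the `compose` function, the in-memory `CONFIG_GROUPS` dictionary, and every hyperparameter value other than `seed`, `test_size`, and `n_estimators` are assumptions made for illustration, and the merge is deliberately shallow.

```python
from copy import deepcopy

# Hypothetical in-memory stand-ins for the YAML files under 0_sklearn/config/.
# The mlp and dataset values are illustrative, not copied from the repository.
CONFIG_GROUPS = {
    "model": {
        "randomforest": {"n_estimators": 100},
        "mlp": {"hidden_layer_sizes": [64, 64]},
    },
    "dataset": {
        "blobs": {"n_samples": 500},
        "moons": {"n_samples": 500, "noise": 0.1},
    },
}

# Mirrors the defaults list in main.yaml: group selections, then _self_.
DEFAULTS = [{"model": "randomforest"}, {"dataset": "blobs"}, "_self_"]
MAIN_BODY = {"seed": 42, "test_size": 0.3}


def compose(defaults, main_body, groups, overrides=None):
    """Build one config dict by walking the defaults list in order."""
    overrides = overrides or {}
    cfg = {}
    for entry in defaults:
        if entry == "_self_":
            # _self_ marks where main.yaml's own keys are merged in.
            cfg.update(deepcopy(main_body))
            continue
        for group, option in entry.items():
            # A command-line choice such as model=mlp replaces the default option.
            option = overrides.get(group, option)
            cfg[group] = deepcopy(groups[group][option])
    return cfg


default_cfg = compose(DEFAULTS, MAIN_BODY, CONFIG_GROUPS)
mlp_cfg = compose(DEFAULTS, MAIN_BODY, CONFIG_GROUPS, overrides={"model": "mlp"})
print(default_cfg["model"])
print(mlp_cfg["model"])
```

Only the `model` entry differs between the two results; `seed`, `test_size`, and the dataset config are shared, which is the point of keeping each group in its own file.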
OmegaConf variable interpolation lets configs reference each other's values, so each setting has a single source of truth:

```yaml
# 0_sklearn/config/model/randomforest.yaml
random_state: ${seed}  # resolves 'seed' from main.yaml
```

```yaml
# 1_pytorch/config/model/mlp.yaml
im_size: ${dataset.im_size}  # cross-config reference to dataset properties
num_classes: ${dataset.num_classes}
```

A config can specify a `_target_` so that classes and functions are instantiated or called directly from it, which removes the boilerplate otherwise needed to pass configuration through to the backing class or function:

```yaml
# 0_sklearn/config/model/randomforest.yaml
_target_: sklearn.ensemble.RandomForestClassifier
n_estimators: 100
random_state: ${seed}
```

```python
# 0_sklearn/script.py
model = hydra.utils.instantiate(cfg.model)  # returns a RandomForestClassifier
X, y = hydra.utils.call(cfg.dataset)  # calls sklearn.datasets.make_blobs(...)
```

Requires uv. Run from the repo root:

```bash
uv run 0_sklearn/script.py
```

Logs go to `/tmp/hydra/logs/`.
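
The two mechanisms described earlier, `${...}` interpolation and `_target_` instantiation, can be imitated in a few lines of standard-library Python. This is a simplified sketch and not how OmegaConf or `hydra.utils.instantiate` are implemented: the helpers `lookup`, `resolve`, and `instantiate` are hypothetical, only whole-value references are handled, and `datetime.timedelta` stands in for an estimator class so the example runs without scikit-learn.

```python
import importlib
import re

# Matches a whole-value reference such as "${seed}" or "${dataset.num_classes}".
_REFERENCE = re.compile(r"^\$\{([\w.]+)\}$")


def lookup(root, dotted_key):
    """Follow a dotted key like 'dataset.num_classes' from the config root."""
    node = root
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(node, root):
    """Return a copy of node with every ${...} reference replaced by its value."""
    if isinstance(node, dict):
        return {key: resolve(value, root) for key, value in node.items()}
    if isinstance(node, str):
        match = _REFERENCE.match(node)
        if match:
            return resolve(lookup(root, match.group(1)), root)
    return node


def instantiate(node):
    """Import the dotted path in _target_ and call it with the remaining keys."""
    kwargs = {key: value for key, value in node.items() if key != "_target_"}
    module_name, _, attr_name = node["_target_"].rpartition(".")
    target = getattr(importlib.import_module(module_name), attr_name)
    return target(**kwargs)


# Hypothetical composed config; datetime.timedelta is a stand-in target.
cfg = {
    "seed": 42,
    "model": {
        "_target_": "datetime.timedelta",
        "seconds": "${seed}",  # resolved from the top-level 'seed' key
        "minutes": 1,
    },
}

resolved_model = resolve(cfg["model"], cfg)
model = instantiate(resolved_model)
print(resolved_model["seconds"], model.total_seconds())
```

Changing `seed` in one place changes every value that references it, and swapping `_target_` changes which class is built without touching the calling code.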
Override config groups or individual parameters from the command line:

```bash
# Select a different model and dataset
uv run 0_sklearn/script.py model=mlp dataset=moons

# Override a specific parameter
uv run 0_sklearn/script.py model=randomforest model.n_estimators=400

# Sweep over all combinations
uv run 0_sklearn/script.py --multirun model=randomforest,mlp,svm dataset=blobs,circles,moons
```

Any parameter supported by the backing class can be modified from the command line. For parameters not explicitly listed in the YAML config, use Hydra's append syntax (a `+` prefix on the key):

```bash
uv run 0_sklearn/script.py --multirun model=mlp +model.momentum=0.5,0.7,0.9
```

The PyTorch example adds optimizer config groups and cross-config interpolation (model configs reference dataset properties such as `im_size` and `num_classes`):

```bash
# Default: MLP on MNIST with Adam
uv run 1_pytorch/script.py

# Override model, optimizer, and learning rate
uv run 1_pytorch/script.py model=cnn optimizer=sgd optimizer.lr=1e-4

# Sweep over models, optimizers, and learning rates
uv run 1_pytorch/script.py --multirun model=mlp,cnn,vit optimizer=adam,sgd optimizer.lr=1e-5,1e-4,1e-3
```

See `1_pytorch/README.md` for model details.
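
A `--multirun` sweep with comma-separated values amounts to a grid search: every combination of the listed values becomes one job. The sketch below shows that expansion with `itertools.product`. The `expand_sweep` helper is hypothetical and handles only plain `key=a,b,c` overrides; Hydra's real override grammar is richer.

```python
from itertools import product


def expand_sweep(overrides):
    """Expand comma-separated override values into one override list per job."""
    keys, choices = [], []
    for override in overrides:
        key, _, values = override.partition("=")
        keys.append(key)
        choices.append(values.split(","))
    # The Cartesian product yields every combination, i.e. a grid search.
    return [
        [f"{key}={value}" for key, value in zip(keys, combo)]
        for combo in product(*choices)
    ]


jobs = expand_sweep([
    "model=mlp,cnn,vit",
    "optimizer=adam,sgd",
    "optimizer.lr=1e-5,1e-4,1e-3",
])
print(len(jobs))  # 3 models x 2 optimizers x 3 learning rates = 18 jobs
print(jobs[0])
```

The job count is the product of the number of values per key, so adding one more swept parameter multiplies the total run time.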
The `2_optuna/` example replaces Hydra's default grid search with Bayesian optimization via the Optuna sweeper plugin:

```bash
# Default: tune MLP hyperparameters
uv run 2_optuna/script.py --multirun

# Cross-model HPO with conditional per-model search spaces
uv run 2_optuna/script.py --multirun hpo=cross_model

# Evaluate: reload best params from a previous sweep
uv run 2_optuna/script.py evaluate=true best_params=/tmp/hydra/logs/script/2025-01-01/12-00-00/optimization_results.yaml
```

See `2_optuna/README.md` for more details.
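
The idea of a conditional per-model search space can be sketched without Optuna: the model is chosen first, and only that model's hyperparameters are sampled. In the sketch below, plain random sampling stands in for Optuna's sampler, so it is not Bayesian optimization. The search spaces, the `sample_trial` helper, and the `objective` score are all made up for illustration and do not reflect the contents of `2_optuna/`.

```python
import random

# Hypothetical per-model search spaces: each model exposes different
# hyperparameters, so the space is conditional on the model choice.
SEARCH_SPACES = {
    "mlp": {"hidden_dim": [64, 128, 256], "lr": [1e-4, 1e-3, 1e-2]},
    "cnn": {"channels": [16, 32, 64], "lr": [1e-4, 1e-3, 1e-2]},
}


def sample_trial(rng):
    """Pick a model first, then sample only that model's hyperparameters."""
    model = rng.choice(sorted(SEARCH_SPACES))
    space = SEARCH_SPACES[model]
    params = {name: rng.choice(values) for name, values in space.items()}
    return model, params


def objective(model, params):
    """Made-up score to minimise; a real objective would train and validate."""
    width = params.get("hidden_dim", params.get("channels"))
    return abs(params["lr"] - 1e-3) * 100 + 1.0 / width


rng = random.Random(0)  # fixed seed so repeated runs sample the same trials
trials = [sample_trial(rng) for _ in range(20)]
best_model, best_params = min(trials, key=lambda trial: objective(*trial))
print(best_model, best_params)
```

A Bayesian sampler would differ in how each new trial is proposed, using the scores of earlier trials instead of sampling independently, but the conditional structure of the space stays the same.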