A simple, safe “negative regularization” that boosts expressivity when data are scarce, then fades out with a sample-size–dependent decay. Works with linear models and shallow MLPs, with stability safeguards.
- Paper: Convergence and Generalization of Anti-regularization for Parametric Models (https://arxiv.org/abs/2508.17412)
- Core idea: add a sign-reversed reward term early to reduce underfitting, then decay via a power-law schedule so training converges back to standard ERM as data grow. Stability is ensured by a projection (trust-region) + gradient clipping safeguard.
- Small-sample regime: anti-regularization (AR) slightly increases the effective degrees of freedom (DoF) to reduce underfitting.
- As n grows: the decay schedule shrinks λ → 0, recovering the baseline without hurting generalization.
- Safety: spectral/trust-region safety condition + clipping prevent divergence; we also log output-scale ratio and clipping/projection rates.
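The core idea can be illustrated on a toy linear least-squares problem. This is a minimal sketch, not the repository's implementation: it assumes the sign-reversed term is a squared-L2 reward, -(λ/2)·||w||², and the function names (`data_loss`, `ar_objective`, `ar_gradient`) are hypothetical.

```python
# Minimal sketch. Assumption: the AR term is a squared-L2 reward with reversed sign.
# All names here are illustrative and do not come from the repository.

def predict(w, x):
    """Linear prediction w·x for one sample."""
    return sum(wi * xi for wi, xi in zip(w, x))

def data_loss(w, xs, ys):
    """Mean squared error (with the usual 1/2 factor) of the linear model."""
    n = len(xs)
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)

def ar_objective(w, xs, ys, lam):
    """Data loss minus a reward on the squared parameter norm (lam >= 0)."""
    reward = 0.5 * lam * sum(wi * wi for wi in w)
    return data_loss(w, xs, ys) - reward

def ar_gradient(w, xs, ys, lam):
    """Gradient of ar_objective; the reward contributes -lam * w."""
    n = len(xs)
    grad = [0.0] * len(w)
    for x, y in zip(xs, ys):
        err = predict(w, x) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj / n
    return [g - lam * wj for g, wj in zip(grad, w)]
```

With `lam = 0` this reduces to plain ERM, which matches the behaviour the decay schedule is meant to recover as n grows.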
# 1) clone
git clone https://github.com/AndrewKim1997/anti-regularization-parametric-models.git
cd anti-regularization-parametric-models
# 2) Python deps (3.10+)
python -m venv .venv && source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
# (optional) editable install
# pip install -e .
Docker (optional)
docker compose up --build
.github/workflows/ci.yml # unit tests & config checks on push/PR
configs/ # ready-to-run YAMLs (experiments & ablations)
data/ # (auto)downloaded datasets live under data/raw/*
docker/ # docker-compose.yml for CPU/CUDA images
experiments/ # per-run folders with metrics & logs
results/ # collected CSVs (paper tables)
src/ar/ # library code & main entrypoint (src.ar.run)
tests/ # unit tests
LICENSE, CITATION.cff, pyproject.toml, requirements.txt, README.md
All runs use the module entrypoint:
python -m src.ar.run --config <path/to/config.yaml>
# UCI Concrete
python -m src.ar.run --config configs/concrete_reg.yaml
# UCI Airfoil
python -m src.ar.run --config configs/airfoil_reg.yaml
# MNIST
python -m src.ar.run --config configs/mnist_cls.yaml
# CIFAR-10
python -m src.ar.run --config configs/cifar10_cls.yaml
⏱️ 1-minute sanity (small slice)
python -m src.ar.run --config configs/mnist_cls.yaml \
--only_seed 0 --only_optimizer adam --add_baseline_zero
- All experiment settings live in configs/*.yaml (dataset, model, optimizer, λ schedule, logging).
- Useful CLI flags:
  - --only_seed, --only_fraction, --only_optimizer
  - --add_baseline_zero: evaluate the λ=0 baseline under the same conditions
- Ablations:
  - no_trust_region (disable projection safeguard)
  - no_grad_clip (disable gradient clipping)
  - l2 (replace AR with positive L2 / Tikhonov regularization)
  - constant_lambda (no decay; fixed λ = λ0)
Configuration knobs:
- λ0 grid, α power-law decay (sample-size–dependent)
- Safety: projection operator (trust-region radius), gradient clipping
- Diagnostics: output-scale ratio (ρ), clipping rate (r_clip), projection rate (r_proj)
AR is wrapped with lightweight safeguards:
- Projection (trust-region constraint): if an update moves the parameters outside the trust-region radius, a projection operator maps them back onto the trust region.
- Gradient clipping: classic clipping to prevent exploding updates.
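The two safeguards can be sketched as a single guarded update step. This is an illustration under stated assumptions, not the repository's code: the trust region is taken to be an L2 ball centred at the origin (the actual constraint may be spectral or centred elsewhere), clipping is by global L2 norm, and all function and parameter names are hypothetical.

```python
import math

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

def clip_gradient(grad, max_norm):
    """Rescale grad so its L2 norm is at most max_norm. Returns (grad, was_clipped)."""
    norm = l2_norm(grad)
    if norm <= max_norm:
        return list(grad), False
    scale = max_norm / norm
    return [g * scale for g in grad], True

def project_to_ball(w, radius):
    """Project w onto the L2 ball of the given radius. Returns (w, was_projected)."""
    norm = l2_norm(w)
    if norm <= radius:
        return list(w), False
    scale = radius / norm
    return [wi * scale for wi in w], True

def safeguarded_step(w, grad, lr, max_grad_norm, radius):
    """One gradient step with clipping first, then projection (assumed ordering)."""
    g, clipped = clip_gradient(grad, max_grad_norm)
    w_new = [wi - lr * gi for wi, gi in zip(w, g)]
    w_new, projected = project_to_ball(w_new, radius)
    return w_new, clipped, projected
```

Returning the two boolean flags from each step makes it straightforward to accumulate clipping and projection rates over a run.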
Logged diagnostics (per run):
- output-scale ratio (ρ) – AR vs baseline output-norm ratio
- clipping rate (r_clip) – fraction of updates affected by clipping
- projection rate (r_proj) – fraction of updates corrected by projection
Use these to verify safety (e.g., ρ should stay near 1, and r_clip and r_proj should remain moderate).
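As a rough sketch of how these three diagnostics could be computed: the exact definitions used in the repository are not given here, so this assumes ρ is the ratio of L2 norms of model outputs on a shared batch (AR model over the λ=0 baseline) and that the rates are simple fractions of flagged updates. The names `rate` and `output_scale_ratio` are hypothetical.

```python
import math

def rate(flags):
    """Fraction of updates whose flag is True (e.g. clipped or projected)."""
    flags = list(flags)
    if not flags:
        return 0.0
    return sum(1 for f in flags if f) / len(flags)

def output_scale_ratio(ar_outputs, baseline_outputs):
    """rho: L2 norm of AR outputs divided by L2 norm of baseline outputs.

    Assumption: both lists hold outputs on the same inputs.
    """
    num = math.sqrt(sum(o * o for o in ar_outputs))
    den = math.sqrt(sum(o * o for o in baseline_outputs))
    if den == 0.0:
        return math.inf  # baseline outputs are all zero; ratio is undefined
    return num / den
```

For example, `rate(clipped_flags)` over the per-step flags of a run gives r_clip, and a value of `output_scale_ratio(...)` far from 1 would suggest the AR term is inflating the output scale.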
- Power-law decay schedule: |λ(n)| = |λ(n0)| (n0/n)^α
  - Regression: set α ≈ 1
  - Classification: set α ≈ 0.5 (conservatively, α ≥ 0.5)
- (Optional) DoF targeting: keep per-sample complexity roughly constant by controlling tr(S_λ)/n.
These heuristics balance bias–variance and help ensure convergence and stable generalization.
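The power-law schedule and the optional DoF target can be sketched as follows. The schedule function follows the formula above directly. The DoF helper is only an illustration under a strong assumption: it treats S_λ as a ridge-type smoother with the sign of the penalty reversed, S_λ = X (XᵀX − λI)⁻¹ Xᵀ, so that tr(S_λ) = Σ eᵢ / (eᵢ − λ) over the eigenvalues eᵢ of XᵀX. The paper's actual definition may differ, and both function names are hypothetical.

```python
def lambda_schedule(n, lam0, n0, alpha):
    """|lambda(n)| = |lambda(n0)| * (n0 / n) ** alpha."""
    if n <= 0 or n0 <= 0:
        raise ValueError("sample sizes must be positive")
    return abs(lam0) * (n0 / n) ** alpha

def dof_per_sample(eigenvalues, lam, n):
    """tr(S_lambda) / n for an assumed anti-ridge smoother (see lead-in).

    eigenvalues: eigenvalues of X^T X; lam: AR strength (lam >= 0).
    Requires lam < min(eigenvalues), otherwise the smoother is ill-defined.
    """
    if lam >= min(eigenvalues):
        raise ValueError("lam must be below the smallest eigenvalue")
    return sum(e / (e - lam) for e in eigenvalues) / n
```

For instance, `lambda_schedule(400, 0.1, 100, 0.5)` halves the reference strength when the sample size quadruples, and under the assumed smoother `dof_per_sample` grows as `lam` increases, which is consistent with AR raising the effective DoF in the small-sample regime.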
Run the four main configs to reproduce headline results:
python -m src.ar.run --config configs/concrete_reg.yaml
python -m src.ar.run --config configs/airfoil_reg.yaml
python -m src.ar.run --config configs/mnist_cls.yaml
python -m src.ar.run --config configs/cifar10_cls.yaml
- Outputs: experiments/<run-id>/... and aggregated CSVs in results/*.csv.
- Datasets:
  - UCI Concrete: data/raw/concrete/Concrete_Data.xls
  - UCI Airfoil: data/raw/airfoil/airfoil_self_noise.dat
  - MNIST / CIFAR-10: auto-downloaded via torchvision
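To inspect the aggregated outputs after a run, something like the following could be used. It is a generic sketch that relies only on the results/*.csv location mentioned above; the column layout of those files is not documented here, so the helper makes no assumptions about column names. The function name `load_results` and the added `source_file` key are hypothetical.

```python
import csv
from pathlib import Path

def load_results(results_dir="results"):
    """Read every CSV under results_dir into a list of row dicts.

    Each row is tagged with the file it came from under 'source_file'.
    """
    rows = []
    for path in sorted(Path(results_dir).glob("*.csv")):
        with path.open(newline="") as fh:
            for row in csv.DictReader(fh):
                row["source_file"] = path.name
                rows.append(row)
    return rows
```

Calling `load_results()` from the repository root returns an empty list if no runs have been collected yet.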
pytest -q
CI validates loaders, config schemas, and logging. PRs must pass CI.
Issues and PRs are welcome. Suggested contributions: new decay schedules, optimizer studies, safety diagnostics, or additional datasets.
@article{kim2025convergence,
title={Convergence and Generalization of Anti-Regularization for Parametric Models},
author={Kim, Dongseok and Jeong, Wonjun and Oh, Gisung},
journal={arXiv preprint arXiv:2508.17412},
year={2025}
}
Also see CITATION.cff in this repository.
This project is released under the terms of the license in LICENSE.
