Bayesian regression method for low-noise data that accounts for model misspecification uncertainty.
More information and graphics are coming soon to this page!
scikit-learn compatible implementation here
Julia implementation here
Try it out! Online demo from the Kermode group comparing multiple regression schemes.
Standard Bayesian regression (e.g. BayesianRidge) estimates epistemic and
aleatoric uncertainties, but provably ignores model misspecification: errors arising from limited model form (see example below). In the low-noise (weak aleatoric / near-deterministic) limit, weight uncertainties (sigma_) are significantly underestimated, as they only capture epistemic uncertainty, which decays with increasing data. Any remaining error is attributed to aleatoric noise (through the noise precision alpha_), which is erroneous in low-noise settings.
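A minimal sketch of this behaviour, using scikit-learn's `BayesianRidge` on a noise-free toy problem. The target function, the single linear feature, and the helper `fit_summary` are illustrative assumptions and are not taken from the POPSRegression code base. The straight-line model cannot represent the sinusoidal term, so the residual is pure misspecification error.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge


def target(x):
    # Hypothetical noise-free target: a line plus a term the model cannot fit.
    return x + 0.3 * np.sin(3.0 * x)


def fit_summary(n_points):
    """Fit a straight line with BayesianRidge and summarise its uncertainties."""
    x = np.linspace(-2.0, 2.0, n_points)
    X = x[:, None]                       # single linear feature
    y = target(x)                        # no noise is added
    model = BayesianRidge().fit(X, y)

    rms_residual = np.sqrt(np.mean((y - model.predict(X)) ** 2))
    weight_std = np.sqrt(np.diag(model.sigma_))[0]   # posterior std of the slope
    noise_std = 1.0 / np.sqrt(model.alpha_)          # alpha_ is the noise precision
    return weight_std, noise_std, rms_residual


small = fit_summary(20)
large = fit_summary(2000)

for name, (w_std, n_std, rms) in [("N=20", small), ("N=2000", large)]:
    print(f"{name}: weight std={w_std:.4f}, noise std={n_std:.4f}, RMS residual={rms:.4f}")
```

With more data the weight uncertainty shrinks while the RMS residual stays essentially unchanged, and the fitted noise level tracks that residual even though the data contain no noise.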
POPSRegression efficiently estimates model misspecification uncertainty
via the Pointwise Optimal Parameter Sets (POPS) algorithm, finding parameter perturbations that would fit each training point exactly.
The result is wider, more honest uncertainty estimates that properly cover the true function, even when the model class cannot perfectly represent the target.
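The pointwise-fit idea can be sketched in a few lines of NumPy for a linear model. This is an illustration under stated assumptions, not the POPSRegression implementation: it assumes that each perturbation is the one that fits training point i exactly while increasing the total squared loss as little as possible, which gives theta_i = theta* + (r_i / h_i) A^{-1} f_i with A = F^T F, residual r_i and leverage h_i = f_i^T A^{-1} f_i. The function name `pointwise_optimal_parameters`, the small `ridge` term, and the toy target are hypothetical.

```python
import numpy as np


def pointwise_optimal_parameters(F, y, ridge=1e-10):
    """Return the least-squares fit and one perturbed parameter set per point.

    F: (N, P) design matrix, y: (N,) targets.
    Assumption: each perturbation minimises the increase in total squared
    loss subject to fitting its own training point exactly.
    """
    A = F.T @ F + ridge * np.eye(F.shape[1])        # loss Hessian (regularised)
    theta_star = np.linalg.solve(A, F.T @ y)        # global least-squares fit
    A_inv_Ft = np.linalg.solve(A, F.T)              # columns are A^{-1} f_i
    leverage = np.einsum("ij,ji->i", F, A_inv_Ft)   # h_i = f_i^T A^{-1} f_i
    residual = y - F @ theta_star                   # r_i
    corrections = (A_inv_Ft * (residual / leverage)).T   # shape (N, P)
    return theta_star, theta_star + corrections


# Toy misspecified problem: straight-line model, noise-free curved target.
x = np.linspace(-2.0, 2.0, 50)
F = np.column_stack([np.ones_like(x), x])
y = x + 0.3 * np.sin(3.0 * x)

theta_star, thetas = pointwise_optimal_parameters(F, y)

# preds[j, i] is the prediction at x_j made by the parameters fitted to point i.
preds = F @ thetas.T
lower, upper = preds.min(axis=1), preds.max(axis=1)
print("every training point inside envelope:",
      bool(np.all((lower <= y + 1e-9) & (y <= upper + 1e-9))))
```

Each parameter set reproduces its own training point, so the min-max envelope over the ensemble contains every training target by construction. How the library turns such an ensemble into predictive bounds is not shown here.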
The misspecified, near-deterministic regression problem that POPSRegression addresses is particularly relevant to fitting surrogate simulation models in computational science, e.g. interatomic potentials, where by construction the optimal surrogate model is structurally unable to capture the target function exactly.