- Improve NUTS initialization
advi+adapt_diag_gradand addjitter+adapt_diag_grad(#2643) - Added
MatrixNormalclass for representing vectors of multivariate normal variables - Implemented
HalfStudentTdistribution - New benchmark suite added (see http://pandas.pydata.org/speed/pymc3/)
- Generalized random seed types
- Update loo, new improved algorithm (#2730)
- New CSG (Constant Stochastic Gradient) approximate posterior sampling algorithm (#2544)
- Michael Osthege added support for population-samplers and implemented differential evolution metropolis (
DEMetropolis). For models with correlated dimensions that can not use gradient-based samplers, theDEMetropolissampler can give higher effective sampling rates. (also see PR#2735) - Forestplot supports multiple traces (#2736)
- Add new plot, densityplot (#2741)
- DIC and BPIC calculations have been deprecated
- Fixed
compareplotto useloooutput. - Improved
posteriorplotto scale fonts sample_ppc_wnow broadcastsdf_summaryfunction renamed tosummary- Add test for
model.logp_arrayandmodel.bijection(#2724) - Fixed
sample_ppcandsample_ppc_wto iterate all chains(#2633, #2748) - Add Bayesian R2 score (for GLMs)
stats.r2_score(#2696) and test (#2729). - SMC works with transformed variables (#2755)
- Speedup OPVI (#2759)
- Old (
minibatch-)adviis removed (#2781)
This version includes two major contributions from our Google Summer of Code 2017 students:
- Maxim Kochurov extended and refactored the variational inference module. This primarily adds two important classes, representing operator variational inference (
OPVI) objects andApproximationobjects. These make it easier to extend existingvariationalclasses, and to derive inference fromvariationaloptimizations, respectively. Thevariationalmodule now also includes normalizing flows (NFVI). - Bill Engels added an extensive new Gaussian processes (
gp) module. Standard GPs can be specified using eitherLatentorMarginalclasses, depending on the nature of the underlying function. A Student-T processTPhas been added. In order to accomodate larger datasets, approximate marginal Gaussian processes (MarginalSparse) have been added.
Documentation has been improved as the result of the project's monthly "docathons".
An experimental stochastic gradient Fisher scoring (SGFS) sampling step method has been added.
The API for find_MAP was enhanced.
SMC now estimates the marginal likelihood.
Added Logistic and HalfFlat distributions to set of continuous distributions.
Bayesian fraction of missing information (bfmi) function added to stats.
Enhancements to compareplot added.
QuadPotential adaptation has been implemented.
Script added to build and deploy documentation.
MAP estimates now available for transformed and non-transformed variables.
The Constant variable class has been deprecated, and will be removed in 3.3.
DIC and BPIC calculations have been sped up.
Arrays are now accepted as arguments for the Bound class.
random method was added to the Wishart and LKJCorr distributions.
Progress bars have been added to LOO and WAIC calculations.
All example notebooks updated to reflect changes in API since 3.1.
Parts of the test suite have been refactored.
Fixed sampler stats error in NUTS for non-RAM backends
Matplotlib is no longer a hard dependency, making it easier to use in settings where installing Matplotlib is problematic. PyMC will only complain if plotting is attempted.
Several bugs in the Gaussian process covariance were fixed.
All chains are now used to calculate WAIC and LOO.
AR(1) log-likelihood function has been fixed.
Slice sampler fixed to sample from 1D conditionals.
Several docstring fixes.
The following people contributed to this release (ordered by number of commits):
Maxim Kochurov maxim.v.kochurov@gmail.com Bill Engels w.j.engels@gmail.com Chris Fonnesbeck chris.fonnesbeck@vanderbilt.edu Junpeng Lao junpeng.lao@unifr.ch Adrian Seyboldt adrian.seyboldt@gmail.com AustinRochford arochford@monetate.com Osvaldo Martin aloctavodia@gmail.com Colin Carroll colcarroll@gmail.com Hannes Vasyura-Bathke hannes.bathke@gmx.net Thomas Wiecki thomas.wiecki@gmail.com michaelosthege thecakedev@hotmail.com Marco De Nadai me@marcodena.it Kyle Beauchamp kyleabeauchamp@gmail.com Massimo mcavallaro@users.noreply.github.com ctm22396 ctm22396@gmail.com Max Horn maexlich@gmail.com Hennadii Madan madanh2014@gmail.com Hassan Naseri h.nasseri@gmail.com Peadar Coyle peadarcoyle@googlemail.com Saurav R. Tuladhar saurav@fastmail.com Shashank Shekhar shashank.f1@gmail.com Eric Ma ericmjl@users.noreply.github.com Ed Herbst ed.herbst@gmail.com tsdlovell dlovell@twosigma.com zaxtax zaxtax@users.noreply.github.com Dan Nichol daniel.nichol@univ.ox.ac.uk Benjamin Yetton bdyetton@gmail.com jackhansom jack.hansom@outlook.com Jack Tsai jacksctsai@gmail.com Andrés Asensio Ramos aasensioramos@gmail.com
-
New user forum at http://discourse.pymc.io
-
Much improved variational inference support:
-
Add Stein-Variational Gradient Descent as well as Amortized SVGD (experimental).
-
Added various optimizers including ADAM.
-
Stopping criterion implemented via callbacks.
-
sample() defaults changed: tuning is enabled for the first 500 samples which are then discarded from the trace as burn-in.
-
MvNormal supports Cholesky Decomposition now for increased speed and numerical stability.
-
Many optimizations and speed-ups.
-
NUTS implementation now matches current Stan implementation.
-
Add higher-order integrators for HMC.
-
ADVI stopping criterion implemented.
-
Improved support for theano's floatX setting to enable GPU computations (work in progress).
-
MvNormal supports Cholesky Decomposition now for increased speed and numerical stability.
-
Added support for multidimensional minibatches
-
Added
Approximationclass and the ability to convert a sampled trace into an approximation via itsEmpiricalsubclass. -
Modelcan now be inherited from and act as a base class for user specified models (see pymc3.models.linear). -
Add MvGaussianRandomWalk and MvStudentTRandomWalk distributions.
-
GLM models do not need a left-hand variable anymore.
-
Refactored HMC and NUTS for better readability.
-
Add support for Python 3.6.
-
Bound now works for discrete distributions as well.
-
Random sampling now returns the correct shape even for higher dimensional RVs.
-
Use theano Psi and GammaLn functions to enable GPU support for them.
We are proud and excited to release the first stable version of PyMC3, the product of more than 5 years of ongoing development and contributions from over 80 individuals. PyMC3 is a Python module for Bayesian modeling which focuses on modern Bayesian computational methods, primarily gradient-based (Hamiltonian) MCMC sampling and variational inference. Models are specified in Python, which allows for great flexibility. The main technological difference in PyMC3 relative to previous versions is the reliance on Theano for the computational backend, rather than on Fortran extensions.
Since the beta release last year, the following improvements have been implemented:
-
Added
variationalsubmodule, which features the automatic differentiation variational inference (ADVI) fitting method. Also supports mini-batch ADVI for large data sets. Much of this work was due to the efforts of Taku Yoshioka, and important guidance was provided by the Stan team (specifically Alp Kucukelbir and Daniel Lee). -
Added model checking utility functions, including leave-one-out (LOO) cross-validation, BPIC, WAIC, and DIC.
-
Implemented posterior predictive sampling (
sample_ppc). -
Implemented auto-assignment of step methods by
samplefunction. -
Enhanced IPython Notebook examples, featuring more complete narratives accompanying code.
-
Extensive debugging of NUTS sampler.
-
Updated documentation to reflect changes in code since beta.
-
Refactored test suite for better efficiency.
-
Added von Mises, zero-inflated negative binomial, and Lewandowski, Kurowicka and Joe (LKJ) distributions.
-
Adopted
joblibfor managing parallel computation of chains. -
Added contributor guidelines, contributor code of conduct and governance document.
- Argument order of tau and sd was switched for distributions of the normal family:
Normal()Lognormal()HalfNormal()
Old: Normal(name, mu, tau)
New: Normal(name, mu, sd) (supplying keyword arguments is unaffected).
MvNormalcalling signature changed: Old:MvNormal(name, mu, tau)New:MvNormal(name, mu, cov)(supplying keyword arguments is unaffected).
We on the PyMC3 core team would like to thank everyone for contributing and now feel that this is ready for the big time. We look forward to hearing about all the cool stuff you use PyMC3 for, and look forward to continued development on the package.
The following authors contributed to this release:
Chris Fonnesbeck chris.fonnesbeck@vanderbilt.edu John Salvatier jsalvatier@gmail.com Thomas Wiecki thomas.wiecki@gmail.com Colin Carroll colcarroll@gmail.com Maxim Kochurov maxim.v.kochurov@gmail.com Taku Yoshioka taku.yoshioka.4096@gmail.com Peadar Coyle (springcoil) peadarcoyle@googlemail.com Austin Rochford arochford@monetate.com Osvaldo Martin aloctavodia@gmail.com Shashank Shekhar shashank.f1@gmail.com
In addition, the following community members contributed to this release:
A Kuz for.akuz@gmail.com A. Flaxman abie@alum.mit.edu Abraham Flaxman abie@alum.mit.edu Alexey Goldin alexey.goldin@gmail.com Anand Patil anand.prabhakar.patil@gmail.com Andrea Zonca code@andreazonca.com Andreas Klostermann andreasklostermann@googlemail.com Andres Asensio Ramos Andrew Clegg andrew.clegg@pearson.com Anjum48 Benjamin Edwards bedwards@cs.unm.edu Boris Avdeev borisaqua@gmail.com Brian Naughton briannaughton@gmail.com Byron Smith Chad Heyne chadheyne@gmail.com Corey Farwell coreyf@rwell.org David Huard david.huard@gmail.com David Stück dstuck@users.noreply.github.com DeliciousHair mshepit@gmail.com Dustin Tran Eigenblutwurst Hannes.Bathke@gmx.net Gideon Wulfsohn gideon.wulfsohn@gmail.com Gil Raphaelli g@raphaelli.com Gogs gogitservice@gmail.com Ilan Man Imri Sofer imrisofer@gmail.com Jake Biesinger jake.biesinger@gmail.com James Webber jamestwebber@gmail.com John McDonnell john.v.mcdonnell@gmail.com Jon Sedar jon.sedar@applied.ai Jordi Diaz Jordi Warmenhoven jordi.warmenhoven@gmail.com Karlson Pfannschmidt kiudee@mail.uni-paderborn.de Kyle Bishop citizenphnix@gmail.com Kyle Meyer kyle@kyleam.com Lin Xiao Mack Sweeney mackenzie.sweeney@gmail.com Matthew Emmett memmett@unc.edu Michael Gallaspy gallaspy.michael@gmail.com Nick nalourie@example.com Osvaldo Martin aloctavodia@gmail.com Patricio Benavente patbenavente@gmail.com Raymond Roberts Rodrigo Benenson rodrigo.benenson@gmail.com Sergei Lebedev superbobry@gmail.com Skipper Seabold chris.fonnesbeck@vanderbilt.edu Thomas Kluyver takowl@gmail.com Tobias Knuth mail@tobiasknuth.de Volodymyr Kazantsev Wes McKinney wesmckinn@gmail.com Zach Ploskey zploskey@gmail.com akuz for.akuz@gmail.com brandon willard brandonwillard@gmail.com dstuck dstuck88@gmail.com ingmarschuster ingmar.schuster.linguistics@gmail.com jan-matthis mail@jan-matthis.de jason JasonTam22@gmailcom kiudee quietdeath@gmail.com maahnman github@mm.maahn.de macgyver neil.rabinowitz@merton.ox.ac.uk mwibrow mwibrow@gmail.com olafSmits o.smits@gmail.com paul sorenson paul@metrak.com redst4r redst4r@web.de santon steven.anton@idanalytics.com sgenoud stevegenoud+github@gmail.com stonebig Tal Yarkoni tyarkoni@gmail.com x2apps x2apps@yahoo.com zenourn daniel@zeno.co.nz
Probabilistic programming allows for flexible specification of Bayesian statistical models in code. PyMC3 is a new, open-source probabilistic programmer framework with an intuitive, readable and concise, yet powerful, syntax that is close to the natural notation statisticians use to describe models. It features next-generation fitting techniques, such as the No U-Turn Sampler, that allow fitting complex models with thousands of parameters without specialized knowledge of fitting algorithms.
PyMC3 has recently seen rapid development. With the addition of two new major features: automatic transforms and missing value imputation, PyMC3 has become ready for wider use. PyMC3 is now refined enough that adding features is easy, so we don't expect adding features in the future will require drastic changes. It has also become user friendly enough for a broader audience. Automatic transformations mean NUTS and find_MAP work with less effort, and friendly error messages mean its easy to diagnose problems with your model.
Thus, Thomas, Chris and I are pleased to announce that PyMC3 is now in Beta.
- Transforms now automatically applied to constrained distributions
- Transforms now specified with a
transform=argument on Distributions.model.TransformedVaris gone. - Transparent missing value imputation support added with MaskedArrays or pandas.DataFrame NaNs.
- Bad default values now ignored
- Profile theano functions using
model.profile(model.logpt)
- A. Flaxman abie@alum.mit.edu
- Andrea Zonca code@andreazonca.com
- Andreas Klostermann andreasklostermann@googlemail.com
- Andrew Clegg andrew.clegg@pearson.com
- AustinRochford arochford@monetate.com
- Benjamin Edwards bedwards@cs.unm.edu
- Brian Naughton briannaughton@gmail.com
- Chad Heyne chadheyne@gmail.com
- Chris Fonnesbeck chris.fonnesbeck@vanderbilt.edu
- Corey Farwell coreyf@rwell.org
- John Salvatier jsalvatier@gmail.com
- Karlson Pfannschmidt quietdeath@gmail.com
- Kyle Bishop citizenphnix@gmail.com
- Kyle Meyer kyle@kyleam.com
- Mack Sweeney mackenzie.sweeney@gmail.com
- Osvaldo Martin aloctavodia@gmail.com
- Raymond Roberts rayvroberts@gmail.com
- Rodrigo Benenson rodrigo.benenson@gmail.com
- Thomas Wiecki thomas.wiecki@gmail.com
- Zach Ploskey zploskey@gmail.com
- maahnman github@mm.maahn.de
- paul sorenson paul@metrak.com
- zenourn daniel@zeno.co.nz