Process Improvement using Data

A free textbook on the statistics that engineers actually need, written and continuously refined in industry-facing classrooms since 2010.

Read the book

The book is free to read online and free to download. You do not need this repository to read it:

Read online: https://learnche.org/pid
Download the PDF: PID.pdf

This repository holds the book's source. It is here for people who want to report a problem, contribute a correction, or build the book themselves. See Contributing below.

Why this book exists

There is no other free, coherent text that covers what engineers and scientists actually do with process data (visualization, regression, designed experiments, process monitoring, and multivariate / latent-variable methods) in one volume.

Most textbooks pick one of those topics and go deep. Practitioners need all of them, and need to see how they fit together, because real industrial problems don't respect chapter boundaries. Process Improvement using Data was written to fill that gap, and has been continuously refined in industry-facing classrooms and in industrial practice since 2010.

It is suitable for upper-undergraduate or introductory-graduate courses, and for self-study by working engineers and data scientists with a basic statistics background.

What's inside

Chapter	Topic and what you'll take away
1	Data visualization: how to look at data before you model it.
2	Univariate review: probability, distributions, confidence intervals, and hypothesis tests, refreshed with engineering examples.
3	Process monitoring: Shewhart, CUSUM, and EWMA charts, the toolbox that catches problems before they leave the plant.
4	Least-squares modelling: linear and multiple regression, from first principles through honest diagnostics.
5	Design and analysis of experiments: factorial, fractional factorial, and response-surface designs; learning the most from the fewest runs.
6	Latent variable modelling: PCA, PLS, and batch data analysis, for turning high-dimensional process data into actionable insight.
7	Product development and product improvement: combining DOE and latent variable methods to develop new products and improve existing ones.
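
To give a flavour of the Chapter 3 material, here is a minimal sketch of two ideas named in the table: Shewhart-style limits and an EWMA-smoothed series. It uses only the Python standard library and is not the `process-improve` API or the book's own code. The function names, the 3-sigma default, the smoothing weight of 0.2, and all data values are assumptions made up for illustration. A full EWMA chart would also have its own control limits, which this sketch omits.

```python
from statistics import mean, stdev


def shewhart_limits(reference, n_sigma=3.0):
    """Return (lower, centre, upper) limits estimated from in-control reference data."""
    centre = mean(reference)
    spread = stdev(reference)  # sample standard deviation of the reference data
    return centre - n_sigma * spread, centre, centre + n_sigma * spread


def ewma(values, lam=0.2, start=None):
    """Exponentially weighted moving average: z_t = lam * x_t + (1 - lam) * z_{t-1}."""
    z = values[0] if start is None else start
    smoothed = []
    for x in values:
        z = lam * x + (1.0 - lam) * z
        smoothed.append(z)
    return smoothed


# Hypothetical in-control reference data and new observations (made-up numbers).
reference = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0, 10.1, 9.9]
new_points = [10.0, 10.1, 11.2, 9.9]

lower, centre, upper = shewhart_limits(reference)

# Flag any new observation that falls outside the Shewhart-style limits.
flags = [not (lower <= x <= upper) for x in new_points]

# Smooth the new observations, starting the EWMA at the reference centre line.
smoothed = ewma(new_points, lam=0.2, start=centre)

print(flags)
```

With these made-up numbers only the third new observation (11.2) lies outside the limits. Estimating the spread directly from individual observations is a simplification; subgrouped data would call for a different estimate.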

Companion software: `process-improve`

Every method in this book has a worked, production-grade implementation in the open-source Python package `process-improve`. It provides PCA and PLS with proper outlier diagnostics and prediction intervals, control charts, designed experiments, and batch process monitoring. Install it with `pip install process-improve` and run the exercises in any Jupyter notebook.
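
Before opening a notebook, it can help to confirm that the package is visible to the Python environment the notebook uses. The sketch below uses only the standard library's `importlib.metadata`. It assumes the distribution name matches the `pip install` name given above, and the helper `installed_version` is a hypothetical name introduced here for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Assumption: the distribution name is the same as the pip install name.
found = installed_version("process-improve")
if found is None:
    print("process-improve is not installed; run: pip install process-improve")
else:
    print(f"process-improve {found} is installed")
```

Running this in a notebook cell checks the notebook's own kernel, which is not always the environment where `pip` was run.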

Who's using this book

The book is adopted in university courses, cited in graduate research, and used inside companies as internal training material. A few course adoptions:

Western University, Canada: required text for the graduate course CBE 9190: Advanced Statistical Process Analysis.
UNSW Sydney, Australia: recommended text for CEIC6789: Data-driven Decision Making in Chemical Engineering and Food Science.
McMaster University, Canada: IBEHS 4C03: Statistical Methods for Biomedical Engineering is built on the book's foundations and adapts them into JupyterLab notebooks.

It is also cited in graduate theses and peer-reviewed research across a range of fields, from chemometrics and semiconductor manufacturing to public health and tribology.

Teaching or training with the book? Tell us via Discussions. We'd be glad to list your course here.

For instructors

You're welcome to use this book, and the course materials below, for your own teaching. Everything is licensed under CC BY-SA 4.0, so you can share, adapt, and even commercialize derivative work as long as you attribute the original and license the result under the same terms. No permission needed.

Course materials live on the original Learning Chemical Engineering: Courses site:

Suggested course structure
PDF slides covering every section of the book
Assignments (with solutions)
Midterms / tests
Final exams
Projects for response surface optimization and design of experiments
A tutorial to learn R
Video recordings of the course on YouTube
Sample datasets for assignments, tests, and practice

Teaching at a company? Ask via GitHub Discussions for additional slides, worksheets, and tips.

Questions, comments, or "how did you make that figure?" enquiries are all welcome there too.

Contributing

Contributions, corrections, and exercises are welcome from anyone: students, instructors, and practitioners alike. The book has been improved continuously since 2010 thanks to readers like you. The fastest channels:

Open an issue for typos, technical errors, broken links, or build problems.
Open a pull request for content changes.
Use Discussions for adoption stories, teaching ideas, and long-form feedback, or this Google Form if you prefer.

CONTRIBUTING.md has everything a contributor needs: the contribution workflow, how to build the book locally, the repository layout, the RST style notes, and how the book is published.

License and citation

The book is licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license. You are free to copy, adapt, and redistribute it (including for courses you teach) provided you attribute the original author and license your derivative work under the same terms.

Suggested attribution:

Dunn, K. G. (2010–2026). Process Improvement using Data (CC BY-SA 4.0). Zenodo. https://doi.org/10.5281/zenodo.20284934

Each tagged release is archived on Zenodo. The DOI above is the concept DOI: it always resolves to the latest archived edition. Machine-readable citation metadata, including the DOI, is in CITATION.cff.

Privacy and readership data

The HTML edition at https://learnche.org/pid records aggregate, cookieless pageview and search-query signals so the maintainer can tell which sections need attention. No cookies are set, no IP addresses are stored, no third-party trackers are loaded, and the browser Do Not Track setting is honoured. Self-hosted copies of this book do not phone home.

The reader-facing summary lives at https://learnche.org/pid/privacy (source: privacy.rst). The aggregated dashboards (top pages, per-page 90-day sparklines, search queries) are themselves public at https://learnche.org/_stats/ in keeping with the open spirit of the book. Engineering and operations docs are under docs/telemetry/.

