Production

Description

This module provides an overview of the Design of Machine Learning Systems embedded in data-intensive products and applications. It covers the fundamental components of the infrastructure, systems, and methods necessary to implement and maintain Machine Learning (ML) models in production. In short, we will learn techniques for building an ML model factory.

The module has two components:

  • A discussion of the main issues and challenges faced in production, together with some approaches to address them.
  • A live lab with demonstrations of implementation techniques.

The module covers the following areas:

  • Data engineering.
  • Feature engineering.
  • Hyperparameter tuning.
  • Model deployment.
  • Model explainability.
  • Logging, experiment tracking, and monitoring.

We will discuss the tools and techniques required to do the above in good order and at scale. However, we will not discuss the inner workings of models, their relative advantages, or the theoretical aspects of feature engineering and hyperparameter tuning. The focus is on tools and reproducibility.

This module follows the contents of Designing Machine Learning Systems, by Chip Huyen.

Learning Outcomes

By the end of this module, participants will be able to:

  • Describe the main components of a machine learning system.
  • Explain the infrastructure required to train and test models in production.
  • Implement an experiment tracking system and logging.
  • Contrast and evaluate different approaches to storing and manipulating data.
  • Design data flows and processes to automate the construction of ML models.

Contacts

Questions can be submitted to the #cohort-8-help channel on Slack.

Delivery of the Learning Module

This module will include live learning sessions and optional, asynchronous work periods. During live learning sessions, the Technical Facilitator will introduce and explain key concepts and demonstrate core skills. Learning is facilitated during this time. Before and after each live learning session, the instructional team will be available to answer questions about the module's core concepts. Optional work periods are for seeking help from peers and the Learning Support team, and for working through the homework and assignments in the learning module with access to live help. Content is not facilitated during work periods; this time should be driven by participants. We encourage participants to come to these work periods with questions and problems to work through.

Participants are encouraged to engage actively during the learning module. The key to developing the core skills in each learning module is practice. The more participants code alongside the instructional team and apply these skills in each module, the more likely they are to solidify them.

Schedule

Session   Date                     Topic
1         Tue., Jan. 13, 2026      ML System Design
2         Wed., Jan. 14, 2026      Data Engineering Fundamentals
3         Thur., Jan. 15, 2026     Working with Training Data
--        Fri., Jan. 16, 2026      Work Period
--        Sat., Jan. 17, 2026      Work Period
--        Sun., Jan. 18, 2026      Submission deadline for Quizzes 1-3
--        Mon., Jan. 19, 2026      Submission deadline for Assignment 1
4         Tue., Jan. 20, 2026      Feature Engineering
5         Wed., Jan. 21, 2026      Model Development and Evaluation
6         Thur., Jan. 22, 2026     Model Explanations and Monitoring
--        Fri., Jan. 23, 2026      Work Period
--        Sat., Jan. 24, 2026      Work Period
--        Sun., Jan. 25, 2026      Submission deadline for Quizzes 4-6
--        Mon., Jan. 26, 2026      Submission deadline for Assignment 2

Requirements

  • Participants are expected to have completed the Shell, Git, and Python learning modules.
  • Participants are encouraged to ask questions and collaborate with others to enhance their learning experience.
  • Participants must have a computer and an internet connection to participate in online activities.
  • Participants must have VS Code installed with the Jupyter and Python extensions.
  • Participants must install Docker, as this module uses a Docker backend that runs a PostgreSQL server. This is intended to mimic a production-like environment. Participants may use SQLite if Docker is not an option.
  • Participants must not use generative AI such as ChatGPT to generate code in order to complete assignments. It may be used only as a support tool to help you find answers to questions you may have.
  • We expect participants to have completed the steps in the onboarding repo.
  • We encourage participants to default to having their camera on at all times, and turning the camera off only as needed. This will greatly enhance the learning experience for all participants and provide real-time feedback for the instructional team.

Assessment

Your performance on this module will be assessed using six quizzes and two assignments.

Quizzes

Quizzes will help you build key concepts in data science, data engineering, and machine learning engineering. Historically, learners take 5-10 minutes to complete each quiz and achieve an average score above 80%.

  • Each quiz will contain material from each live learning session.
  • You will receive a link to each quiz during the respective live learning session. The links are personalized; please do not share them. If you did not receive a link, contact any member of the course delivery team. Each quiz will contain approximately 10 questions of various types, including true/false, multiple-choice, and simple selection.
  • All quizzes are mandatory and should be submitted by their due date.
  • The quizzes will remain open until their respective due dates, after which you will not have access to them.

Assignments

Assignments will help you develop coding and debugging skills. They will cover foundational skills and will extend to advanced concepts. We recommend that you attempt all assignments and submit your work, even if it is incomplete (partial submissions will earn partial marks).

  • Each assignment should be submitted using the usual method in DSI via a Pull Request.

  • The assignments and their respective rubrics are:

Grades

All participants will receive a pass or fail mark. For this course, a score of at least 60 points out of 100 is required to receive a "pass" mark. The score will be determined as follows:

  • Quizzes' average score - 60%
  • Assignment 1 - 20%
  • Assignment 2 - 20%

Each assignment's assessment is converted to a numeric grade as follows:

  • Complete - 100 points
  • Incomplete / Partially Complete - 50 points
  • Missing / Not submitted - 0 points

For example, a learner with the following grades would receive "pass":

  • Quizzes 80
  • Assignment 1 - Complete (100)
  • Assignment 2 - Incomplete (50)
  • (0.6 * 80) + (0.2 * 100) + (0.2 * 50) = 48 + 20 + 10 = 78 > 60

A different learner with grades as shown below would receive "fail":

  • Quizzes 80
  • Assignment 1 - Incomplete (50)
  • Assignment 2 - Missing (0)
  • (0.6 * 80) + (0.2 * 50) + (0.2 * 0) = 48 + 10 + 0 = 58 < 60
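
The same calculation can be sketched in Python. This is only an illustration of the weighting described above, not an official grading tool: the names `ASSIGNMENT_POINTS`, `final_score`, and `mark` are hypothetical, and the sketch assumes that a score of exactly 60 counts as a pass.

```python
# Illustrative sketch only: these names are hypothetical and not part of the course repo.

# Numeric grade for each assignment status, as listed in the Grades section.
ASSIGNMENT_POINTS = {"complete": 100, "incomplete": 50, "missing": 0}

# Assumption: a score of exactly 60 is enough to pass.
PASS_THRESHOLD = 60


def final_score(quiz_average, assignment_1, assignment_2):
    """Weighted score: quizzes 60%, each assignment 20%."""
    a1 = ASSIGNMENT_POINTS[assignment_1]
    a2 = ASSIGNMENT_POINTS[assignment_2]
    # Integer weights divided by 100 avoid floating-point rounding surprises.
    return (60 * quiz_average + 20 * a1 + 20 * a2) / 100


def mark(score):
    """Return "pass" or "fail" for a weighted score."""
    return "pass" if score >= PASS_THRESHOLD else "fail"


if __name__ == "__main__":
    first = final_score(80, "complete", "incomplete")
    second = final_score(80, "incomplete", "missing")
    print(first, mark(first))    # 78.0 pass
    print(second, mark(second))  # 58.0 fail
```

The two calls reproduce the worked examples above: 78 points for the first learner and 58 points for the second.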

Resources

ML Models

Books that mainly discuss learning methods, their applications, and limitations.

ML Engineering

Books that discuss how to put learning methods in production, including training, deployment, and monitoring, as well as more architecture-oriented references.

  • Burkov. Machine Learning Engineering. Similar to Burkov's book above, but for ML Engineering.
  • Kuhn and Silge. Tidy Modeling with R. The book implements its examples in R, but the discussion about models, evaluation techniques, and pipelines is highly worthwhile. Both Julia Silge and Max Kuhn are brilliant data scientists and great communicators.
  • Kleppmann. Designing Data-Intensive Applications. A great in-depth resource about the techniques for building data-intensive applications.

Advanced Topics

References for specific topics like Feature Engineering, Conformal Prediction, and Model Interpretability.

Online Resources and Repositories

Videos

Folder Structure

.
├── .github
├── 01_materials
├── 02_activities
├── 03_instructional_team
├── 04_this_cohort
├── 05_src
├── .gitignore
├── LICENSE
├── SETUP.md
├── pyproject.toml
└── README.md
  • .github: Contains issue templates and pull request templates for the repository.
  • 01_materials: Module slides and interactive notebooks (.ipynb files) used during learning sessions.
  • 02_activities: Contains graded assignments, exercises, and homework to practice concepts covered in the learning module.
  • 03_instructional_team: Resources for the instructional team.
  • 04_this_cohort: Additional materials and resources for this cohort.
  • 05_src: Source code, databases, logs, and required dependencies (requirements.txt) needed during the module.
  • .gitignore: Files to exclude from this folder, specified by the Technical Facilitator.
  • LICENSE: The license for this repository.
  • SETUP.md: Contains the steps required to set up this repo for the module.
  • pyproject.toml: Tells Python which packages this repo needs to run.
  • README.md: This file.
