This project is part of Digicamp 2025, co-taught with Amedeo Zappulla, where students are introduced to the fundamentals of data science applied to sports performance.
The focus is on race data analysis, teaching students how to clean, visualize, and interpret performance trends, and finally build a basic predictive model.
By the end of this course, students were able to:
- Understand the role of data analytics in sports.
- Perform data cleaning and preprocessing on raw race data.
- Create meaningful visualizations to highlight performance trends.
- Apply basic statistical and machine learning techniques to predict outcomes.
- Communicate insights effectively using plots and metrics.
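
As a rough illustration of the cleaning and preprocessing step, here is a minimal sketch using Pandas. The column names (`athlete`, `date`, `time_s`), the `clean_results` helper, and the sample rows are all hypothetical and are not taken from the course material; the real race data may be laid out differently.

```python
import pandas as pd

# Hypothetical raw results: names, columns, and values are invented for illustration.
raw = pd.DataFrame({
    "athlete": ["Athlete A"] * 5,
    "date": ["2024-04-14", "2024-05-05", "2024-05-05", "2024-06-02", "2024-06-23"],
    "time_s": ["11.42", "11.30", "11.30", "DNF", "11.18"],
})


def clean_results(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy: no duplicate rows, typed columns, no missing times."""
    out = df.drop_duplicates().copy()
    # Parse dates so results can be sorted and grouped by season later.
    out["date"] = pd.to_datetime(out["date"])
    # Non-numeric entries such as "DNF" become NaN instead of raising an error.
    out["time_s"] = pd.to_numeric(out["time_s"], errors="coerce")
    # Drop races without a valid finishing time.
    out = out.dropna(subset=["time_s"])
    return out.sort_values("date").reset_index(drop=True)


clean = clean_results(raw)
print(clean)
```

Dropping unfinished races is only one possible choice; depending on the question, it may be more useful to keep them and flag them in a separate column.
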
We worked with track & field race results (e.g., 100m sprint times). Students explored questions such as:
- How has an athlete’s performance changed over time?
- What trends can we identify in seasonal performance?
- Can we predict the outcome of a future race using past results?
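
The questions above could be approached along the following lines. This is a sketch under stated assumptions, not the model built in the course: the data is invented, the column names (`date`, `time_s`, `days`) are hypothetical, and a straight-line fit with Scikit-learn's `LinearRegression` is just one simple way to extrapolate past results.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical, already cleaned 100m results for one athlete over two seasons.
results = pd.DataFrame({
    "date": pd.to_datetime([
        "2023-05-07", "2023-06-18", "2023-07-30",
        "2024-05-05", "2024-06-16", "2024-07-28",
    ]),
    "time_s": [11.60, 11.52, 11.45, 11.40, 11.31, 11.25],
})

# Seasonal trend: best and average time per calendar year.
season = results.groupby(results["date"].dt.year)["time_s"].agg(["min", "mean"])

# Change over time: use days since the first race as the single feature.
first_race = results["date"].min()
results["days"] = (results["date"] - first_race).dt.days
model = LinearRegression().fit(results[["days"]], results["time_s"])

# Prediction: extrapolate the fitted line to a hypothetical future race date.
future = pd.DataFrame({"days": [(pd.Timestamp("2024-09-01") - first_race).days]})
predicted = model.predict(future)[0]

print(season)
print(f"trend: {model.coef_[0]:.5f} s/day, predicted time: {predicted:.2f} s")
```

A negative slope means the times are getting faster. A linear trend ignores effects such as season breaks, wind, and injuries, so the prediction should be read as a rough baseline rather than a forecast.
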

Tools used:
- Python
- Pandas – data manipulation
- Matplotlib / Seaborn – visualization
- Scikit-learn – building simple predictive models
- Jupyter Notebooks – interactive analysis