This GitHub repository illustrates the data preprocessing and feature extraction used in the paper by Oonk et al. (2025). Note that this code is intended purely to show how the features were extracted. More optimized software for loading, preprocessing, synchronizing, and analyzing soccer tracking and event data can be found in the databallpy package.
The main files in this repository are the notebooks. The first notebook, on preprocessing and feature extraction, illustrates the full process of data preprocessing and feature extraction. All code used there comes either from open-source packages or from the scripts folder. Because we are not allowed to share the data used in this project, the example uses an open-source tracking dataset, which can be found here. This dataset does not include event data. The second notebook covers the Multivariate Analysis. The third, the Training the Models notebook, covers the training and performance of the classifiers.
If you have any questions regarding the project, you can reach out to g.a.oonk@umcg.nl. Also be sure to check out the databallpy package.