Uncertainty aware ensemble methods for predictive process monitoring

Scripts for the master's thesis: "Uncertainty aware ensemble methods for predictive process monitoring"

General:

Master's Thesis - Process Mining Codebase. This repository contains the code and datasets used for the process mining experiments conducted for my master's thesis. It covers event log preprocessing, process model discovery, and evaluation methods.

Installation

Clone the repository:

 git clone https://github.com/HenrikMader/Masterthesis.git

Create a virtual environment and activate it:

python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

Install the dependencies from requirements.txt:
```
pip install -r requirements.txt
```

Usage

Create the train and test split
- Go to the DataPreparation folder.
- Put the XES log file you want to make predictions for into RawData (the Sepsis Cases log is already there).
- Run the matching script (e.g. sepsis.py).
- A train file and a test file are created in RawData. Note that you need to create the validation file manually. We always took the last 15% of the traces from the train dataset.
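
The manual validation split described above could be scripted roughly as follows. This is a minimal sketch, not the code used for the thesis: it assumes the training log is loaded as a pandas DataFrame with one row per event, that the case identifier column is named `case:concept:name` (the usual XES key), and that "last" refers to the traces that appear last in the log order. The helper name `split_off_validation` and the demo log are hypothetical.

```python
import pandas as pd


def split_off_validation(train_df, case_col="case:concept:name", val_fraction=0.15):
    """Move the last `val_fraction` of traces from the training log into a validation log."""
    # Case ids in order of first appearance (assumes the log is sorted by time).
    case_ids = train_df[case_col].drop_duplicates().tolist()
    n_val = max(1, round(len(case_ids) * val_fraction))
    val_cases = set(case_ids[-n_val:])

    # Split on whole traces so that no case ends up in both files.
    is_val = train_df[case_col].isin(val_cases)
    return train_df[~is_val].copy(), train_df[is_val].copy()


# Tiny made-up log: 20 cases with 2 events each.
demo = pd.DataFrame({
    "case:concept:name": [f"case_{i:02d}" for i in range(20) for _ in range(2)],
    "concept:name": ["A", "B"] * 20,
})
train_part, val_part = split_off_validation(demo)
print(train_part["case:concept:name"].nunique(), val_part["case:concept:name"].nunique())
```

Splitting by case id rather than by row keeps every trace intact, so prefixes of the same case never leak between the train and validation files.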
Run the main script
- Navigate to main.py.
- Change the train, test and validation paths.
- Important: If you want to use models that are already trained (e.g. from Google Drive), you need to load them explicitly.
Do this with:
```
cnn_model = torch.load("./path_to_your_model")
```
instead of this: `cnn_model = train(num_epochs_cnn, cnn_model, train_loader, val_loader, learning_rate_cnn)`
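
As a small, self-contained illustration of that swap, the sketch below saves a stand-in model and loads it back. It is not taken from the repository: the `nn.Linear` model and the file name are placeholders, and it assumes the model was saved as a whole object with `torch.save(model, path)`. Recent PyTorch releases default `torch.load` to `weights_only=True`, so loading a fully pickled model needs `weights_only=False` (only do this for files you trust).

```python
import os
import tempfile

import torch
from torch import nn

# Placeholder for a trained model; the real model classes live in models.py.
trained_model = nn.Linear(4, 1)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "cnn_model.pt")
    torch.save(trained_model, model_path)

    # Load the whole pickled model instead of calling train(...).
    cnn_model = torch.load(model_path, map_location="cpu", weights_only=False)

# The loaded model gives the same output as the one that was saved.
example_input = torch.ones(2, 4)
print(torch.allclose(cnn_model(example_input), trained_model(example_input)))
```

When loading a whole model this way, the class definitions it was saved with must be importable, so run the load from the repository root where models.py is available.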

Tune the hyperparameters as needed. Important: numberOfRuns sets how often the ensembles produce results across the learning rates; select 1 if this should run only once. Set printing = True if you want to create plots (e.g. the plots of the Accuracy Rejection Curves).
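
For readers unfamiliar with accuracy rejection curves, here is a minimal sketch of the idea for a regression target, using MAE as the metric: predictions are sorted by their uncertainty, the most uncertain share is rejected, and the error is computed on what remains. This illustrates the general technique only; the function name, the toy inputs and the rejection rates are assumptions and not the plotting code in this repository.

```python
def mae_rejection_curve(y_true, y_pred, uncertainty, rejection_rates):
    """Return (rate, MAE) pairs after rejecting the most uncertain predictions."""
    # Absolute errors ordered from most certain to least certain prediction.
    order = sorted(range(len(y_true)), key=lambda i: uncertainty[i])
    abs_errors = [abs(y_true[i] - y_pred[i]) for i in order]

    curve = []
    for rate in rejection_rates:
        # Keep the (1 - rate) share with the lowest uncertainty, at least one sample.
        n_keep = max(1, round(len(abs_errors) * (1 - rate)))
        kept = abs_errors[:n_keep]
        curve.append((rate, sum(kept) / len(kept)))
    return curve


# Made-up toy data: the two largest errors also have the highest uncertainty.
y_true = [10.0, 12.0, 8.0, 15.0]
y_pred = [11.0, 12.5, 12.0, 9.0]
uncertainty = [0.2, 0.1, 0.8, 0.9]
curve = mae_rejection_curve(y_true, y_pred, uncertainty, [0.0, 0.25, 0.5])
print(curve)
```

If the uncertainty estimates are informative, the MAE should fall as the rejection rate grows, which is the case for the toy data here.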

Output files

Once main.py has finished, several files will have been created.

train.pt, val.pt and test.pt are the files from the encoding. Encoding large datasets takes a while (e.g. on 2012), so you can load the saved files in dataPreparation.py instead of encoding again. For example,

instead of: `train_dataset = creatingTensorsTrainingAndTesting(train_df, feature_to_index, sliding_window=slidingWindow)` say: `train_dataset = torch.load("./train.pt")`
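
One way to avoid editing the script by hand each time is a small load-or-encode helper. The following is a sketch under assumptions rather than repository code: `load_or_encode` and `fake_encoding` are hypothetical names, a `TensorDataset` stands in for whatever `creatingTensorsTrainingAndTesting` returns, and `weights_only=False` is used on the assumption that the saved file holds a pickled dataset object rather than plain tensors.

```python
import os
import tempfile

import torch
from torch.utils.data import TensorDataset


def load_or_encode(path, encode_fn):
    """Load a cached dataset from `path`, or build it with `encode_fn` and cache it."""
    if os.path.exists(path):
        # Only load files you created yourself: this unpickles arbitrary objects.
        return torch.load(path, weights_only=False)
    dataset = encode_fn()
    torch.save(dataset, path)
    return dataset


calls = []


def fake_encoding():
    # Stand-in for the slow encoding step; records how often it runs.
    calls.append(1)
    return TensorDataset(torch.arange(6.0).reshape(3, 2), torch.tensor([1.0, 2.0, 3.0]))


with tempfile.TemporaryDirectory() as tmp_dir:
    cache_path = os.path.join(tmp_dir, "train.pt")
    first = load_or_encode(cache_path, fake_encoding)   # encodes and saves
    second = load_or_encode(cache_path, fake_encoding)  # loads from disk

print(len(calls), len(second))
```

Remember to delete the cached files whenever the raw log, the feature set or the sliding window changes, otherwise stale encodings will be reused.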

The models from the ensemble (e.g. EnsembleNoUnc). These files are less important, because the ensembles train quickly.
results_lstm_cnn_baseline.txt: This file contains all of the MAE results across the different models and learning rates.

Reading results

For every learning rate you will see different results, from the base models and from the ensembles (Average, Regression, Simple MLP and Complex MLP). MLP means that a neural network (multilayer perceptron) produces the ensemble result. The first words of a line always indicate which model was run. E.g.:

MLP Simple not trained and not given: 7.65

means that the Simple MLP ensemble that was not trained under the heteroscedasticity and Bayes assumption and was not given the uncertainty, i.e. the (no, no) ensemble, reached an MAE of 7.65.

MLP Simple trained and not given: xxx

means that the MLP was trained with heteroscedasticity and Bayes, but the uncertainty information was not given to it. This is the (yes, no) ensemble.

MLP Simple trained and given uncertainty

is the (yes, yes) ensemble group.
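
To compare many result lines at once, they can be collected into a dictionary. The sketch below assumes that every result line has the form `label: value`, as in the example above, and skips anything else; the exact layout of results_lstm_cnn_baseline.txt may differ, and `parse_result_lines` is a hypothetical helper.

```python
def parse_result_lines(lines):
    """Collect 'label: value' lines into a dict of label -> MAE."""
    results = {}
    for line in lines:
        label, sep, value = line.rpartition(":")
        if not sep:
            continue  # no colon, e.g. a heading or a blank line
        try:
            results[label.strip()] = float(value)
        except ValueError:
            continue  # the part after the colon is not a number
    return results


sample = [
    "MLP Simple not trained and not given: 7.65",
    "",
    "a line without a colon",
    "MLP Simple trained and not given: xxx",
]
parsed = parse_result_lines(sample)
print(parsed)
```

Because labels are used as keys, a label that repeats for several learning rates would overwrite earlier values, so parse each learning-rate section separately if the file repeats labels.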
