dkotzamp/ML-project-MSc

Final project in Methods in Bioinformatics

Two datasets were selected for this project: the Heart Failure Clinical Records dataset (source) and the Mice Protein Expression dataset (source). The first contains clinical and epidemiological records of 299 patients with heart failure, including their clinicopathological characteristics. The second comprises expression levels of 77 proteins measured in the cerebral cortex of control (n=38) and Down syndrome (n=34) mice exposed to context fear conditioning, with the mice grouped into eight classes.

The primary objectives of this study were:

  1. Implementation of a classification pipeline
  2. Comprehensive data analysis and result interpretation

The classification pipeline consisted of the following stages:

  1. Preprocessing: handling of missing values (where applicable) and other relevant transformations.
  2. Model comparison: evaluation of different classification algorithms.
  3. Optimization: hyperparameter tuning to improve the performance of the selected model.
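The preprocessing and optimization stages can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn, uses synthetic data in place of the real datasets, and the choice of median imputation, standard scaling, an SVM, and the parameter grid are all assumptions made for the example. The helper names `make_toy_data` and `tune_pipeline` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_toy_data(n_samples=200, n_features=10, missing_rate=0.05, seed=0):
    """Synthetic stand-in for a real dataset, with some entries set to NaN."""
    X, y = make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=5,
        random_state=seed,
    )
    rng = np.random.default_rng(seed)
    X[rng.random(X.shape) < missing_rate] = np.nan  # simulate missing values
    return X, y


def tune_pipeline(X, y, seed=0):
    """Preprocess, then tune an SVM with a cross-validated grid search."""
    pipeline = Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # assumed imputation strategy
        ("scale", StandardScaler()),
        ("clf", SVC()),
    ])
    # Assumed search space; a real study would choose this per model.
    param_grid = {"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]}
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")
    search.fit(X, y)
    return search


X, y = make_toy_data()
search = tune_pipeline(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Keeping the imputer and scaler inside the `Pipeline` means they are refitted on each training fold, so no information from the validation fold leaks into preprocessing.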

The classification algorithms used in this study were Decision Trees, k-Nearest Neighbors (k-NN), Support Vector Machines (SVM), linear classifiers trained with Stochastic Gradient Descent (SGD), Logistic Regression, and Artificial Neural Networks (ANNs).
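A model comparison over these six algorithm families might look like the sketch below. It assumes scikit-learn and synthetic data; the specific estimator classes, their settings, and the use of 5-fold cross-validated accuracy are illustrative assumptions rather than the project's documented setup. The helper names `candidate_models` and `compare_models` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def candidate_models(seed=0):
    """One estimator per algorithm family, with assumed default-like settings."""
    return {
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "k-NN": KNeighborsClassifier(n_neighbors=5),
        "SVM": SVC(),
        "SGD": SGDClassifier(random_state=seed),
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "ANN": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                             random_state=seed),
    }


def compare_models(X, y, cv=5):
    """Return {model name: (mean accuracy, std)} from k-fold cross-validation."""
    results = {}
    for name, model in candidate_models().items():
        # Scale features inside the pipeline so each fold is scaled separately.
        pipeline = make_pipeline(StandardScaler(), model)
        scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
        results[name] = (scores.mean(), scores.std())
    return results


# Synthetic data stands in for the real datasets.
X, y = make_classification(n_samples=200, n_features=10, n_informative=5,
                           random_state=0)
results = compare_models(X, y)
for name, (mean, std) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{name:20s} {mean:.3f} +/- {std:.3f}")
```

The best-scoring model from a comparison like this would then be the one carried forward to hyperparameter tuning.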

For data analysis, visualization techniques and machine learning (ML) methods were applied to aid interpretation, and the results themselves were visualized to make them easier to understand.

In conclusion, this project has both theoretical and practical value, as it addresses key challenges in the analysis of real-world biological data.
