The code in this repository uses machine learning to identify patterns in data from The Cancer Genome Atlas (TCGA) from the National Institutes of Health (NIH). Specifically, I wrote a k-means clustering algorithm that identifies subtypes of patients with pancreatic cancer. I implemented the algorithm from scratch for my Machine Learning course, without using the standard scikit-learn (sklearn) package, to demonstrate my understanding of it. In later analysis, I used a Kruskal-Wallis test to identify genes that are differentially expressed between the subtypes.
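To illustrate the two techniques named above, here is a minimal sketch in plain Python. It is not the code from this repository: the function names (`kmeans`, `kruskal_h`), the random-sample initialization, the fixed seed, and the toy data are all assumptions made for illustration. The Kruskal-Wallis function computes only the H statistic and applies no tie correction. In practice a p-value would come from comparing H against a chi-square distribution with (number of groups - 1) degrees of freedom, which is what a library routine such as `scipy.stats.kruskal` reports.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's k-means. Returns (labels, centroids).

    Assumption: centroids start as k distinct points sampled at random.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def kruskal_h(groups):
    """Kruskal-Wallis H statistic (average ranks for ties, no tie correction)."""
    pooled = sorted((v, g) for g, vals in enumerate(groups) for v in vals)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for _, g in pooled[i:j]:
            rank_sums[g] += avg_rank
        i = j
    total = sum(r * r / len(vals) for r, vals in zip(rank_sums, groups))
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Toy data (hypothetical): rows are patients, columns are two genes.
expression = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = kmeans(expression, k=2)

# Compare one gene's expression across the clusters that k-means found.
gene0_by_cluster = [
    [row[0] for row, lab in zip(expression, labels) if lab == c]
    for c in range(2)
]
h_stat = kruskal_h(gene0_by_cluster)
```

Running `kruskal_h` once per gene and ranking genes by the resulting statistic is one way to shortlist candidates that differ between subtypes. With many genes, the p-values would also need a multiple-testing correction.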
beneopp/TCGA-Analysis