This repository hosts the source code for the project titled "Text Classification for the Automation of Article Selection on the Effectiveness of Medical Interventions." This project was developed by Samuel Martey Okoe-Mensah as part of a Master of Linguistics thesis at the University of Bergen. It aims to automate the selection of medical articles by categorizing them as relevant or less relevant to systematic reviews, particularly focusing on interventions for dementia and cognitive decline.
The thesis explores machine learning models for automated text classification of articles from the PubMed and PubMed Central databases, with the goal of streamlining the systematic review process used in medical research. By automating part of the article screening step, the project is intended to help medical professionals conduct literature reviews more efficiently.
- Main Classify Abstracts Code.ipynb: Jupyter notebook for the main classification model.
- Ontology Preferred Label Groupings.ipynb: Jupyter notebook for handling ontology-based feature enrichment.
- classify_abstracts_new.py: Python script for the classification process.
- Clone the repository:

      git clone https://github.com/Tm-ui/Medical-intervention-text-classification.git

- Install required libraries:

      pip install -r requirements.txt

- Run the Jupyter notebooks or Python script as needed to replicate the study results or to apply the methodologies to new datasets.
- Python 3.8+
- Pandas
- NumPy
- Scikit-learn
- TensorFlow
- NLTK
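
Before running the notebooks, it can help to confirm that the environment matches the list above. The following is a minimal sketch, not part of the repository: the helper names `python_ok` and `missing_modules` are hypothetical, and the mapping assumes the usual import names for each library (for example, Scikit-learn is imported as `sklearn`). It only checks that each library can be found, not which version is installed.

```python
import importlib.util
import sys

# Usual import names for the libraries listed above (an assumption of this
# sketch; the import name can differ from the package name).
REQUIRED_MODULES = {
    "Pandas": "pandas",
    "NumPy": "numpy",
    "Scikit-learn": "sklearn",
    "TensorFlow": "tensorflow",
    "NLTK": "nltk",
}
MIN_PYTHON = (3, 8)


def python_ok(version_info=None):
    """Return True if the interpreter is at least Python 3.8."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor) against the minimum version.
    return tuple(version_info[:2]) >= MIN_PYTHON


def missing_modules(modules=None):
    """Return the display names of libraries that cannot be located."""
    if modules is None:
        modules = REQUIRED_MODULES
    # find_spec returns None when a top-level module is not installed.
    return [name for name, module in modules.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.8+ is required; found", sys.version.split()[0])
    missing = missing_modules()
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All listed libraries are importable.")
```

If any library is reported missing, rerunning `pip install -r requirements.txt` from the repository root is the first thing to try.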
For any queries related to the code or the methodologies used in this project, feel free to reach out:
- Email: sammy.okmens@gmail.com
- LinkedIn: Samuel M. Okoe-Mensah