Skip to content

busera/rai__diabetes_analysis

Repository files navigation


PROJECT IN PROGRESS

Exploring Microsoft Responsible AI Toolkit with Diabetes Dataset

Project Description

This project applies the Microsoft Responsible AI toolkit to the scikit-learn Diabetes dataset, focusing on gaining practical experience with the toolkit and understanding its metrics in a real-world machine learning context. Key components explored include fairness assessment, interpretability techniques, and error analysis. The project aims to demonstrate responsible AI practices in healthcare-related machine learning, potentially improving model transparency and reducing bias in medical predictions.

Dataset Description

This project utilizes the Diabetes dataset from scikit-learn, a well-established dataset in the machine learning community. Here are key details about the dataset:

  • Source: The dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases.
  • Target: The target variable is a quantitative measure of disease progression one year after baseline.
  • Features: The dataset contains 10 baseline variables:
    • Age
    • Sex
    • Body Mass Index (BMI)
    • Average Blood Pressure
    • Six blood serum measurements
  • Samples: It contains 442 samples.
  • Task: The primary task is regression, predicting the quantitative measure of disease progression.
  • Ethical Considerations: As this dataset relates to health information, it's crucial to consider privacy and fairness implications in its use and analysis.

This dataset provides a realistic scenario for applying responsible AI practices, as healthcare applications often involve sensitive data and have significant real-world impacts.

Features

  • Utilizes the Microsoft Responsible AI toolkit for comprehensive model analysis
  • Applies responsible AI practices to the scikit-learn Diabetes dataset
  • Explores fairness assessment in healthcare-related machine learning
  • Implements interpretability techniques for model transparency
  • Conducts error analysis to improve model performance

Preparation and Installation

Preparing Conda Environment

conda env create -f environment.yaml
conda activate diabetes

In case the pip packages were not installed during the creation of the conda environment, just execute this command to verify the installation:

pip install raiwidgets interpret-community cookiecutter

About

Exploring and evaluating Microsoft's Responsible AI toolkit using the scikit-learn Diabetes dataset — practical experience in ethical machine learning practices, fairness assessment, interpretability, and error analysis.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors