This model predicts whether a woman of Pima Indian heritage who is at least 21 years old has diabetes.
The data set used here is provided by the National Institute of Diabetes and Digestive and Kidney Diseases.
The objective is to predict, based on diagnostic measurements, whether a patient has diabetes.
The factors considered are:
- Pregnancies: Number of times pregnant
- Glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance test
- BloodPressure: Diastolic blood pressure (mm Hg)
- SkinThickness: Triceps skin fold thickness (mm)
- Insulin: 2-Hour serum insulin (mu U/ml)
- BMI: Body mass index (weight in kg/(height in m)^2)
- DiabetesPedigreeFunction: Diabetes pedigree function
- Age: Age (years)
If the Outcome is 0, the patient does not have diabetes; otherwise the patient has diabetes.
I used supervised machine learning to predict whether a patient has diabetes.
The K-Nearest Neighbours (KNN) algorithm is used for the prediction, with 5 nearest neighbours. The training set is 80% of the original data set and the test set is the remaining 20%.
The accuracy using the KNN algorithm is 70.77%.
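The split-and-classify steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used for this project: the helper names `train_test_split` and `knn_predict` are hypothetical, the rows are randomly generated toy data (not the real data set), and Euclidean distance with a simple majority vote is an assumption.

```python
import math
import random
from collections import Counter


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and split them into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, features, k=5):
    """Predict a label by majority vote among the k nearest training rows."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy, made-up rows of (features, outcome); NOT the real data set.
# The two features stand in for hypothetical (Glucose, BMI) values.
rng = random.Random(42)
rows = []
for _ in range(50):
    rows.append(((rng.gauss(100, 10), rng.gauss(25, 3)), 0))  # outcome 0
    rows.append(((rng.gauss(160, 10), rng.gauss(35, 3)), 1))  # outcome 1

# 80% of the rows for training, 20% for testing, 5 nearest neighbours.
train, test = train_test_split(rows, test_fraction=0.2)
correct = sum(knn_predict(train, x, k=5) == y for x, y in test)
accuracy = correct / len(test)
print(f"train={len(train)} test={len(test)} accuracy={accuracy:.2%}")
```

With a library such as scikit-learn, the same idea is available through `KNeighborsClassifier(n_neighbors=5)`; the toy accuracy printed here says nothing about the accuracy on the real data.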
The confusion matrix is visualized using a heatmap.
Its values are:
- True Positive: 29
- True Negative: 80
- False Positive: 30
- False Negative: 15
TERMS AND THEIR MEANING:
- True Positive: The person has the disease and the model predicted it correctly
- True Negative: The person doesn't have the disease and the model predicted it correctly
- False Positive: The person doesn't have the disease but the model predicted that they do
- False Negative: The person has the disease but the model predicted that they don't
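
As a rough check, the four terms above can be counted from true and predicted labels, and the reported accuracy can be recomputed from the four confusion matrix values. The sketch below assumes labels are encoded as 1 (has diabetes) and 0 (does not); `confusion_counts` is a hypothetical helper, and the short label lists are made up for illustration. Precision and recall are not reported in this project and are shown only as quantities that follow from the same four counts.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels where 1 means 'has diabetes'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, tn, fp, fn


# Tiny made-up example, only to show how each term is counted.
y_true = [1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
toy_counts = confusion_counts(y_true, y_pred)

# The four values reported for this model's test set.
tp, tn, fp, fn = 29, 80, 30, 15
total = tp + tn + fp + fn

accuracy = (tp + tn) / total   # share of all predictions that are correct
precision = tp / (tp + fp)     # share of predicted positives that are real
recall = tp / (tp + fn)        # share of real positives that are found

print(f"total={total} accuracy={accuracy:.4f}")
print(f"precision={precision:.4f} recall={recall:.4f}")
```

The four values sum to 154 test samples, and (29 + 80) / 154 is roughly 0.708, in line with the accuracy reported above.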