ML Algorithms From Scratch

This repository contains a growing collection of Machine Learning algorithms implemented entirely from scratch using Python + NumPy.
Each implementation comes with a math explanation, clean code, visualizations where applicable, and a step-by-step notebook (mostly Kaggle Notebooks).

The goal of this repo is to fully understand how ML works under the hood, without relying on scikit-learn for the core logic.

Algorithm Roadmap

K-Nearest Neighbors
Linear Regression
Logistic Regression
Naive Bayes
Perceptron
Support Vector Machine
Decision Tree
Random Forest
Principal Component Analysis
K-Means Clustering
AdaBoost
Linear Discriminant Analysis

Each algorithm has its own folder containing:

A clean Jupyter notebook (.ipynb)
Math explanation
From-scratch implementation
Visualization (where applicable)
Small README inside the folder

Repository Structure

Machine-Learning-Algorithms-from-scratch/
│
├── KNN/
│   ├── knn.ipynb
│   └── README.md
│
├── LinearRegression/
│   ├── linear_regression.ipynb
│   └── README.md
│
├── LogisticRegression/
│   ├── logistic_regression.ipynb
│   └── README.md
│
├── NaiveBayes/
│   ├── naive_bayes.ipynb
│   └── README.md
│
├── Perceptron/
│   ├── perceptron.ipynb
│   └── README.md
│
├── SVM/
│   ├── svm.ipynb
│   └── README.md
│
├── DecisionTree/
│   ├── decision_tree.ipynb
│   └── README.md
│
├── RandomForest/
│   ├── random_forest.ipynb
│   └── README.md
│
├── PCA/
│   ├── pca.ipynb
│   └── README.md
│
├── KMeans/
│   ├── kmeans.ipynb
│   └── README.md
│
├── AdaBoost/
│   ├── adaboost.ipynb
│   └── README.md
│
├── LDA/
│   ├── lda.ipynb
│   └── README.md
│
└── README.md   ← Main README for the entire project

Learning Philosophy

This project follows one core rule:

Implement all Machine Learning algorithms manually using NumPy only.

Other libraries (like scikit-learn) are used only for:

Generating datasets
Testing models
Comparing results
Avoiding unnecessary boilerplate code
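
As a minimal sketch of this split, assuming a simple batch gradient descent linear regression, the model logic below uses NumPy only. The class name `LinearRegressionGD`, the learning rate, and the synthetic data are illustrative choices, not taken from the repository's notebooks. To keep the example dependency-free, the comparison step uses `np.linalg.lstsq` as the reference where a notebook might use scikit-learn.

```python
import numpy as np


class LinearRegressionGD:
    """Linear regression fitted by batch gradient descent (NumPy only).

    Hypothetical illustration; not the repository's actual class.
    """

    def __init__(self, lr=0.1, n_iters=2000):
        self.lr = lr            # step size (assumed value)
        self.n_iters = n_iters  # number of gradient steps (assumed value)
        self.weights = None
        self.bias = 0.0

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0.0
        for _ in range(self.n_iters):
            # Residuals of the current linear model
            error = X @ self.weights + self.bias - y
            # Gradients of the mean squared error w.r.t. weights and bias
            grad_w = (2.0 / n_samples) * (X.T @ error)
            grad_b = (2.0 / n_samples) * error.sum()
            self.weights -= self.lr * grad_w
            self.bias -= self.lr * grad_b
        return self

    def predict(self, X):
        return X @ self.weights + self.bias


# Synthetic data (made up for this sketch): y = 3*x1 - 2*x2 + 0.5 + noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([3.0, -2.0]) + 0.5 + rng.normal(scale=0.1, size=200)

model = LinearRegressionGD().fit(X, y)

# Reference solution: ordinary least squares with an explicit intercept column
A = np.column_stack([X, np.ones(len(X))])
ref, *_ = np.linalg.lstsq(A, y, rcond=None)

print("gradient descent:", model.weights, model.bias)
print("least squares:   ", ref[:2], ref[2])
```

If the from-scratch code is correct, the two printed parameter sets should agree closely, which is the kind of check a reference library is kept around for.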

This ensures a strong understanding of:

The math behind each algorithm
Vectorization concepts
Optimization behavior
Model evaluation techniques
Decision boundary intuition
Bias–variance dynamics
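
To make the vectorization point concrete, here is a small sketch that computes pairwise Euclidean distances (the core step of K-Nearest Neighbors) twice: once with explicit Python loops and once with NumPy broadcasting. The function names and random inputs are assumptions made for illustration only.

```python
import numpy as np


def pairwise_distances_loop(A, B):
    """Euclidean distances between rows of A and rows of B, using Python loops."""
    out = np.empty((A.shape[0], B.shape[0]))
    for i in range(A.shape[0]):
        for j in range(B.shape[0]):
            out[i, j] = np.sqrt(np.sum((A[i] - B[j]) ** 2))
    return out


def pairwise_distances_vectorized(A, B):
    """Same distances via broadcasting: (n, 1, d) - (1, m, d) -> (n, m, d)."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


rng = np.random.default_rng(42)
A = rng.normal(size=(5, 3))  # 5 query points with 3 features
B = rng.normal(size=(4, 3))  # 4 reference points with 3 features

d_loop = pairwise_distances_loop(A, B)
d_vec = pairwise_distances_vectorized(A, B)
print(np.allclose(d_loop, d_vec))
```

Both functions return an array of shape `(5, 4)`. The broadcast version trades extra memory for the intermediate `(n, m, d)` array in exchange for avoiding the Python-level loops.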

Requirements

NumPy — core implementation of all algorithms

Matplotlib — visualizations

Pandas — optional (for loading datasets)

Scikit-learn — only for dataset generation + evaluation

⚠️ Only NumPy is used to implement the algorithms themselves.

How to Run Algorithms

If running as Python modules:

python -m mlfromscratch.

Example:

python -m mlfromscratch.linear_regression
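
The layout of the `mlfromscratch` package is not shown in this README, so the following is only a hypothetical sketch of how a module can be made runnable with `python -m`. The file name `mlfromscratch/knn.py`, the `KNN` class, and the demo data are all assumptions. The key piece is the `if __name__ == "__main__":` guard, which runs the demo only when the module is executed directly.

```python
"""Hypothetical mlfromscratch/knn.py: runnable via `python -m mlfromscratch.knn`."""
from collections import Counter

import numpy as np


class KNN:
    """K-Nearest Neighbors classifier using NumPy (illustrative sketch)."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # KNN is a lazy learner: fitting only stores the training data
        self.X_train = np.asarray(X, dtype=float)
        self.y_train = np.asarray(y)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Distance from every query point to every training point
        dists = np.linalg.norm(X[:, None, :] - self.X_train[None, :, :], axis=2)
        # Indices of the k closest training points for each query point
        nearest = np.argsort(dists, axis=1)[:, : self.k]
        # Majority vote among the k neighbors
        return np.array(
            [Counter(self.y_train[idx].tolist()).most_common(1)[0][0] for idx in nearest]
        )


def main():
    # Two well-separated synthetic clusters (made up for this demo)
    rng = np.random.default_rng(0)
    X = np.vstack([
        rng.normal(loc=-2.0, size=(50, 2)),
        rng.normal(loc=2.0, size=(50, 2)),
    ])
    y = np.array([0] * 50 + [1] * 50)
    model = KNN(k=5).fit(X, y)
    preds = model.predict(np.array([[-2.0, -2.0], [2.0, 2.0]]))
    print("predictions:", preds)
    return preds


if __name__ == "__main__":
    main()
```

Importing the module from a notebook leaves `main()` uncalled, so the same file can serve both as a library and as a runnable demo.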

Or simply open the corresponding notebook (recommended):

Kaggle Notebook version

Local Jupyter Notebook version

Why This Repository?

Build strong intuition for Machine Learning

Understand how algorithms work internally

Practice clean, modular, NumPy-based implementations
