Data-Science-from-Scratch

This repository contains my implementations (in Python 3.7) of the algorithms discussed in the book "Data Science From Scratch" by Joel Grus.

| File name | Python/IPython Notebooks | Description |
| --- | --- | --- |
| 1_Counting_clicker | .py/.ipynb | Count or track how many people have shown up for a class |
| 2_Visualizing_data | .py/.ipynb | Data visualization using the matplotlib library |
| 3_Vector_operations_on_data | .py/.ipynb | Linear algebra operations on data vectors |
| 4_Matrix_operations | .py/.ipynb | Creation and manipulation of matrices |
| 5_Statistics | .py/.ipynb | Statistical operations to understand the distribution of data |
| 6_Probability | .py/.ipynb | Understanding the data distribution |
| 7_Hypothesis_and_Inference | .py/.ipynb | Testing whether a certain hypothesis is likely to be true |
| 8_Gradient_descent | .py/.ipynb | Minimizing the error and estimating unknown parameters using gradient descent on the whole dataset or on mini-batches |
| 9_Working_with_data | .py/.ipynb | Basic operations including data histograms, correlation, dictionaries, NamedTuple, classes, and rescaling |
| 10_Principal_component_analysis | .py/.ipynb | Principal component analysis from scratch |
| 11_machine_learning | .py/.ipynb | Train/test data split and functions to evaluate a model's accuracy, precision, recall, and F1-score |
| 12_k-Nearest-Neighbors | .py/.ipynb | Implementation of the k-nearest neighbors algorithm from scratch in Python |
| 13_Naive_Bayes | .py/.ipynb | Naive Bayes classifier from scratch to identify words belonging to spam and non-spam (ham) emails |
| 14_Linear_Regression | .py/.ipynb | Linear regression from scratch using the closed-form solution and stochastic gradient descent |
| 15_Multiple_Regression | .py/.ipynb | Multiple regression from scratch using stochastic gradient descent, bootstrap computation of statistics, and ridge and lasso regularization |
| 16_Logistic_Regression | .py/.ipynb | Logistic regression from scratch, with precision and recall computed on the test data |
| 17_Decision_Trees | .py/.ipynb | Decision trees from scratch using the ID3 learning algorithm |
| 18_Neural_networks | .py/.ipynb | Neural network (including feed-forward and backpropagation) from scratch, with a "fizzbuzz" example used to train and test the network |
| 19_Deep_Learning | .py/.ipynb | Deep neural networks from scratch with various loss functions, optimization techniques, and dropout regularization, trained on FizzBuzz and MNIST data |
| 20_Clustering | .py/.ipynb | k-means and bottom-up hierarchical clustering from scratch |
| 21_nlp | .py/.ipynb | Popular natural language processing algorithms from scratch in Python, including bigrams, trigrams, topic modeling, word vectors, and recurrent neural networks |
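
To give a feel for the "from scratch" style used throughout, here is a minimal sketch of the evaluation metrics listed for `11_machine_learning` (accuracy, precision, recall, and F1-score). It is not copied from the files in this repository: the function names, argument order, and confusion-matrix counts are assumptions chosen for illustration only.

```python
# Illustrative sketch only; names and counts below are hypothetical.
# tp/fp/fn/tn are the confusion-matrix counts:
# true positives, false positives, false negatives, true negatives.

def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + fp + fn + tn)

def precision(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of positive predictions that were actually positive."""
    return tp / (tp + fp)

def recall(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of actual positives that the model identified."""
    return tp / (tp + fn)

def f1_score(tp: int, fp: int, fn: int, tn: int) -> float:
    """Harmonic mean of precision and recall."""
    p = precision(tp, fp, fn, tn)
    r = recall(tp, fp, fn, tn)
    return 2 * p * r / (p + r)

# Hypothetical counts for a 100-example test set.
counts = dict(tp=8, fp=2, fn=4, tn=86)
print(accuracy(**counts), precision(**counts), recall(**counts), f1_score(**counts))
```

The metrics take raw counts rather than lists of labels so that each definition stays a one-liner. Note that precision and recall are undefined (division by zero) when the model makes no positive predictions or the data has no positive examples; this sketch does not guard against that.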