This repository contains my Python 3.7 implementations of the algorithms discussed in the book "Data Science from Scratch" by Joel Grus.

File name | Formats | Description |
---|---|---|
1_Counting_clicker | .py/.ipynb | Count or track how many people have shown up for a class |
2_Visualizing_data | .py/.ipynb | Data visualization using the matplotlib library |
3_Vector_operations_on_data | .py/.ipynb | Depicts linear algebra operations on data vectors |
4_Matrix_operations | .py/.ipynb | Depicts creation and manipulation of matrices |
5_Statistics | .py/.ipynb | Statistical operations to understand the distribution of data |
6_Probability | .py/.ipynb | Understanding the data distribution |
7_Hypothesis_and_Inference | .py/.ipynb | Testing whether a given hypothesis is likely to be true |
8_Gradient_descent | .py/.ipynb | Minimizing the error and estimating unknown parameters using gradient descent on the whole dataset or on mini-batches |
9_Working_with_data | .py/.ipynb | Basic operations including histograms, correlation, dictionaries, NamedTuple, classes and rescaling |
10_Principal_component_analysis | .py/.ipynb | Principal component analysis from scratch |
11_machine_learning | .py/.ipynb | Train/test data split and functions to evaluate a model's accuracy, precision, recall and F1-score |
12_k-Nearest-Neighbors | .py/.ipynb | Implementation of the k-nearest neighbors algorithm from scratch in Python |
13_Naive_Bayes | .py/.ipynb | Naive Bayes classifier from scratch that uses the words in an email to classify it as spam or not spam (ham) |
14_Linear_Regression | .py/.ipynb | Linear regression from scratch using the closed-form solution and stochastic gradient descent |
15_Multiple_Regression | .py/.ipynb | Multiple regression from scratch using stochastic gradient descent, bootstrap estimation of statistics, and ridge and lasso regularization |
16_Logistic_Regression | .py/.ipynb | Logistic regression from scratch, with precision and recall computed on the test data |
17_Decision_Trees | .py/.ipynb | Decision trees from scratch using the ID3 learning algorithm |
18_Neural_networks | .py/.ipynb | Neural network (including feed-forward and backpropagation) from scratch. A "FizzBuzz" example is used to train and test the network. |
19_Deep_Learning | .py/.ipynb | Implementation of deep neural networks from scratch, with various loss functions, optimization techniques, and regularization using dropout. The networks are trained on FizzBuzz and MNIST data. |
20_Clustering | .py/.ipynb | Implementation of k-means and bottom-up hierarchical clustering from scratch. |
21_nlp | .py/.ipynb | Implementation of popular natural language processing algorithms including bigrams, trigrams, topic modeling, word vectors and recurrent neural networks from scratch in Python. |
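
To give a flavor of the "from scratch" style, the evaluation metrics listed for 11_machine_learning can be sketched in plain Python as shown below. This is a minimal illustration, not necessarily the code in this repository: the function names and the argument order `(tp, fp, fn, tn)` (true positives, false positives, false negatives, true negatives) are assumptions made for this example.

```python
# Illustrative sketch only: names and argument order are assumptions,
# not necessarily what the repository's files use.
# No guard against division by zero is included.

def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all predictions that were correct."""
    correct = tp + tn
    total = tp + fp + fn + tn
    return correct / total

def precision(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of positive predictions that were actually positive."""
    return tp / (tp + fp)

def recall(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of actual positives that the model identified."""
    return tp / (tp + fn)

def f1_score(tp: int, fp: int, fn: int, tn: int) -> float:
    """Harmonic mean of precision and recall."""
    p = precision(tp, fp, fn, tn)
    r = recall(tp, fp, fn, tn)
    return 2 * p * r / (p + r)

if __name__ == "__main__":
    # Hypothetical confusion-matrix counts for a binary classifier.
    counts = (8, 2, 8, 82)
    print("accuracy :", accuracy(*counts))
    print("precision:", precision(*counts))
    print("recall   :", recall(*counts))
    print("f1       :", f1_score(*counts))
```

Passing all four counts to every function keeps the signatures uniform, even though precision and recall each ignore some of them.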