Clustering Algorithm Implementation and Visualization from Scratch with Python
Overview
This project implements four popular clustering algorithms from scratch in Python. The implementations are designed for datasets with d >= 2 dimensions and k >= 2 clusters. They are tested on 2D datasets and compared visually against scikit-learn's implementations to check correctness and performance.
Implemented Clustering Algorithms
K-Means Clustering
Gaussian Mixture Model (GMM) using Expectation-Maximization (EM)
Mean-Shift Clustering
Agglomerative Clustering
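To make the "from scratch" idea concrete, here is a minimal sketch of K-Means (Lloyd's algorithm) using only NumPy. It is an illustration, not the code in KMeans.py: the function name `kmeans_sketch`, its parameters, and the random-point initialisation are all assumptions made for this example.

```python
import numpy as np


def _assign(X, centroids):
    # Squared Euclidean distance from every point to every centroid,
    # then the index of the nearest centroid for each point.
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def kmeans_sketch(X, k, n_iters=100, tol=1e-6, seed=0):
    """Hypothetical helper: cluster X of shape (n_samples, d) into k groups."""
    rng = np.random.default_rng(seed)
    # Assumed initialisation: k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        labels = _assign(X, centroids)
        # Update step: each centroid becomes the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:
            break
    # Final assignment so the labels match the returned centroids.
    return _assign(X, centroids), centroids
```

The broadcasting in `_assign` works for any dimension d, which is one simple way to support d >= 2 without special-casing 2D data.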
Python Implementations
KMeans.py: K-Means clustering.
KMeans_Ver0.py: K-Means clustering (2nd version).
GaussianMM.py: EM-GMM.
GaussianMM_Ver0.py: EM-GMM with functions of AIC, BIC and predict (2nd version).
MeanShift.py: Mean-Shift clustering.
Agglomerative.py: Agglomerative clustering.
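For readers unfamiliar with the AIC and BIC mentioned for GaussianMM_Ver0.py, the sketch below shows the standard definitions for a Gaussian mixture. It assumes full covariance matrices and a total (not per-sample) log-likelihood; the function names are hypothetical and may not match those in the repository.

```python
import math


def gmm_num_params(k, d):
    # Free parameters of a k-component, d-dimensional GMM, assuming
    # full covariances: (k - 1) mixing weights, k * d mean entries,
    # and k * d * (d + 1) / 2 covariance entries.
    return (k - 1) + k * d + k * d * (d + 1) // 2


def aic(log_likelihood, k, d):
    # Akaike information criterion: 2p - 2 ln L (lower is better).
    return 2 * gmm_num_params(k, d) - 2 * log_likelihood


def bic(log_likelihood, k, d, n_samples):
    # Bayesian information criterion: p ln n - 2 ln L (lower is better).
    return gmm_num_params(k, d) * math.log(n_samples) - 2 * log_likelihood
```

Both criteria penalise the number of parameters, so they can be compared across fits with different k to choose the number of components.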
Evaluations and Tests
test_2d_visualization.py: Tests each implementation on 2D datasets with visualization, comparing the results to scikit-learn's equivalent algorithms.
data_2d_test/: Contains the datasets used for testing.
test_2d_visualization_results/: Stores the output images of the clustering results.
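The comparison in this project is visual, but cluster labels can also be compared numerically. Label values are arbitrary (one implementation's cluster 0 may be another's cluster 1), so a permutation-invariant measure is needed. The sketch below computes the Rand index in plain Python; the function name `rand_index` is an assumption for this example and is not part of the test script.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two clusterings agree.

    A pair agrees when both clusterings put the two points in the same
    cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # fewer than two points: nothing to disagree on
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)
```

A value of 1.0 means the two clusterings are identical up to relabelling. This pairwise version is quadratic in the number of points, so for larger datasets scikit-learn's `sklearn.metrics.adjusted_rand_score` is the more practical choice.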