Finding Optimum Number of Clusters in a given Dataset

Here we use the K-Means method to create clusters for the given dataset. The main part of K-Means is determining the
number of clusters in the dataset.
To find the number of clusters, we use the "elbow method". To understand it, we must first know what inertia and distortion are.

Distortion : It is calculated as the average of the squared distances from each sample to the center of the cluster it belongs to. Typically, the Euclidean distance metric is used.

Inertia : It is the sum of squared distances of samples to their closest cluster center.
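The two definitions above can be illustrated with a small NumPy sketch. The function name `inertia_and_distortion` and the toy points and centers are made up for illustration and are not taken from the notebook; the sketch assumes Euclidean distance and that each sample belongs to its closest center.

```python
import numpy as np

def inertia_and_distortion(points, centers):
    """Return (inertia, distortion) with each point assigned to its closest center."""
    # Pairwise squared Euclidean distances, shape (n_points, n_centers)
    diffs = points[:, None, :] - centers[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)
    # Squared distance from each point to its closest center
    closest = sq_dists.min(axis=1)
    inertia = closest.sum()      # sum of squared distances
    distortion = closest.mean()  # average of squared distances
    return inertia, distortion

# Hypothetical toy data: two tight pairs of points and two centers
points = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
centers = np.array([[0.0, 1.0], [10.0, 1.0]])

inertia, distortion = inertia_and_distortion(points, centers)
print("inertia:", inertia)
print("distortion:", distortion)
```

Under these definitions the two quantities differ only by a factor of the number of samples, so for a fixed dataset either one gives an elbow curve of the same shape.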

Finding out the number of clusters

To determine the optimal number of clusters, we select the value of k at the “elbow”, i.e. the point after which the distortion/inertia starts decreasing in a roughly linear fashion. For the given data, we conclude that the optimal number of clusters is 3.
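A minimal sketch of the elbow procedure with scikit-learn follows. It is not the notebook's code: it assumes the copy of the Iris data bundled with scikit-learn (`load_iris`) is an acceptable stand-in for Iris.csv, and the helper name `elbow_inertias`, the range of k from 1 to 10, and the `n_init` and `random_state` settings are illustrative choices.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

def elbow_inertias(X, k_values, random_state=0):
    """Fit K-Means once per k and collect the resulting inertia values."""
    inertias = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        model.fit(X)
        # inertia_ is the sum of squared distances to the closest center
        inertias.append(model.inertia_)
    return inertias

# Assumption: the four numeric Iris features bundled with scikit-learn
X = load_iris().data
k_values = list(range(1, 11))
inertias = elbow_inertias(X, k_values)

for k, value in zip(k_values, inertias):
    print(f"k={k}: inertia={value:.2f}")
```

Plotting `k_values` against `inertias` with Matplotlib gives the elbow curve; the elbow is read off by eye as the k after which the drop in inertia flattens out.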

Modules Used

1. Pandas
2. NumPy
3. scikit-learn
4. Matplotlib

Dataset Used

https://bit.ly/3kXTdox
