Here we use the K-Means method to create clusters for the given dataset. The main step in K-Means is determining the
number of clusters in the dataset.
To find the number of clusters, we use the "elbow method". To understand it, we first need to know what inertia
and distortion are:
- Distortion : The average of the squared distances from each sample to the center of the cluster it is assigned to. Typically, the Euclidean distance metric is used.
- Inertia : The sum of the squared distances from each sample to its closest cluster center.
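The two definitions above can be checked by hand on a tiny example. The sketch below is illustrative only: the function name `inertia_and_distortion`, the four sample points, and the two cluster centers are made up for the demonstration and are not part of the Iris data.

```python
import numpy as np

def inertia_and_distortion(points, centers):
    """Return (inertia, distortion) using squared Euclidean distances."""
    # Pairwise differences, shape (n_points, n_centers, n_features)
    diffs = points[:, None, :] - centers[None, :, :]
    # Squared Euclidean distance from every point to every center
    sq_dists = (diffs ** 2).sum(axis=2)
    # Keep only the distance to the closest center for each point
    closest = sq_dists.min(axis=1)
    inertia = closest.sum()      # sum of squared distances
    distortion = closest.mean()  # average of squared distances
    return float(inertia), float(distortion)

# Hypothetical toy data: two points near each of two centers
points = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 4.0]])
centers = np.array([[0.0, 1.0], [10.0, 2.0]])

inertia, distortion = inertia_and_distortion(points, centers)
print(inertia, distortion)  # 10.0 2.5
```

The closest squared distances here are 1, 1, 4 and 4, so the inertia is 10 and the distortion is 10 / 4 = 2.5. With these definitions the distortion is simply the inertia divided by the number of samples.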
To determine the optimal number of clusters, we select the value of k at the “elbow”, i.e. the point after which the distortion/inertia starts decreasing in a roughly linear fashion. For the given data the elbow is at k = 3, so we conclude that the optimal number of clusters is 3.
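A minimal sketch of the elbow method, assuming scikit-learn is available. It uses the copy of the Iris data bundled with scikit-learn (`load_iris`) rather than the linked file, and the range of k values (1 to 10), `n_init=10`, and `random_state=0` are choices made for this illustration, not settings taken from the original analysis.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Assumption: the scikit-learn copy of Iris (150 samples, 4 features)
X = load_iris().data

ks = range(1, 11)
inertias = []
distortions = []
for k in ks:
    # n_init and random_state are fixed only to make the run repeatable
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)               # sum of squared distances
    distortions.append(model.inertia_ / len(X))   # average of squared distances

for k, inertia, distortion in zip(ks, inertias, distortions):
    print(f"k={k:2d}  inertia={inertia:8.2f}  distortion={distortion:6.3f}")
```

Plotting `inertias` (or `distortions`) against `ks` gives the usual elbow curve; the value of k where the steep drop flattens out is the one to pick.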
Iris Dataset : https://bit.ly/3kXTdox