
jatinrastogi/Finding-Optimum-Number-of-Clusters-in-a-given-Dataset


Finding Optimum Number of Clusters in a given Dataset

Here we use the K-Means method to create clusters for the given dataset. The main part of K-Means is determining the
number of clusters in the dataset.
To find the number of clusters, we use the "elbow method". To understand it, we must first know what distortion and inertia are.
  1. Distortion : It is calculated as the average of the squared distances from each sample to the center of its own cluster. Typically, the Euclidean distance metric is used.

  2. Inertia : It is the sum of squared distances of samples to their closest cluster center.
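The two definitions above can be checked numerically. The following is a minimal sketch, not the repository's own code: it assumes the copy of the Iris data bundled with scikit-learn (`load_iris`) rather than the linked file, and the choices `n_clusters=3`, `n_init=10`, and `random_state=0` are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Assumption: use scikit-learn's bundled Iris data (150 samples, 4 features).
X = load_iris().data

# Fit K-Means with an illustrative k; n_init and random_state are arbitrary choices.
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# transform() returns the Euclidean distance from every sample to every center;
# the row-wise minimum is the distance to the closest center.
closest_sq_dist = model.transform(X).min(axis=1) ** 2

inertia = closest_sq_dist.sum()      # sum of squared distances
distortion = closest_sq_dist.mean()  # average of squared distances

print(f"inertia    = {inertia:.3f}")
print(f"distortion = {distortion:.3f}")
```

Under these definitions, distortion is simply inertia divided by the number of samples, so both curves have the same shape when plotted against k.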

    Finding out the number of clusters


    To determine the optimal number of clusters, we plot the distortion or inertia against k and select the value of k at the “elbow”, i.e. the point after which the distortion/inertia starts decreasing in a roughly linear fashion. For the given data, we conclude that the optimal number of clusters is 3.
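The elbow search described above can be sketched as follows. This is an illustration under stated assumptions rather than the repository's actual script: it loads Iris from scikit-learn instead of the linked file, tries k from 1 to 10 (an arbitrary range), and writes the plot to a hypothetical file name, `elbow.png`.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render to a file so no display is needed
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled Iris data stands in for the linked dataset.
X = load_iris().data

ks = list(range(1, 11))  # candidate cluster counts (illustrative range)
inertias = []
for k in ks:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)  # sum of squared distances to closest center

# Plot inertia against k; the elbow is read off the curve by eye.
fig, ax = plt.subplots()
ax.plot(ks, inertias, marker="o")
ax.set_xlabel("Number of clusters k")
ax.set_ylabel("Inertia")
ax.set_title("Elbow method on the Iris dataset")
fig.savefig("elbow.png")  # hypothetical output file name
```

The curve drops steeply at first and then flattens; the value of k where the bend occurs is the one to choose. Dividing each inertia by `X.shape[0]` gives the corresponding distortion curve.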

    Modules Used

      1. pandas
      2. NumPy
      3. scikit-learn
      4. Matplotlib
      

      Dataset Used

      Iris Dataset : https://bit.ly/3kXTdox
