What is k-means clustering?
K-means clustering is a prototype-based clustering method. It groups the data into k clusters such that each data point belongs to the cluster with the nearest mean.
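Given an input data set of $n$ observations $(x_1, x_2, \dots, x_n)$, k-means clustering partitions the observations into $k$ sets $S = \{S_1, S_2, \dots, S_k\}$ so as to minimize the within-cluster sum of squares:
$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$$
where $\mu_i$ is the mean of the points in $S_i$.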
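To make the objective concrete, here is a minimal sketch that evaluates the within-cluster sum of squares for a given partition. The function name `wcss` and the toy points are illustrative assumptions, not part of the original text.

```python
def wcss(clusters):
    """Within-cluster sum of squares for a list of clusters.

    Each cluster is a non-empty list of points; each point is a tuple of floats.
    """
    total = 0.0
    for points in clusters:
        dim = len(points[0])
        # Component-wise mean of the points in this cluster.
        mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
        # Add the squared Euclidean distance of every point to that mean.
        total += sum(
            sum((p[d] - mean[d]) ** 2 for d in range(dim)) for p in points
        )
    return total


# Hypothetical toy data: four points forming two tight vertical pairs.
good = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
bad = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 2.0), (10.0, 2.0)]]

print(wcss(good))  # 4.0
print(wcss(bad))   # 100.0
```

Grouping nearby points together gives the smaller value, which is the partition k-means tries to find.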
Clustering steps
- Select k random points as the initial cluster centers.
- Compute the distance between each data point and every cluster center, then assign the data point to the cluster whose center is closest to it.
- Recompute each cluster center as the mean of all the vectors assigned to that cluster.
- Repeat the assignment and update steps for a fixed number of iterations, or until the centers change very little between iterations.
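The steps above can be sketched in plain Python as follows. This is an illustrative sketch under a few assumptions that the text does not specify: the initial centers are drawn from the data points themselves, distance is squared Euclidean, an empty cluster keeps its previous center, and the names `kmeans`, `max_iters`, `tol`, and `seed` are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def kmeans(data, k, max_iters=100, tol=1e-9, seed=0):
    rng = random.Random(seed)
    # Step 1: pick k distinct data points as the initial centers (assumption).
    centers = rng.sample(data, k)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iters):
        # Step 2: assign every point to the cluster with the nearest center.
        clusters = [[] for _ in range(k)]
        for p in data:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Step 3: move each center to the mean of its assigned points.
        # An empty cluster keeps its old center (assumption).
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        # Step 4: stop once no center moved by more than the tolerance.
        shift = max(squared_distance(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    # If max_iters is reached first, clusters reflect the last assignment step.
    return centers, clusters


# Hypothetical toy data: two well-separated groups of three points each.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, clusters = kmeans(data, k=2)
print(sorted(centers))
print(sorted(sorted(c) for c in clusters))
```

Because the result depends on the random starting centers, k-means is commonly run several times with different seeds, keeping the run with the lowest within-cluster sum of squares.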
References
https://en.wikipedia.org/wiki/K-means_clustering
https://towardsdatascience.com/the-5-clustering-algorithms-data-scientists-need-to-know-a36d136ef68