# How does the k-means algorithm work?

The k-means algorithm takes the input parameter, k, and partitions a set of n objects into k clusters so that the resulting intracluster similarity is high but the intercluster similarity is low. Cluster similarity is measured with respect to the mean value of the objects in a cluster, which can be viewed as the cluster’s centroid or center of gravity.

The k-means algorithm proceeds as follows. First, it randomly selects k of the objects, each of which initially represents a cluster mean or center. Each of the remaining objects is assigned to the cluster to which it is most similar, based on the distance between the object and the cluster mean.
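The assignment step described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes Euclidean distance as the dissimilarity measure, and the names `nearest_center` and `assign`, as well as the sample points and initial means, are hypothetical.

```python
import math

def nearest_center(point, centers):
    """Return the index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def assign(points, centers):
    """Group the points into len(centers) clusters by nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        clusters[nearest_center(p, centers)].append(p)
    return clusters

# Hypothetical data: two of the objects serve as the initial cluster means.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centers = [(1.0, 1.0), (8.0, 8.0)]
clusters = assign(points, centers)
```

Here each object ends up in the cluster whose mean is nearest, so the two points near the origin form one cluster and the two points near (8, 8) form the other.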

It then computes the new mean for each cluster. This process iterates until the criterion function converges. Typically, the square-error criterion is used, defined as −

$$\mathrm{E=\displaystyle\sum\limits_{i=1}^k\displaystyle\sum\limits_{p\in C_{i}}|p-m_{i}|^2}$$

Where E is the sum of the squared error for all objects in the data set, p is the point in space representing a given object, and m_i is the mean of cluster C_i (both p and m_i are multidimensional). In other words, for each object in each cluster, the distance from the object to its cluster center is squared, and the squared distances are summed. This criterion tries to make the resulting k clusters as compact and as separate as possible.
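To make the criterion concrete, the following sketch computes E for a given partition. It assumes that |p − m_i| is the Euclidean distance, and the function name `squared_error` and the sample clusters are hypothetical.

```python
def squared_error(clusters):
    """Sum of squared Euclidean distances from each point to its cluster mean."""
    total = 0.0
    for cluster in clusters:
        if not cluster:
            continue  # an empty cluster contributes nothing to E
        dim = len(cluster[0])
        # m_i: the per-dimension mean of the points in this cluster
        mean = [sum(p[d] for p in cluster) / len(cluster) for d in range(dim)]
        for p in cluster:
            total += sum((p[d] - mean[d]) ** 2 for d in range(dim))
    return total

# Hypothetical partition into two clusters.
clusters = [[(1.0, 1.0), (3.0, 1.0)], [(8.0, 8.0), (8.0, 10.0)]]
E = squared_error(clusters)
print(E)  # 4.0
```

The cluster means are (2, 1) and (8, 9), and each of the four points lies at distance 1 from its mean, so E = 1 + 1 + 1 + 1 = 4.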

**Algorithm: k-means** − The k-means algorithm for partitioning, where each cluster’s center is represented by the mean value of the objects in the cluster.

Input −

k: the number of clusters, D: a data set containing n objects.

Output −

A set of k clusters.

Method −

arbitrarily select k objects from D as the initial cluster centers;

repeat

(re)assign each object to the cluster to which the object is most similar, based on the mean value of the objects in the cluster;

update the cluster means, i.e., compute the mean value of the objects for each cluster;

until no change;
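The method above might be translated into Python roughly as follows. This is a sketch under stated assumptions rather than a definitive implementation: it uses Euclidean distance, keeps the previous center if a cluster becomes empty, and adds `seed` and `max_iter` parameters that are not part of the pseudocode. The function names and the sample data are hypothetical.

```python
import math
import random

def mean(cluster):
    """Per-dimension mean of a non-empty list of points."""
    dim = len(cluster[0])
    return tuple(sum(p[d] for p in cluster) / len(cluster) for d in range(dim))

def k_means(data, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centers = rng.sample(data, k)  # arbitrarily select k objects from D
    assignment = None
    for _ in range(max_iter):
        # (re)assign each object to the cluster with the nearest mean
        new_assignment = [
            min(range(k), key=lambda i: math.dist(p, centers[i])) for p in data
        ]
        if new_assignment == assignment:  # until no change
            break
        assignment = new_assignment
        # update the cluster means
        for i in range(k):
            members = [p for p, a in zip(data, assignment) if a == i]
            if members:  # assumption: an empty cluster keeps its old center
                centers[i] = mean(members)
    clusters = [[p for p, a in zip(data, assignment) if a == i] for i in range(k)]
    return clusters, centers

# Hypothetical data set with two well-separated groups.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
clusters, centers = k_means(data, k=2)
```

Because the initial centers are chosen at random, the order of the returned clusters depends on the seed. In general k-means can also converge to different partitions from different starting centers, although for well-separated data such as this the grouping is stable.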

As an illustration, three objects are arbitrarily selected as the three initial cluster centers, where cluster centers are marked by a “+”. Each object is assigned to a cluster based on the cluster center to which it is nearest.

Next, the cluster centers are updated: the mean value of each cluster is recalculated based on the current objects in the cluster. Using the new cluster centers, the objects are redistributed to the clusters based on which cluster center is nearest. Such a redistribution forms new silhouettes encircled by dashed curves.

This process of iteratively reassigning objects to clusters to improve the partitioning is referred to as iterative relocation. Eventually, no redistribution of the objects in any cluster occurs, and so the process terminates. The resulting clusters are returned by the clustering process.
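Iterative relocation can also be traced by hand on a small case. The following worked example is purely illustrative: it assumes a hypothetical one-dimensional data set {1, 2, 9, 10} with k = 2, and the objects 1 and 2 arbitrarily chosen as the initial centers.

$$\begin{aligned}
\text{Iteration 1:}\quad & C_{1}=\{1\},\; C_{2}=\{2,9,10\} &&\Rightarrow\; m_{1}=1,\; m_{2}=7,\; E=0+(25+4+9)=38\\
\text{Iteration 2:}\quad & C_{1}=\{1,2\},\; C_{2}=\{9,10\} &&\Rightarrow\; m_{1}=1.5,\; m_{2}=9.5,\; E=4\times 0.25=1\\
\text{Iteration 3:}\quad & C_{1}=\{1,2\},\; C_{2}=\{9,10\} &&\Rightarrow\; \text{no change, so the process terminates}
\end{aligned}$$

In iteration 2 the object 2 moves from the second cluster to the first because it is closer to the mean 1 than to the mean 7, and this relocation reduces the square error from 38 to 1.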
