# What is K-means clustering?

K-means clustering is one of the most common partitioning algorithms. It assigns each record in the dataset to exactly one of the clusters it forms. A record (data point) is assigned to the nearest cluster according to a measure of distance or similarity.

The k-means algorithm takes an input parameter, k, and partitions a set of n objects into k clusters so that the resulting intracluster similarity is high but the intercluster similarity is low. Cluster similarity is measured with respect to the mean value of the objects in a cluster, which can be viewed as the cluster’s centroid or center of gravity.
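
To make the centroid idea concrete, here is a minimal sketch in plain Python. The helper names (`centroid`, `squared_distance`, `within_cluster_sse`) and the sample points are illustrative assumptions, not part of the original article. The sum of squared distances to the centroid is one common way to quantify how tight a cluster is: a lower value means higher intracluster similarity.

```python
def centroid(points):
    """Return the mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def within_cluster_sse(points):
    """Sum of squared distances from each point to the cluster centroid."""
    c = centroid(points)
    return sum(squared_distance(p, c) for p in points)


# Hypothetical sample cluster, chosen only for illustration.
cluster = [(1.0, 1.0), (2.0, 1.0), (3.0, 4.0)]
print(centroid(cluster))            # (2.0, 2.0)
print(within_cluster_sse(cluster))  # 8.0
```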

K-means clustering proceeds through the following steps −

(a) Select K initial cluster centroids c_{1}, c_{2}, c_{3}, …, c_{k}.

(b) Assign each instance x in the dataset S to the cluster whose centroid is nearest to x.

(c) For each cluster, recompute its centroid from the elements contained in that cluster.

(d) Go to (b) until convergence is reached.

K-means separates the objects (data points) into K clusters.

The cluster center (centroid) is the average of all the data points in the cluster.

Each point is assigned to the cluster whose centroid is nearest, as measured by a distance function.
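
The assignment and centroid-update operations can be sketched as a single pass in plain Python. This is a sketch under stated assumptions: squared Euclidean distance is used as the distance function (the article does not fix one), and the names `assign`, `update`, and the sample data are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance (assumed distance function)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """For each point, return the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def update(points, labels, centroids):
    """Recompute each centroid as the average of the points assigned to it."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centroids.append(
                tuple(sum(c) / len(members) for c in zip(*members))
            )
        else:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(old)
    return new_centroids


# Hypothetical data: two visibly separate groups and two rough starting centers.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
centroids = [(0.0, 0.0), (10.0, 10.0)]

labels = assign(points, centroids)
print(labels)                             # [0, 0, 1, 1]
print(update(points, labels, centroids))  # [(1.0, 1.5), (8.5, 8.0)]
```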

The initial values for the means are chosen arbitrarily. They can be assigned randomly, or the values of the first k input items can be used. The convergence criterion can be based on the squared error, but it need not be. For example, the algorithm can stop when no (or very few) items are assigned to different clusters. Other termination techniques simply look at a fixed number of iterations. A maximum number of iterations can be included to ensure stopping even without convergence.
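
The initialization and termination options described above might look as follows in Python. All function names, default thresholds, and the seed are assumptions made for illustration; a real implementation would typically combine one convergence test with the iteration cap.

```python
import random


def init_first_k(items, k):
    """Use the first k input items as the initial means."""
    return list(items[:k])


def init_random(items, k, seed=0):
    """Pick k distinct items at random as the initial means."""
    return random.Random(seed).sample(list(items), k)


def few_reassignments(prev_labels, labels, max_changes=0):
    """True when at most max_changes items moved to a different cluster."""
    changes = sum(1 for a, b in zip(prev_labels, labels) if a != b)
    return changes <= max_changes


def sse_converged(prev_sse, sse, tol=1e-6):
    """True when the squared error changed by no more than tol."""
    return abs(prev_sse - sse) <= tol


def hit_iteration_cap(iteration, max_iter=100):
    """True once the fixed iteration budget is used up."""
    return iteration >= max_iter


items = [5.0, 1.0, 9.0, 3.0]
print(init_first_k(items, 2))                    # [5.0, 1.0]
print(few_reassignments([0, 1, 1], [0, 0, 1]))   # False
print(sse_converged(10.0, 10.0))                 # True
```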

## Algorithm

**Input** −

D = {t_{1}, t_{2}, …, t_{n}} // Set of elements

k // Number of desired clusters

**Output** −

K // Set of clusters

**K-means algorithm** −

    assign initial values for means m_{1}, m_{2}, …, m_{k}
    repeat
        assign each item t_{i} to the cluster which has the closest mean
        calculate the new mean for each cluster
    until convergence criteria are met
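
One possible Python rendering of this pseudocode is sketched below. It rests on several assumptions not fixed by the article: the first k items serve as initial means, squared Euclidean distance decides which mean is closest, the loop stops when no item changes cluster or after `max_iter` passes, and an empty cluster keeps its previous mean. The function name `kmeans` and the sample data are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance (assumed distance function)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(items, k, max_iter=100):
    """Return (clusters, means) for a list of coordinate tuples."""
    # assign initial values for means: here, the first k items
    means = [tuple(t) for t in items[:k]]
    labels = None
    for _ in range(max_iter):
        # assign each item to the cluster which has the closest mean
        new_labels = [
            min(range(k), key=lambda j: squared_distance(t, means[j]))
            for t in items
        ]
        # convergence criterion: no item changed cluster
        if new_labels == labels:
            break
        labels = new_labels
        # calculate the new mean for each cluster
        for j in range(k):
            members = [t for t, lab in zip(items, labels) if lab == j]
            if members:  # an empty cluster keeps its old mean
                means[j] = tuple(sum(c) / len(members) for c in zip(*members))
    clusters = [[t for t, lab in zip(items, labels) if lab == j] for j in range(k)]
    return clusters, means


# Hypothetical input set D with k = 2.
D = [(1.0, 1.0), (1.5, 2.0), (3.0, 4.0), (5.0, 7.0),
     (3.5, 5.0), (4.5, 5.0), (3.5, 4.5)]
clusters, means = kmeans(D, 2)
for cluster, mean in zip(clusters, means):
    print(mean, cluster)
```

On this input the first cluster ends up holding the two points nearest the origin and the second holds the remaining five. Because the result depends on the initial means, a different initialization can produce a different partition.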

Suppose k = 3. Three objects are arbitrarily selected as the three initial cluster centers, where cluster centers are marked by a “+”. Each object is assigned to a cluster based on the cluster center to which it is nearest.

Next, the cluster centers are updated: the mean value of each cluster is recomputed from the objects currently in the cluster. Using the new cluster centers, the objects are redistributed to the clusters based on which cluster center is nearest. Such a redistribution forms new cluster silhouettes, drawn as dashed curves.

The process of iteratively reassigning objects to clusters to improve the partitioning is referred to as iterative relocation. Eventually, no redistribution of the objects in any cluster occurs, and so the process terminates. The resulting clusters are returned by the clustering process.
