 
What is the scipy.cluster.vq.kmeans2() method?
scipy.cluster.vq.kmeans2(data, k, iter=10, thresh=1e-05, minit='random', missing='warn', check_finite=True) − The kmeans2() method classifies a set of observation vectors into k clusters using the k-means algorithm. It does not use a threshold value to check for convergence; instead it runs for the number of iterations given by iter. It has additional parameters that select how the centroids are initialized, how empty clusters are handled, and whether the input matrices are checked to contain only finite numbers.
Its parameters are explained in detail below −
Parameters
- 
data− ndarray An ‘M’ by ‘N’ array of M observations in N dimensions.
- 
k− int or ndarray This parameter represents the number of clusters to form, which is also the number of centroids to generate. It is instead interpreted as the initial centroids to use in either of the two cases given below −
- When the minit initialization string is ‘matrix’.
- When an ndarray is given.
 
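- 
iter− int, optional This parameter represents the number of iterations of the k-means algorithm to run. The default value is 10.
- 
thresh− float, optional This parameter represents a threshold value. As stated in the description above, kmeans2() does not use a threshold to check for convergence, so this value does not determine when the algorithm stops.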
- 
minit− str, optional This parameter represents the method used to initialize the centroids. The available methods are −
- random− Generates k centroids from a Gaussian whose mean and variance are estimated from the data.
- points− Chooses k observations (rows) at random from data as the initial centroids.
- ++− Chooses k observations (rows) according to the kmeans++ method, also called careful seeding.
- matrix− Interprets the k parameter as a ‘k’ by ‘N’ array of initial centroids.
 
- 
missing− str, optional This parameter represents the method used to deal with empty clusters. The available methods are −
- warn− As the name implies, this method gives a warning and continues.
- raise− This method raises an error (ClusterError) and terminates the algorithm.
- 
check_finite− bool, optional This parameter controls whether the input matrices are checked to contain only finite numbers. Disabling the check may give a performance gain, but it may also cause problems such as crashes or non-termination if the observations do contain infinities or NaNs. The default value of this parameter is True.
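The k, minit, missing, and check_finite parameters described above can be illustrated with a small sketch. The toy data and the far-away second centroid are assumptions chosen only to force an empty cluster; they are not part of the original text.

```python
import warnings

import numpy as np
from scipy.cluster.vq import ClusterError, kmeans2

# Four 2-D observations (M = 4, N = 2), made up for illustration.
data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])

# Passing an ndarray as k supplies the initial centroids directly (k = 2).
# The second centroid is deliberately far from every observation,
# so its cluster receives no members.
init = np.array([[0.5, 0.5], [100.0, 100.0]])

# missing='warn' (the default): a warning is issued and the run continues.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    centroid, label = kmeans2(data, init, iter=5, minit='matrix', missing='warn')
print("warnings issued:", len(caught))
print("labels:", label)

# missing='raise': the same situation raises ClusterError instead.
try:
    kmeans2(data, init, iter=5, minit='matrix', missing='raise')
    raised = False
except ClusterError:
    raised = True
print("ClusterError raised:", raised)

# check_finite=True (the default): input containing NaN is rejected.
bad = data.copy()
bad[0, 0] = np.nan
try:
    kmeans2(bad, init, minit='matrix')
    rejected = False
except ValueError:
    rejected = True
print("non-finite input rejected:", rejected)
```

Supplying the initial centroids as an array keeps this sketch deterministic; with the random, points, or ++ methods the starting centroids, and possibly the result, can differ from run to run.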
Returns
- 
centroid− ndarray A ‘k’ by ‘N’ array of the centroids found at the last iteration.
- 
label− ndarray An array with one entry per observation; label[i] is the index of the centroid that the i-th observation is closest to.
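As a minimal sketch of how the two return values might be used, the example below clusters synthetic data and inspects the shapes of centroid and label. The two Gaussian blobs, the seed, and the variable names are assumptions made for illustration only.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

# Synthetic data: two well-separated 2-D blobs of 50 points each
# (M = 100 observations, N = 2 dimensions).
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.5, size=(50, 2))
data = np.vstack([blob_a, blob_b])

k = 2
centroid, label = kmeans2(data, k, iter=10, minit='points')

# centroid has one row per cluster and one column per dimension.
print(centroid.shape)
# label has one entry per observation.
print(label.shape)

# Count how many observations were assigned to each centroid.
counts = np.bincount(label, minlength=k)
print(counts)
```

Because minit='points' picks the starting centroids at random, the order of the clusters, and therefore the label values, may change between runs.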
