# Which SciPy package is used to implement Clustering?

Clustering is one of the most useful unsupervised machine learning methods. It is used to find patterns of relationship and similarity among the input data samples. Once these patterns are found, the algorithm groups samples that are similar to one another into clusters.

Anomaly detection, image segmentation, medical imaging, social network analysis, and market segmentation are some common applications of clustering. K-means and hierarchical clustering are two of the most common forms of clustering.

To implement clustering, SciPy provides a clustering package (scipy.cluster), which contains the two modules described below:

**scipy.cluster.vq module**: This SciPy module provides functions for k-means clustering and vector quantization. It can generate code books from k-means models and quantize vectors by comparing them with the centroids in a code book. The table below lists the routines in the scipy.cluster.vq module, along with their descriptions:

Routine | Description |
---|---|
scipy.cluster.vq.whiten(obs, check_finite=True) | Normalizes a group of observations on a per-feature basis. |
scipy.cluster.vq.vq(obs, code_book, check_finite=True) | Assigns a code from a code book to each observation. |
scipy.cluster.vq.kmeans(obs, k_or_guess, iter=20, thresh=1e-05, check_finite=True) | Performs the k-means algorithm on a set of observation vectors, forming k clusters. |
scipy.cluster.vq.kmeans2(data, k, iter=10, thresh=1e-05, minit='random', missing='warn', check_finite=True) | Classifies a set of observations into k clusters using the k-means algorithm. |
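The routines in the table are typically used together. The following is a minimal sketch, assuming a small synthetic dataset of two well-separated groups of points. The data and the variable names (`blob_a`, `blob_b`, `obs`) are illustrative assumptions, not part of SciPy.

```python
import numpy as np
from scipy.cluster.vq import whiten, kmeans, vq

# Illustrative data (an assumption): two well-separated groups of 2-D points
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.5, size=(50, 2))
obs = np.vstack([blob_a, blob_b])

# Scale every feature (column) to unit variance before clustering
whitened = whiten(obs)

# Fit k = 2 centroids; kmeans returns the code book and the mean distortion
code_book, distortion = kmeans(whitened, 2)

# Assign each observation to its nearest centroid in the code book
labels, distances = vq(whitened, code_book)

print("Centroids:\n", code_book)
print("Mean distortion:", distortion)
print("Cluster sizes:", np.bincount(labels))
```

Note that `kmeans` starts from randomly chosen centroids, so which group receives label 0 and which receives label 1 can differ between runs.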

**scipy.cluster.hierarchy module**: As the name suggests, this SciPy module provides functions for hierarchical clustering, including agglomerative clustering. Its routines can be used to:

- Compute statistics on hierarchies.
- Cut a hierarchical clustering into a flat clustering.
- Implement agglomerative clustering.
- Visualize flat clusters.
- Check the isomorphism of two flat cluster assignments.
- Plot the cluster hierarchy as a dendrogram.
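The tasks listed above for scipy.cluster.hierarchy can be sketched as follows. This is a minimal example on assumed synthetic data. The choice of Ward linkage and of two flat clusters is illustrative, and `no_plot=True` is used so that the sketch runs without a plotting library.

```python
import numpy as np
from scipy.cluster.hierarchy import (
    linkage, fcluster, cophenet, dendrogram, is_isomorphic
)
from scipy.spatial.distance import pdist

# Illustrative data (an assumption): two well-separated groups of 2-D points
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(0.0, 0.5, size=(10, 2)),
    rng.normal(5.0, 0.5, size=(10, 2)),
])

# Agglomerative clustering: build the linkage matrix (one row per merge)
Z = linkage(points, method="ward")

# Cut the hierarchy into a flat clustering with at most 2 clusters
flat = fcluster(Z, t=2, criterion="maxclust")

# Statistic on the hierarchy: cophenetic correlation coefficient
coph_corr, _ = cophenet(Z, pdist(points))

# Compute the dendrogram layout without drawing it
tree = dendrogram(Z, no_plot=True)

# Two flat assignments that differ only in label names are isomorphic
swapped = np.where(flat == 1, 2, 1)
same_partition = is_isomorphic(flat, swapped)

print("Flat cluster labels:", flat)
print("Cophenetic correlation:", coph_corr)
print("Leaf order:", tree["leaves"])
print("Isomorphic assignments:", same_partition)
```

To draw the dendrogram, omit `no_plot=True` and call `dendrogram(Z)` with matplotlib installed.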
