# What is model-based clustering?

Model-based clustering is a statistical approach to data clustering. The observed (multivariate) data are assumed to have been generated from a finite mixture of component models. Each component model is a probability distribution, typically a parametric multivariate distribution.

For instance, in a multivariate Gaussian mixture model, each component is a multivariate Gaussian distribution. The component responsible for generating a particular observation determines the cluster to which the observation belongs.
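The generative view can be illustrated with a minimal sketch that draws points from a one-dimensional, two-component Gaussian mixture. The function name `sample_mixture` and the weights, means, and standard deviations are illustrative assumptions, not values from this article.

```python
import random

def sample_mixture(n, weights, means, stds, seed=0):
    """Draw n points from a 1-D Gaussian mixture (hypothetical helper).

    Returns the sampled values and the index of the component that
    generated each one, i.e. the "true" cluster of that observation.
    """
    rng = random.Random(seed)
    values, labels = [], []
    for _ in range(n):
        # Pick the generating component according to the mixing weights.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        # Draw the observation from that component's Gaussian.
        values.append(rng.gauss(means[k], stds[k]))
        labels.append(k)
    return values, labels

# Illustrative parameters only.
values, labels = sample_mixture(
    1000, weights=[0.3, 0.7], means=[-2.0, 3.0], stds=[0.5, 1.0]
)
```

In a real clustering task only `values` would be observed; the labels are hidden, and recovering them is the clustering problem.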

Model-based clustering attempts to optimize the fit between the given data and some mathematical model. It is based on the assumption that the data are generated by a mixture of underlying probability distributions.

The main types of model-based clustering are as follows −

**Statistical approach** − Expectation maximization (EM) is a popular iterative refinement algorithm. It can be viewed as an extension of k-means −

Each object is assigned to a cluster according to a weight (its probability of membership).

New means are computed based on these weighted measures.

The basic idea is as follows −

Start with an initial estimate of the parameter vector.

Iteratively rescore the patterns against the mixture density produced by the parameter vector.

Use the rescored patterns to update the parameter estimates.

Patterns belong to the same cluster if their scores place them in the same component.

## Algorithm

Initially, assign k cluster centers randomly.

Then iteratively refine the clusters using the following two steps −

**Expectation step** − Assign each data point X_{i} to cluster C_{k} with the following probability

$$\mathrm{P(X_{i}\in\:C_{k})\:=\:P(C_k\arrowvert\:X_i)\:=\:\frac{P(C_k)P(X_i\arrowvert\:C_k)}{P(X_i)}}$$
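As a small worked example of this Bayes-rule computation, the sketch below evaluates the membership probabilities for a single point. It assumes one-dimensional Gaussian components; the helper names `gaussian_pdf` and `posterior` and all parameter values are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x, used here as P(x | C_k)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

def posterior(x, priors, means, stds):
    """Return P(C_k | x) for every cluster k via Bayes' rule."""
    # Numerators P(C_k) * P(x | C_k), one per cluster.
    joint = [p * gaussian_pdf(x, m, s) for p, m, s in zip(priors, means, stds)]
    # Denominator P(x), obtained by summing over all clusters.
    evidence = sum(joint)
    return [j / evidence for j in joint]

# A point halfway between two identical, equally likely components
# is assigned to each of them with equal probability.
probs = posterior(0.0, priors=[0.5, 0.5], means=[-1.0, 1.0], stds=[1.0, 1.0])
```

Because the result is a probability per cluster rather than a single label, each point contributes to every cluster in proportion to these values.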

**Maximization step** − Use the probabilities from the expectation step to re-estimate the model parameters

$$\mathrm{m_k\:=\:\frac{1}{N}\displaystyle\sum\limits_{i=1}^N \frac{X_{i}P(X_i\:\in\:C_k)}{\sum\limits_{j}P(X_i\:\in\:C_j)}}$$
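The two steps can be combined into a short loop. The following is a minimal sketch for one-dimensional Gaussian components, not a definitive implementation. The mean update divides by the total weight assigned to each cluster, which is the usual form for Gaussian mixtures; the variance and prior updates, the optional `init_means` parameter, and the lower bound on the standard deviation are assumptions added for illustration.

```python
import math
import random

def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

def em_gmm_1d(data, k, iters=50, init_means=None, seed=0):
    """EM for a 1-D Gaussian mixture; returns (priors, means, stds)."""
    rng = random.Random(seed)
    # Random initial centers unless the caller supplies them (assumption).
    means = list(init_means) if init_means is not None else rng.sample(data, k)
    stds = [1.0] * k
    priors = [1.0 / k] * k
    for _ in range(iters):
        # Expectation step: membership probability of each point in each cluster.
        resp = []
        for x in data:
            joint = [priors[j] * gaussian_pdf(x, means[j], stds[j]) for j in range(k)]
            total = sum(joint) or 1e-300  # guard against numerical underflow
            resp.append([v / total for v in joint])
        # Maximization step: re-estimate parameters from the weighted points.
        for j in range(k):
            weight = sum(r[j] for r in resp)
            if weight < 1e-12:
                continue  # empty cluster: keep its previous parameters
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / weight
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / weight
            stds[j] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsed component
            priors[j] = weight / len(data)
    return priors, means, stds

# Illustrative data: two well-separated groups.
data = [-0.2, -0.1, 0.0, 0.1, 0.2, 9.8, 9.9, 10.0, 10.1, 10.2]
priors, means, stds = em_gmm_1d(data, k=2)
```

Like k-means, EM is sensitive to the initial centers and may converge slowly or to a poor local optimum, so in practice it is often run from several starting points.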

**Machine learning approach** − Machine learning builds algorithms that process large amounts of data, learn from experience, and make predictions.

These algorithms improve as they receive more training data. The main objective of machine learning is to learn from data and build models that can be understood and used by humans.

A well-known method in this category is COBWEB, a popular approach to incremental conceptual learning that produces a hierarchical clustering in the form of a classification tree. Each node represents a concept and contains a probabilistic description of that concept.
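To make the idea of a probabilistic description concrete, the sketch below shows one possible way a tree node could summarize the instances placed under it. The class `ConceptNode`, its fields, and the sample attributes are hypothetical; this illustrates only the per-node statistics, not the tree-building procedure.

```python
from collections import Counter, defaultdict

class ConceptNode:
    """A tree node that summarizes its instances by attribute-value counts."""

    def __init__(self):
        self.count = 0                            # instances seen at this node
        self.value_counts = defaultdict(Counter)  # attribute -> value -> count
        self.children = []                        # more specific sub-concepts

    def add(self, instance):
        """Incrementally fold one instance (a dict of attributes) into the node."""
        self.count += 1
        for attr, value in instance.items():
            self.value_counts[attr][value] += 1

    def prob(self, attr, value):
        """Estimate P(attr = value | concept) from the stored counts."""
        if self.count == 0:
            return 0.0
        return self.value_counts[attr][value] / self.count

# Illustrative instances only.
node = ConceptNode()
node.add({"color": "red", "shape": "round"})
node.add({"color": "red", "shape": "square"})
node.add({"color": "green", "shape": "round"})
p_red = node.prob("color", "red")
```

Storing each attribute's distribution separately, as this sketch does, treats the attributes as independent of one another.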

**Limitations**

The assumption that the attributes are independent of each other is often too strong because correlation can exist.

It is not well suited to clustering large databases, because the classification tree can become skewed and the probability distributions are expensive to store and update.

**Neural Network Approach** − The neural network approach represents each cluster as an exemplar, which acts as a prototype of the cluster. New objects are assigned to the cluster whose exemplar is the most similar according to some distance measure.
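The assignment rule described here can be sketched in a few lines. The example assumes Euclidean distance as the distance measure and uses a hypothetical helper named `assign_to_prototype`; how the prototypes themselves are learned is outside the scope of this sketch.

```python
import math

def assign_to_prototype(point, prototypes):
    """Return the index of the prototype closest to point (Euclidean distance)."""
    return min(range(len(prototypes)), key=lambda k: math.dist(point, prototypes[k]))

# Illustrative prototypes, one per cluster.
prototypes = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
cluster = assign_to_prototype((0.9, 1.1), prototypes)
```

Any other distance or similarity measure can be substituted for `math.dist` without changing the structure of the rule.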
