# How to use the Affinity Propagation clustering algorithm in Python Scikit-learn?


The Affinity Propagation clustering algorithm is based on the concept of ‘message passing’ between pairs of samples, repeated until convergence. It does not require the number of clusters to be specified before running the algorithm. One of its biggest disadvantages is its time complexity, which is of the order $O(N^2 T)$, where $N$ is the number of samples and $T$ is the number of iterations until convergence.
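To make the message-passing idea concrete, here is a minimal NumPy sketch of the responsibility and availability updates. It is an illustration only, not Scikit-learn's implementation: the function name `affinity_propagation_sketch`, the fixed iteration count, the use of negative squared Euclidean distance as the similarity, and the median similarity as the preference are all assumptions made for this example.

```python
import numpy as np

def affinity_propagation_sketch(X, damping=0.6, n_iter=200):
    """Toy message-passing loop (hypothetical helper); returns (exemplars, labels)."""
    n = X.shape[0]
    rows = np.arange(n)

    # N x N similarity matrix: negative squared Euclidean distance (assumed choice)
    diff = X[:, None, :] - X[None, :, :]
    S = -np.sum(diff ** 2, axis=2)
    # Preference on the diagonal: median similarity (assumed choice)
    np.fill_diagonal(S, np.median(S))

    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)

    for _ in range(n_iter):
        # Responsibility: r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        AS = A + S
        best = np.argmax(AS, axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new

        # Availability: a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i, k})
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, np.diag(R))
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag_A = np.diag(A_new).copy()  # a(k,k) is not clipped at zero
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, diag_A)
        A = damping * A + (1 - damping) * A_new

    # Samples with positive a(k,k) + r(k,k) act as exemplars
    exemplars = np.flatnonzero(np.diag(A + R) > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    labels = np.argmax(S[:, exemplars], axis=1)
    labels[exemplars] = np.arange(exemplars.size)
    return exemplars, labels

# Three small, well-separated groups of 2-D points (synthetic toy data)
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
points = np.vstack([c + 0.3 * rng.standard_normal((20, 2)) for c in centers])
exemplars, labels = affinity_propagation_sketch(points)
print("Exemplars found:", exemplars.size)
```

Every iteration reads and writes full $N \times N$ matrices, which is where the $N^2$ factor in the complexity comes from; repeating that for $T$ iterations gives the $N^2 T$ cost.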

Scikit-learn provides the sklearn.cluster.AffinityPropagation class to perform Affinity Propagation clustering in Python.
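As a quick orientation before the full walkthrough, the sketch below fits the class on a small toy dataset and inspects a few fitted attributes. The dataset from `make_blobs` and the variable names `X_demo`, `exemplar_indices` and `demo_labels` are illustrative assumptions, not part of the original example.

```python
from sklearn.cluster import AffinityPropagation
from sklearn.datasets import make_blobs

# Small toy dataset (illustrative only)
X_demo, _ = make_blobs(n_samples=150, centers=3, cluster_std=0.5, random_state=0)

# Note that no number of clusters is passed to the model
model = AffinityPropagation(damping=0.6, max_iter=200, random_state=0)
model.fit(X_demo)

exemplar_indices = model.cluster_centers_indices_  # indices of exemplar samples
demo_labels = model.labels_                        # cluster label of each sample

print("Clusters found:", len(exemplar_indices))
print("Iterations run:", model.n_iter_)
```

The number of clusters reported here is chosen by the algorithm itself, so it may differ from the three centers used to generate the data.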

## Steps

Follow the steps below to perform Affinity Propagation clustering in Python Scikit-learn −

Step 1 − Import necessary libraries.

Step 2 − Set the figure size.

Step 3 − Define a binary classification dataset with 2000 samples, two input features, and one cluster per class.

Step 4 − Create scatter plot for all samples from each class.

Step 5 − Plot the figure.

Step 6 − Define the AffinityPropagation clustering model.

Step 7 − Fit the model.

Step 8 − Assign a cluster to each sample.

Step 9 − Retrieve the unique clusters from all clusters.

Step 10 − Create scatter plot for all samples from each cluster.

Step 11 − Plot the figure.

## Example

For the example below, we create a test binary classification dataset using the make_classification() function. The dataset consists of 2000 samples with two input features and one cluster per class.

# Import necessary libraries
from numpy import unique
from numpy import where
from sklearn.datasets import make_classification
from sklearn.cluster import AffinityPropagation
from matplotlib import pyplot
# %matplotlib inline

# Set the figure size
pyplot.rcParams["figure.figsize"] = [7.16, 3.50]
pyplot.rcParams["figure.autolayout"] = True

# Define binary classification dataset having 2000 samples with two input features and one cluster per class.
X,y = make_classification(n_samples=2000, n_features=2, n_informative=2, n_redundant=0, n_clusters_per_class=1, random_state=4)

# Create scatter plot for all samples from each class
for value in range(2):
    # Get row indexes for the samples of this class
    row = where(y == value)
    # Create scatter plot of these samples
    pyplot.scatter(X[row, 0], X[row, 1])

# Plot the figure
pyplot.title('Classification Dataset', size=18)
pyplot.show()

# Define the AffinityPropagation clustering model
AP_model = AffinityPropagation(damping=0.6)

# Fit the model
AP_model.fit(X)

# Assigning a cluster per sample
yc = AP_model.predict(X)

# Retrieve the unique clusters from all clusters
clusters_AP = unique(yc)

# Create scatter plot for all samples from each cluster
for cluster in clusters_AP:
    # Get row indexes for all samples within this cluster
    row = where(yc == cluster)
    # Create scatter plot of these samples
    pyplot.scatter(X[row, 0], X[row, 1])

# Plot the figure
pyplot.title('Cluster Prediction for Each Example in Dataset', size=18)
pyplot.show()
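Because the number of clusters is not set directly, it can help to see what influences it. The sketch below varies the `preference` parameter of AffinityPropagation and records how many clusters result. It is a rough experiment under stated assumptions: a smaller 300-sample version of the dataset is used to keep the run short, and the scaling factors (1, 10, 50) and the `cluster_counts` dictionary are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import euclidean_distances

# Smaller version of the dataset (300 samples) to keep the run short
X_small, _ = make_classification(n_samples=300, n_features=2, n_informative=2,
                                 n_redundant=0, n_clusters_per_class=1,
                                 random_state=4)

# With the default affinity, similarity is the negative squared Euclidean distance
similarities = -euclidean_distances(X_small, squared=True)
median_similarity = np.median(similarities)

cluster_counts = {}
for factor in (1, 10, 50):
    # More negative preference values make samples less willing to be exemplars
    ap = AffinityPropagation(damping=0.9, preference=factor * median_similarity,
                             max_iter=500, random_state=0)
    sample_labels = ap.fit_predict(X_small)
    cluster_counts[factor] = len(np.unique(sample_labels))

for factor, count in cluster_counts.items():
    print(f"preference = {factor} x median similarity -> {count} cluster(s)")
```

More negative preferences usually yield fewer clusters, but the exact counts depend on the data and on whether the run converges, so treat the printed numbers as exploratory.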


## Output

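Running the code displays two scatter plots: the first shows the samples colored by class, and the second shows them colored by the cluster that Affinity Propagation assigned.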

Updated on 04-Oct-2022 08:46:43