SciPy - single() Method



The SciPy single() method performs single (also called minimum or nearest) linkage clustering on a condensed distance matrix. Under single linkage, the distance between two clusters is defined as the shortest distance between any point in one cluster and any point in the other.
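The "shortest distance between two clusters" rule can be sketched by hand. The snippet below is only an illustration: the two clusters and their coordinates are made-up values, and cdist() is used here just to list every pairwise distance. It is not how single() works internally.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Two hypothetical clusters of 2-D points (illustrative values only)
cluster_a = np.array([[0.0, 0.0], [1.0, 0.0]])
cluster_b = np.array([[4.0, 0.0], [6.0, 0.0]])

# Every pairwise Euclidean distance between a point of A and a point of B
pairwise = cdist(cluster_a, cluster_b)

# Single linkage: the cluster-to-cluster distance is the smallest of these
single_link_distance = pairwise.min()
print(single_link_distance)  # 3.0, the gap between (1, 0) and (4, 0)
```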

In data science, this method is used for cluster analysis, with applications such as pattern recognition, grouping items, and anomaly detection.

Syntax

Following is the syntax of the SciPy single() method −

single(y)
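As a quick sanity check on the call shape, the sketch below compares single(y) with the more general linkage() function called with method='single'. The input values are the same condensed distances used in Example 1. The two results are expected to match.

```python
import numpy as np
from scipy.cluster.hierarchy import single, linkage

# Condensed distances for 4 observations (same values as Example 1)
y = np.array([0.5, 0.2, 0.3, 0.4, 0.8, 0.6])

result = single(y)                      # dedicated single-linkage function
general = linkage(y, method='single')   # general function, same method

# Compare the two linkage matrices element by element
same = np.allclose(result, general)
print(same)
```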

Parameters

This function accepts a single parameter:

  • y: The condensed distance matrix, a one-dimensional array holding the upper triangle of the pairwise distance matrix, in the form returned by pdist().
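To make the condensed form concrete, here is a small sketch that expands a condensed array into the full square distance matrix with squareform(). The values are the ones from Example 1, and the pair ordering noted in the comment is the standard one produced by pdist().

```python
import numpy as np
from scipy.spatial.distance import squareform

# Condensed form for 4 observations, in the order
# d(0,1), d(0,2), d(0,3), d(1,2), d(1,3), d(2,3)
y = np.array([0.5, 0.2, 0.3, 0.4, 0.8, 0.6])

# Expand to the symmetric 4 x 4 matrix with zeros on the diagonal
square = squareform(y)
print(square)

# n observations always give n * (n - 1) / 2 condensed entries
n = square.shape[0]
print(len(y) == n * (n - 1) // 2)
```

Calling squareform() on the square matrix converts it back to the condensed array, which is the form single() expects.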

Return value

This method returns the hierarchical clustering encoded as a linkage matrix (stored in the variable result in the examples below).
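The linkage matrix can be inspected directly before plotting. The sketch below reuses the condensed distances from Example 1 and prints the matrix. For n observations it has n - 1 rows and 4 columns: the two clusters merged, the distance at which they merged, and the number of original observations in the new cluster.

```python
import numpy as np
from scipy.cluster.hierarchy import single

# Condensed distances for 4 observations (same values as Example 1)
y = np.array([0.5, 0.2, 0.3, 0.4, 0.8, 0.6])

result = single(y)
print(result.shape)   # (3, 4): one row per merge

# Column 2 holds the merge distances, column 3 the size of each new cluster
merge_distances = result[:, 2]
cluster_sizes = result[:, 3]
print(merge_distances)
print(cluster_sizes)
```

For this input the merges happen at distances 0.2, 0.3 and 0.4, and the clusters grow to 2, 3 and 4 observations.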

Example 1

The following example uses the SciPy single() method to perform single linkage on a condensed distance matrix and plots the resulting dendrogram.

import numpy as np
from scipy.cluster.hierarchy import single, dendrogram
import matplotlib.pyplot as plt

# Condensed distance matrix for 4 observations (6 pairwise distances)
y = np.array([0.5, 0.2, 0.3, 0.4, 0.8, 0.6])

# single linkage clustering
result = single(y)

# Plot the dendrogram
plt.figure(figsize=(6, 4))
dendrogram(result)
plt.title('Dendrogram - Single Linkage')
plt.xlabel('indexes')
plt.ylabel('Distance')
plt.show()

Output

The above code produces the following result −

[Figure: dendrogram titled 'Dendrogram - Single Linkage']

Example 2

The example below demonstrates single linkage clustering on random data.

import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import single, dendrogram
import matplotlib.pyplot as plt

# Generate random data
data = np.random.rand(5, 2)

# calculate the distance matrix
y = pdist(data, metric='euclidean')

# single linkage clustering
result = single(y)

# Plot the dendrogram
plt.figure(figsize=(6, 4))
dendrogram(result)
plt.title('Dendrogram - Single Linkage on Random Data')
plt.xlabel('indexes')
plt.ylabel('Distance')
plt.show()

Output

The above code produces the following result −

[Figure: dendrogram titled 'Dendrogram - Single Linkage on Random Data']

Example 3

In this example, we use the cityblock metric to calculate the distance matrix from a given dataset and then apply single linkage clustering. We then use the dendrogram() function to plot and visualize the clustering.

import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import single, dendrogram
import matplotlib.pyplot as plt

# Given data
data = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])

# calculate the condensed distance matrix using the cityblock (Manhattan) metric
y = pdist(data, metric='cityblock')

# single linkage clustering
result = single(y)

# Plot the dendrogram
plt.figure(figsize=(6, 4))
dendrogram(result)
plt.title('Dendrogram - Single Linkage with Cityblock Distance')
plt.xlabel('indexes')
plt.ylabel('Distance')
plt.show()

Output

The above code produces the following result −

[Figure: dendrogram titled 'Dendrogram - Single Linkage with Cityblock Distance']
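As a check on the cityblock distances in Example 3, the sketch below recomputes them by hand as the sum of absolute coordinate differences and compares them with the pdist() output. The manual loop is only an illustration and is not part of the SciPy API.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import single

# Same data as Example 3
data = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])

# Condensed cityblock distances from SciPy
y = pdist(data, metric='cityblock')

# Manual version: sum of |difference| per coordinate for each pair i < j
n = len(data)
manual = np.array([np.abs(data[i] - data[j]).sum()
                   for i in range(n) for j in range(i + 1, n)])

print(y)                        # [ 4.  8. 12.  4.  8.  4.]
print(np.allclose(y, manual))   # True

# Merge heights produced by single linkage on these distances
heights = single(y)[:, 2]
print(heights)
```

Because every point's nearest neighbour is at cityblock distance 4, all three single linkage merges occur at height 4, so the dendrogram for this dataset is flat.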