# How to use DBSCAN clustering algorithm in Python Scikit-learn?

**DBSCAN** stands for Density-Based Spatial Clustering of Applications with Noise. The algorithm is based on the intuitive notions of "clusters" and "noise": clusters are dense regions of data points in the data space, separated by regions of lower density. Points that lie in those low-density regions and do not belong to any cluster are treated as noise.
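To make the idea of clusters and noise concrete, here is a minimal sketch on a hand-made toy dataset. The points and the eps and min_samples values are assumptions chosen purely for illustration. Scikit-learn marks noise points with the label -1.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy data (hypothetical): two tight groups of points plus one isolated point
X_toy = np.array([
    [1.0, 1.0], [1.1, 1.0], [1.0, 1.1], [1.1, 1.1],  # dense region A
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1],  # dense region B
    [9.0, 0.0],                                      # isolated point
])

# Illustrative parameter values, not recommendations
toy_labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X_toy)

# Each dense region gets its own cluster label; the isolated point is labelled -1
print(toy_labels)  # [ 0  0  0  0  1  1  1  1 -1]
```

The two dense regions are separated by empty space, so they become separate clusters, while the isolated point has too few neighbours within eps to join either one.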

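**Scikit-learn** provides the sklearn.cluster.DBSCAN class to perform DBSCAN clustering. The algorithm uses two important parameters, min_samples and eps, to define what "dense" means. The parameter eps is the maximum distance between two samples for one to be considered in the neighborhood of the other, and min_samples is the number of samples required in a point's neighborhood (including the point itself) for it to count as a core point. A higher value of min_samples or a lower value of eps means that a higher density of data points is required to form a cluster.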
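The effect of the two parameters can be sketched by counting how many points end up as noise under different settings. The dataset (generated with make_blobs), the helper count_noise, and all parameter values below are assumptions made for illustration only.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import DBSCAN

# Hypothetical demo data: three Gaussian blobs
X_demo, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

def count_noise(eps, min_samples):
    """Return the number of points DBSCAN labels as noise (-1)."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X_demo)
    return int(np.sum(labels == -1))

# Lowering eps (with min_samples fixed) demands a higher density
noise_by_eps = [count_noise(eps, 5) for eps in (0.2, 0.4, 0.8)]

# Raising min_samples (with eps fixed) also demands a higher density
noise_by_min_samples = [count_noise(0.4, m) for m in (3, 5, 15)]

print("noise count for eps = 0.2, 0.4, 0.8:", noise_by_eps)
print("noise count for min_samples = 3, 5, 15:", noise_by_min_samples)
```

The noise count never increases as eps grows, and never decreases as min_samples grows, because a stricter density requirement can only turn more points into noise.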

## Steps

We can follow the steps given below to perform DBSCAN clustering in Python Scikit-learn −

**Step 1** − Import the required libraries.

**Step 2** − Set the figure size.

**Step 3** − Define a binary classification dataset having 2000 samples with two input features and one cluster per class.

**Step 4** − Create a scatter plot for all samples from each class.

**Step 5** − Define the DBSCAN clustering model.

**Step 6** − Fit the model and predict a cluster label for each sample.

**Step 7** − Retrieve the unique cluster labels.

**Step 8** − Create a scatter plot for all samples from each cluster.

**Step 9** − Plot the figure.

## Example

For the example below, we create a test binary classification dataset by using the make_classification() function. The dataset consists of 2000 samples with two input features and one cluster per class.

    # Import required libraries
    from numpy import unique
    from numpy import where
    from sklearn.datasets import make_classification
    from sklearn.cluster import DBSCAN
    from matplotlib import pyplot
    %matplotlib inline

    # Set the figure size
    pyplot.rcParams["figure.figsize"] = [7.16, 3.50]
    pyplot.rcParams["figure.autolayout"] = True

    # Define binary classification dataset having 2000 samples
    # with two input features and one cluster per class.
    X, y = make_classification(n_samples=2000, n_features=2,
                               n_informative=2, n_redundant=0,
                               n_clusters_per_class=1, random_state=4)

    # Create scatter plot for all samples from each class
    for value in range(2):
        # Getting row indexes for samples
        row = where(y == value)
        # Creating scatter plot of all the samples
        pyplot.scatter(X[row, 0], X[row, 1])

    # Plot the figure
    pyplot.title('Classification Dataset', size='18')
    pyplot.show()

    # Define the DBSCAN clustering model
    DBSCAN_model = DBSCAN(eps=0.50, min_samples=10)

    # Fit the model
    yc = DBSCAN_model.fit_predict(X)

    # Retrieve the unique clusters from all clusters
    clusters_AC = unique(yc)

    # Create scatter plot for all samples from each cluster
    for cluster in clusters_AC:
        # Getting row indexes for all samples within this cluster
        row = where(yc == cluster)
        # Creating scatter plot of all the samples
        pyplot.scatter(X[row, 0], X[row, 1])

    # Plot the figure
    pyplot.title('Cluster Prediction for Each Example in Dataset', size='18')
    pyplot.show()
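Besides plotting, it can help to summarize the fitted result numerically. The following sketch rebuilds the same dataset and model as in the example and reports the number of clusters, noise points and core samples. The variable names are chosen here for illustration, and no particular counts are assumed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.cluster import DBSCAN

# Same dataset and model settings as in the example above
X, y = make_classification(n_samples=2000, n_features=2,
                           n_informative=2, n_redundant=0,
                           n_clusters_per_class=1, random_state=4)
model = DBSCAN(eps=0.50, min_samples=10)
labels = model.fit_predict(X)

# Noise points carry the label -1; every other label is a cluster id
n_noise = int(np.sum(labels == -1))
cluster_ids = [int(c) for c in np.unique(labels) if c != -1]
cluster_sizes = {c: int(np.sum(labels == c)) for c in cluster_ids}

# Indices of the core samples found while fitting
n_core = len(model.core_sample_indices_)

print("clusters found:", len(cluster_ids))
print("cluster sizes:", cluster_sizes)
print("noise points:", n_noise)
print("core samples:", n_core)
```

If almost every point is reported as noise, or everything falls into one cluster, that is usually a sign that eps or min_samples needs adjusting for the data at hand.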

## Output

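It will produce two scatter plots. The first, titled "Classification Dataset", shows the samples grouped by their original class. The second, titled "Cluster Prediction for Each Example in Dataset", shows the same samples grouped by the cluster label assigned by DBSCAN, with any noise points (label -1) drawn as a separate group.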
