Gaurav Leekha has Published 20 Answers

How to use DBSCAN clustering algorithm in Python Scikit-learn?

Gaurav Leekha

Updated on 04-Oct-2022 08:48:56

DBSCAN stands for Density-Based Spatial Clustering of Applications with Noise. The algorithm is based on the intuitive notions of “clusters” and “noise”: clusters are dense regions of the data space, separated by regions with a lower density of data points. Scikit-learn provides the sklearn.cluster.DBSCAN class to perform DBSCAN ... Read More
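
As a minimal sketch of the idea described above, the snippet below runs sklearn.cluster.DBSCAN on a small toy dataset. The data points and the eps and min_samples values are assumptions chosen only for illustration; they are not taken from the article.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy data (assumed for illustration): two tight groups and one isolated point.
X = np.array([
    [1.0, 1.0], [1.1, 1.0], [1.0, 1.1], [1.1, 1.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1],
    [9.0, 0.0],
])

# eps is the neighbourhood radius; min_samples is the number of points
# (including the point itself) needed to form a dense region.
model = DBSCAN(eps=0.5, min_samples=3)
labels = model.fit_predict(X)

# DBSCAN marks noise points with the label -1.
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
n_noise = int(np.sum(labels == -1))

print("Labels:", labels)
print("Clusters found:", n_clusters)
print("Noise points:", n_noise)
```

Here the isolated point has no neighbours within eps, so it ends up labelled as noise rather than being forced into a cluster.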

How to use Affinity Propagation clustering algorithm in Python Scikit-learn?

Gaurav Leekha

Updated on 04-Oct-2022 08:46:43

The Affinity Propagation clustering algorithm is based on the concept of ‘message passing’ between pairs of samples until convergence. It does not require the number of clusters to be specified before running the algorithm. One of the biggest disadvantages of this algorithm is its time complexity, which is of the ... Read More
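
The following is a small sketch, not the article's own code, showing that sklearn.cluster.AffinityPropagation can be fitted without passing a number of clusters. The synthetic blobs and the random_state values are assumptions made for reproducibility.

```python
from sklearn.cluster import AffinityPropagation
from sklearn.datasets import make_blobs

# Synthetic data (assumed for illustration): 60 points around 3 centres.
X, _ = make_blobs(n_samples=60, centers=3, cluster_std=0.5, random_state=0)

# No n_clusters argument: the algorithm picks exemplars on its own.
model = AffinityPropagation(random_state=0)
labels = model.fit_predict(X)

# Indices of the samples chosen as exemplars (cluster centres).
exemplars = model.cluster_centers_indices_

print("Number of clusters found:", len(exemplars))
print("First ten labels:", labels[:10])
```

The number of clusters found depends on the preference and damping parameters, which are left at their defaults in this sketch.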

How to use K-Means clustering algorithm in Python Scikit-learn?

Gaurav Leekha

Updated on 04-Oct-2022 08:43:08

The K-Means clustering algorithm computes centroids and iterates until it finds the optimal centroids. It requires the number of clusters to be specified in advance; that is, it assumes the number of clusters is already known. The main logic of this algorithm is to cluster the data by separating the samples into n groups of ... Read More
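
A minimal sketch of the point above, assuming a tiny hand-made dataset with two obvious groups: the number of clusters is passed explicitly through n_clusters, and the fitted model exposes the centroids it converged to.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data (assumed for illustration): two well-separated groups.
X = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.8, 1.2],
    [8.0, 8.0], [8.2, 7.8], [7.8, 8.2],
])

# The number of clusters must be given up front.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X)

# Final centroids after the iterations have converged.
centroids = model.cluster_centers_

print("Labels:", labels)
print("Centroids:\n", centroids)
```

Which group receives label 0 and which receives label 1 is arbitrary; only the grouping itself is meaningful.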

How to implement linear classification with Python Scikit-learn?

Gaurav Leekha

Updated on 04-Oct-2022 08:40:49

Linear classification is one of the simplest machine learning problems. To implement linear classification, we will be using sklearn’s SGD (Stochastic Gradient Descent) classifier to predict the Iris flower species. Steps: You can follow the steps given below to implement linear classification with Python Scikit-learn: Step 1: First ... Read More
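
Since the steps are cut off in this preview, the snippet below is only one possible sketch of linear classification on Iris with SGDClassifier. The train/test split, the feature scaling step, and the random_state values are assumptions and may differ from the article's steps.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load the Iris measurements and species labels.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples for evaluation (assumed split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# SGD is sensitive to feature scale, so standardise before fitting.
clf = make_pipeline(StandardScaler(), SGDClassifier(random_state=0))
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print("Test accuracy:", accuracy)
```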

How to transform Scikit-learn IRIS dataset to 2-feature dataset in Python?

Gaurav Leekha

Updated on 04-Oct-2022 08:38:18

Iris, a multivariate flower dataset, is one of the most useful Python scikit-learn datasets. It has 3 classes of 50 instances each and contains measurements of the sepal and petal parts of three Iris species, namely Iris setosa, Iris virginica, and Iris versicolor. Along with that, the Iris dataset contains ... Read More
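
The dataset structure described above can be checked with a short sketch like the following. It only inspects the dataset shipped with scikit-learn and does not attempt the 2-feature transformation, whose method is not shown in this preview.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()

# 150 samples with 4 measurements each (sepal and petal length and width).
print("Data shape:", iris.data.shape)
print("Feature names:", iris.feature_names)

# Three species, 50 samples per class.
class_counts = np.bincount(iris.target)
print("Class names:", list(iris.target_names))
print("Samples per class:", class_counts)
```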

How to transform Sklearn DIGITS dataset to 2 and 3-feature dataset in Python?

Gaurav Leekha

Updated on 04-Oct-2022 08:35:06

The Sklearn DIGITS dataset has 64 features, as each digit image is 8 by 8 pixels in size. We can use Principal Component Analysis (PCA) to transform the Scikit-learn DIGITS dataset into a new feature space with 2 features. Transforming a 64-feature dataset into a 2-feature dataset will be a big reduction ... Read More
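
A minimal sketch of the transformation described above, using sklearn.decomposition.PCA with 2 and then 3 components. Variable names such as X_2d and X_3d are chosen here for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Each 8x8 digit image is flattened into 64 pixel features.
X, y = load_digits(return_X_y=True)
print("Original shape:", X.shape)

# Project the 64 features onto the first 2 principal components.
X_2d = PCA(n_components=2).fit_transform(X)
print("2-feature shape:", X_2d.shape)

# Same idea with 3 principal components.
X_3d = PCA(n_components=3).fit_transform(X)
print("3-feature shape:", X_3d.shape)
```

The number of samples stays the same; only the number of columns per sample is reduced.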

How to perform dimensionality reduction using Python Scikit-learn?

Gaurav Leekha

Updated on 04-Oct-2022 08:32:09

Dimensionality reduction, an unsupervised machine learning method, is used to reduce the number of feature variables for each data sample by selecting a set of principal features. Principal Component Analysis (PCA) is one of the popular algorithms for dimensionality reduction available in Sklearn. In this tutorial, we perform dimensionality reduction using principal ... Read More

How to implement Random Projection using Python Scikit-learn?

Gaurav Leekha

Updated on 04-Oct-2022 08:29:24

Random projection is a dimensionality reduction and data visualization method that simplifies high-dimensional data. It is typically applied to data where other dimensionality reduction techniques, such as Principal Component Analysis (PCA), cannot do justice to the data. Python Scikit-learn provides a module named sklearn.random_projection ... Read More
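
As a rough sketch of how the sklearn.random_projection module can be used, the snippet below applies GaussianRandomProjection to randomly generated data. The data dimensions and the choice of 50 output components are assumptions made for illustration.

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection

# Synthetic high-dimensional data (assumed): 100 samples, 1000 features.
rng = np.random.RandomState(0)
X = rng.rand(100, 1000)

# Project onto 50 dimensions using a random Gaussian matrix.
projector = GaussianRandomProjection(n_components=50, random_state=0)
X_projected = projector.fit_transform(X)

print("Original shape:", X.shape)
print("Projected shape:", X_projected.shape)
```

The n_components value is set explicitly here; the default 'auto' setting derives it from the number of samples and can ask for more dimensions than a small dataset has.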

How to build Naive Bayes classifiers using Python Scikit-learn?

Gaurav Leekha

Updated on 04-Oct-2022 08:25:42

Naïve Bayes classification, based on Bayes' theorem of probability, is the process of predicting the category of previously unseen data. Scikit-learn's Naïve Bayes models include Gaussian Naïve Bayes, Bernoulli Naïve Bayes, and Multinomial Naïve Bayes. In this tutorial, we will learn Gaussian Naïve Bayes and Bernoulli ... Read More
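
The two models named above can be sketched as follows. GaussianNB is fitted on the Iris dataset, and BernoulliNB on a tiny binary dataset that is made up here purely for illustration; neither dataset choice is confirmed by the preview.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import BernoulliNB, GaussianNB

# Gaussian Naive Bayes: suited to continuous features such as Iris measurements.
X, y = load_iris(return_X_y=True)
gnb = GaussianNB().fit(X, y)
gnb_accuracy = gnb.score(X, y)  # accuracy on the training data itself
print("GaussianNB training accuracy:", gnb_accuracy)

# Bernoulli Naive Bayes: suited to binary (0/1) features.
# Toy data (assumed): the third feature is 1 for class 1 and 0 for class 0.
X_bin = np.array([
    [1, 0, 1], [1, 1, 1], [0, 0, 1],
    [0, 1, 0], [0, 0, 0], [1, 0, 0],
])
y_bin = np.array([1, 1, 1, 0, 0, 0])
bnb = BernoulliNB().fit(X_bin, y_bin)
pred = bnb.predict(np.array([[1, 0, 1]]))
print("BernoulliNB prediction for [1, 0, 1]:", pred[0])
```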

How to create a random forest classifier using Python Scikit-learn?

Gaurav Leekha

Updated on 04-Oct-2022 08:22:46

Random forest is a supervised machine learning algorithm used for classification, regression, and other tasks; it works by building decision trees on data samples. After creating the decision trees, a random forest classifier collects the prediction from each of them and selects the best solution by means of voting. One ... Read More
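
A minimal sketch of a random forest classifier in scikit-learn, assuming the Iris dataset and a simple train/test split (both chosen here for illustration). Note that scikit-learn's RandomForestClassifier combines its trees by averaging their predicted class probabilities, which plays the role of the voting step described above.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples for evaluation (assumed split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Build 100 decision trees, each trained on a bootstrap sample of the data.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

n_trees = len(forest.estimators_)
accuracy = forest.score(X_test, y_test)

print("Number of trees:", n_trees)
print("Test accuracy:", accuracy)
```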
