
Scikit Learn - Bernoulli Naïve Bayes
Bernoulli Naïve Bayes is another useful naïve Bayes model. It assumes that the features are binary (0s and 1s) in nature. A typical application of Bernoulli Naïve Bayes classification is text classification with the ‘bag of words’ model, where each feature records whether a word occurs in a document. Scikit-learn provides sklearn.naive_bayes.BernoulliNB to implement the Bernoulli Naïve Bayes algorithm for classification.
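
The text-classification use case can be sketched as follows. This is a minimal illustration, not a realistic pipeline: the four-document corpus, the labels and the variable names are made up for the example, and CountVectorizer with binary=True is used here as one way to obtain presence/absence features.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB

# Tiny made-up corpus (illustrative only): 1 = spam, 0 = not spam.
docs = [
    "win money now",
    "cheap money offer",
    "meeting schedule today",
    "project meeting notes",
]
labels = [1, 1, 0, 0]

# binary=True records whether a word is present instead of how often it occurs.
vectorizer = CountVectorizer(binary=True)
X_text = vectorizer.fit_transform(docs)

text_clf = BernoulliNB()
text_clf.fit(X_text, labels)

# Classify an unseen document using the same vocabulary.
new_doc = vectorizer.transform(["cheap money"])
prediction = text_clf.predict(new_doc)
print(prediction)
```

Because the model works on binary features, a word that is absent from a document also contributes evidence, which is one reason this variant is often tried on short texts.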
Parameters
The following table lists the parameters accepted by sklearn.naive_bayes.BernoulliNB −
Sr.No | Parameter & Description |
---|---|
1 | alpha − float, optional, default = 1.0 It represents the additive smoothing parameter. Setting it to 0 means no smoothing. |
2 | binarize − float or None, optional, default = 0.0 It sets the threshold for binarizing the sample features: values greater than the threshold are mapped to 1 and all other values to 0. If set to None, the input is presumed to already consist of binary vectors. |
3 | fit_prior − Boolean, optional, default = True It tells the model whether to learn class prior probabilities or not. If set to False, the algorithm uses a uniform prior. |
4 | class_prior − array-like, shape (n_classes,), optional, default = None This parameter represents the prior probabilities of each class. |
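
The effect of the binarize parameter can be checked with a short sketch. The small count matrix and the variable names below are made up for illustration; the example assumes that fitting with a threshold is equivalent to thresholding the data by hand and passing binarize=None.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Small count-valued feature matrix (made-up values) and labels.
X_counts = np.array([[0, 3, 1],
                     [2, 0, 0],
                     [0, 5, 0],
                     [4, 0, 2]])
y_small = np.array([0, 1, 0, 1])

# Default threshold: values greater than 0.0 become 1, the rest become 0.
clf_threshold = BernoulliNB(binarize=0.0).fit(X_counts, y_small)

# Same thing done by hand, telling the estimator the input is already binary.
X_binary = (X_counts > 0).astype(int)
clf_prebinarized = BernoulliNB(binarize=None).fit(X_binary, y_small)

# Both fits see the same binary data, so the per-class feature counts match.
same_counts = np.array_equal(clf_threshold.feature_count_,
                             clf_prebinarized.feature_count_)
print(same_counts)

# A higher threshold keeps only the larger counts as 1.
clf_high = BernoulliNB(binarize=2.0).fit(X_counts, y_small)
print(clf_high.feature_count_)
```

Choosing the threshold is a modelling decision: with count data, 0.0 means "the feature occurred at all", while a larger value means "the feature occurred more than that many times".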
Attributes
The following table lists the attributes of a fitted sklearn.naive_bayes.BernoulliNB estimator −
Sr.No | Attributes & Description |
---|---|
1 | class_log_prior_ − array, shape (n_classes,) It provides the smoothed log probability for every class. |
2 | class_count_ − array, shape (n_classes,) It provides the number of training samples encountered for each class during fitting. |
3 | feature_log_prob_ − array, shape (n_classes, n_features) It gives the empirical log probability of features given a class, $P(x_i \mid y)$. |
4 | feature_count_ − array, shape (n_classes, n_features) It provides the number of training samples encountered for each (class, feature) pair during fitting. |
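
The attributes above can be inspected after fitting, as in the sketch below. The random binary data and the variable names are illustrative. The hand-computed formula, (feature_count_ + alpha) / (class_count_ + 2 * alpha), is an assumption about how the additive smoothing is applied, so the comparison is meant as a sanity check rather than a specification.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

rng = np.random.RandomState(0)
X_demo = rng.randint(2, size=(6, 4))   # 6 samples, 4 binary features
y_demo = np.array([0, 0, 0, 1, 1, 1])  # 3 samples per class

alpha = 1.0
clf_demo = BernoulliNB(alpha=alpha).fit(X_demo, y_demo)

print(clf_demo.class_count_)            # training samples seen per class
print(clf_demo.feature_count_.shape)    # (n_classes, n_features)
print(np.exp(clf_demo.class_log_prior_))  # class priors as probabilities

# Assumed smoothing rule: add alpha to each count and 2 * alpha to each class total.
manual_log_prob = np.log(
    (clf_demo.feature_count_ + alpha)
    / (clf_demo.class_count_[:, np.newaxis] + 2 * alpha)
)
matches = np.allclose(manual_log_prob, clf_demo.feature_log_prob_)
print(matches)
```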
The methods of sklearn.naive_bayes.BernoulliNB are the same as those we used with sklearn.naive_bayes.GaussianNB.
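
A few of these shared methods are shown in the sketch below. The data is random and only illustrates the calls, so the resulting numbers carry no meaning; the variable names are made up for the example.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

rng = np.random.RandomState(42)
X_bin = rng.randint(2, size=(20, 8))   # 20 samples, 8 binary features
y_bin = np.array([0, 1] * 10)          # two alternating classes

clf_methods = BernoulliNB()
clf_methods.fit(X_bin, y_bin)

# Class probabilities for the first three samples; each row sums to 1.
proba = clf_methods.predict_proba(X_bin[:3])
# The same quantities on a log scale.
log_proba = clf_methods.predict_log_proba(X_bin[:3])
# Mean accuracy on the data passed in (here the training data itself).
accuracy = clf_methods.score(X_bin, y_bin)
# Constructor parameters as a dictionary.
params = clf_methods.get_params()

print(proba.shape, accuracy, params["alpha"])
```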
Implementation Example
The Python script below uses the sklearn.naive_bayes.BernoulliNB class to construct a Bernoulli Naïve Bayes classifier from our data set −
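    import numpy as np
    from sklearn.naive_bayes import BernoulliNB

    # 10 samples with 1000 random integer features in [0, 10), one sample per class
    X = np.random.randint(10, size = (10, 1000))
    y = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

    # With the default binarize = 0.0, every value greater than 0 is treated as 1
    BNBclf = BernoulliNB()
    BNBclf.fit(X, y)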
Output
BernoulliNB(alpha = 1.0, binarize = 0.0, class_prior = None, fit_prior = True)
Once fitted, the model can predict the class of new samples with the predict() method as follows −
Example
    print(BNBclf.predict(X[0:5]))
Output
[1 2 3 4 5]