Scikit Learn - Logistic Regression
Logistic regression, despite its name, is a classification algorithm rather than a regression algorithm. Given a set of independent variables, it estimates a discrete value (0 or 1, yes/no, true/false). It is also called the logit or MaxEnt classifier.
It models the relationship between a categorical dependent variable and one or more independent variables by estimating the probability that an event occurs, using the logistic function.
sklearn.linear_model.LogisticRegression is the class used to implement logistic regression.
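The idea can be sketched in a few lines of plain Python. This is only an illustration of the logistic function, not how scikit-learn implements it; the helper names and the weights below are hypothetical and were chosen purely for the example.

```python
import math

def logistic(z):
    """Map a real-valued score z to a probability in (0, 1)."""
    # Two branches avoid overflow in math.exp for large |z|
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def positive_class_probability(features, weights, bias):
    """Probability of the positive class for one sample (hypothetical helper)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return logistic(z)

# Hypothetical weights and sample, for illustration only
p = positive_class_probability([2.0, 1.0], weights=[0.5, -1.0], bias=0.0)

# A common convention: predict class 1 when the probability is at least 0.5
label = 1 if p >= 0.5 else 0
print(p, label)
```

A linear score of zero corresponds to a probability of 0.5, which is the usual decision boundary between the two classes.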
Parameters
The following table lists the parameters of the LogisticRegression class −
Sr.No  Parameter & Description 

1 
penalty − str, ‘l1’, ‘l2’, ‘elasticnet’ or ‘none’, optional, default = ‘l2’ This parameter specifies the norm (L1 or L2) used in penalization (regularization). 
2 
dual − Boolean, optional, default = False It selects the dual or primal formulation. The dual formulation is only implemented for the L2 penalty with the liblinear solver. 
3 
tol − float, optional, default = 1e-4 It represents the tolerance for the stopping criteria. 
4 
C − float, optional, default = 1.0 It represents the inverse of regularization strength and must be a positive float. Smaller values specify stronger regularization. 
5 
fit_intercept − Boolean, optional, default = True This parameter specifies that a constant (bias or intercept) should be added to the decision function. 
6 
intercept_scaling − float, optional, default = 1 This parameter is useful only when the solver ‘liblinear’ is used and fit_intercept is set to True.

7 
class_weight − dict or ‘balanced’, optional, default = None It represents the weights associated with classes. With the default option, all classes are given weight one. With class_weight = ‘balanced’, the values of y are used to automatically adjust the weights inversely proportional to the class frequencies. 
8 
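random_state − int, RandomState instance or None, optional, default = None This parameter represents the seed of the pseudo-random number generator used while shuffling the data. The options are as follows − int: random_state is the seed used by the random number generator; RandomState instance: random_state is the random number generator itself; None: the random number generator is the RandomState instance used by np.random.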

9 
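solver − str, {‘newton-cg’, ‘lbfgs’, ‘liblinear’, ‘sag’, ‘saga’}, optional, default = ‘liblinear’ (‘lbfgs’ from scikit-learn 0.22 onward) This parameter selects the algorithm used in the optimization problem. The properties of the options are as follows − ‘liblinear’ is a good choice for small datasets and handles multiclass problems only with a one-vs-rest scheme; ‘newton-cg’, ‘lbfgs’, ‘sag’ and ‘saga’ can handle the multinomial loss; ‘sag’ and ‘saga’ are faster for large datasets; ‘newton-cg’, ‘lbfgs’ and ‘sag’ support only the L2 penalty or no penalty, ‘liblinear’ and ‘saga’ also support the L1 penalty, and only ‘saga’ supports ‘elasticnet’.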

10 
max_iter − int, optional, default = 100 As the name suggests, it represents the maximum number of iterations allowed for the solvers to converge. 
11 
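multi_class − str, {‘ovr’, ‘multinomial’, ‘auto’}, optional, default = ‘ovr’ (‘auto’ from scikit-learn 0.22 onward) With ‘ovr’, a binary problem is fit for each label. With ‘multinomial’, the loss minimized is the multinomial loss fit across the entire probability distribution. ‘multinomial’ is not available with the liblinear solver.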

12 
verbose − int, optional, default = 0 By default, the value of this parameter is 0. For the liblinear and lbfgs solvers, set verbose to any positive number to enable verbose output. 
13 
warm_start − bool, optional, default = False With this parameter set to True, the solution of the previous call to fit is reused as initialization. With the default, False, the previous solution is erased. It has no effect with the liblinear solver. 
14 
n_jobs − int or None, optional, default = None If multi_class = ‘ovr’, this parameter represents the number of CPU cores used when parallelizing over classes. It is ignored when solver = ‘liblinear’. 
15 
l1_ratio − float or None, optional, default = None It is used only when penalty = ‘elasticnet’. It is the ElasticNet mixing parameter, with 0 <= l1_ratio <= 1. 
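To make the class_weight = ‘balanced’ option more concrete, the following sketch reproduces the heuristic described in the scikit-learn documentation, n_samples / (n_classes * count of each class), in plain Python. The helper name balanced_class_weights and the toy labels are hypothetical and serve only as an illustration.

```python
from collections import Counter

def balanced_class_weights(y):
    """Weights inversely proportional to class frequencies (hypothetical helper).

    Uses the heuristic n_samples / (n_classes * class_count).
    """
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}

# Toy imbalanced labels: six samples of class 0, two of class 1
y = [0, 0, 0, 0, 0, 0, 1, 1]
weights = balanced_class_weights(y)
print(weights)
```

The rare class receives the larger weight, so errors on it count more during fitting. A dictionary of this form can also be passed directly as class_weight.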
Attributes
The following table lists the attributes of the LogisticRegression class −
Sr.No  Attributes & Description 

1 
coef_ − array, shape (1, n_features) or (n_classes, n_features) It holds the coefficients of the features in the decision function. When the given problem is binary, it is of the shape (1, n_features). 
2 
intercept_ − array, shape (1,) or (n_classes,) It represents the constant, also known as bias, added to the decision function. 
3 
classes_ − array, shape(n_classes) It will provide a list of class labels known to the classifier. 
4 
n_iter_ − array, shape (n_classes,) or (1,) It holds the actual number of iterations for all the classes. For binary or multinomial problems it has a single element. 
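As a quick check of the attributes listed above, the following sketch fits a classifier on the iris dataset and inspects them. It assumes a reasonably recent scikit-learn installation; max_iter = 1000 is an arbitrary choice made here only to give the default solver enough iterations to converge.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Iris has 4 features and 3 classes
X, y = load_iris(return_X_y=True)

# max_iter is raised only so the solver has room to converge (assumption)
clf = LogisticRegression(max_iter=1000).fit(X, y)

print(clf.classes_)          # class labels seen during fit
print(clf.coef_.shape)       # (n_classes, n_features), here (3, 4)
print(clf.intercept_.shape)  # (n_classes,), here (3,)
print(clf.n_iter_)           # iterations actually run by the solver
```

Note that these attributes exist only after fit has been called; accessing them on an unfitted estimator raises an error.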
Implementation Example
The following Python script provides a simple example of implementing logistic regression on the iris dataset of scikit-learn −
from sklearn import linear_model
from sklearn.datasets import load_iris

# Load the iris features (X) and class labels (y)
X, y = load_iris(return_X_y = True)

# Fit a logistic regression classifier with the liblinear solver
LRG = linear_model.LogisticRegression(
   random_state = 0, solver = 'liblinear', multi_class = 'auto'
).fit(X, y)

# Mean accuracy on the data the model was trained on
LRG.score(X, y)
Output
0.96
The output shows that the above Logistic Regression model achieved an accuracy of 96 percent on the data it was trained on.
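Accuracy measured on the training data tends to be optimistic. A minimal sketch of a held-out evaluation is given below; the 70/30 split, random_state = 0 and max_iter = 1000 are assumptions made for this illustration and are not part of the example above. No expected output is shown because the exact score depends on the split and the scikit-learn version.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples, keeping class proportions equal in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Fit on the training part only
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Mean accuracy on samples the model has not seen
test_accuracy = clf.score(X_test, y_test)
print(test_accuracy)

# Estimated class probabilities for the first three test samples
proba = clf.predict_proba(X_test[:3])
print(proba)
```

Each row of the predict_proba output sums to one, and the predicted class is the one with the highest probability.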