
Scikit Learn - LASSO
LASSO (Least Absolute Shrinkage and Selection Operator)
LASSO is a regularisation technique that applies an L1 penalty. It modifies the loss function by adding a penalty (the shrinkage term) equal to the sum of the absolute values of the coefficients, scaled by the constant α.
$$\displaystyle\sum\limits_{j=1}^m\left(Y_{j}-W_{0}-\displaystyle\sum\limits_{i=1}^nW_{i}X_{ji} \right)^{2}+\alpha\displaystyle\sum\limits_{i=1}^n| W_i|=\text{loss function}+\alpha\displaystyle\sum\limits_{i=1}^n|W_i|$$
Here m is the number of samples, n is the number of features, W_0 is the intercept and W_i are the coefficients.
sklearn.linear_model.Lasso is a linear model with an added L1 regularisation term, used to estimate sparse coefficients.
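The penalised loss above can be evaluated by hand to see how the two terms combine. The following is a minimal sketch in plain Python; the function name `lasso_loss`, the toy data, and the weights `w0 = 0.75`, `w = [0.25, 0.0]` are illustrative assumptions, not part of scikit-learn. Note that scikit-learn's documented objective additionally scales the squared-error term by 1 / (2 * n_samples), so this value is not the quantity the library minimises internally.

```python
def lasso_loss(X, y, w0, w, alpha):
    """Residual sum of squares plus alpha times the L1 norm of the weights."""
    rss = 0.0
    for x_row, target in zip(X, y):
        # Linear prediction: intercept plus weighted sum of the features.
        prediction = w0 + sum(w_i * x_i for w_i, x_i in zip(w, x_row))
        rss += (target - prediction) ** 2
    # L1 penalty: alpha times the sum of absolute coefficient values.
    l1_penalty = alpha * sum(abs(w_i) for w_i in w)
    return rss + l1_penalty


# Hypothetical toy data and weights, chosen only for illustration.
X = [[0, 0], [1, 1], [2, 2]]
y = [0, 1, 2]
loss_value = lasso_loss(X, y, w0=0.75, w=[0.25, 0.0], alpha=0.5)
print(loss_value)  # 1.125 (squared error) + 0.125 (penalty) = 1.25
```

The intercept is left out of the penalty, which matches the formula: only the coefficients W_1 ... W_n are shrunk.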
Parameters
The following table lists the parameters used by the Lasso module −
Sr.No | Parameter & Description |
---|---|
1 | alpha − float, optional, default = 1.0. The constant that multiplies the L1 term. It is the tuning parameter that controls how strongly the model is penalized. |
2 | fit_intercept − Boolean, optional, default = True. Specifies whether a constant (bias or intercept) is added to the decision function. If set to False, no intercept is used in the calculation. |
3 | tol − float, optional, default = 0.0001. The tolerance for the optimization. If the coefficient updates are smaller than tol, the optimizer checks the dual gap for optimality and continues until the gap is smaller than tol. |
4 | normalize − Boolean, optional, default = False. If set to True, the regressors X are normalized before regression by subtracting the mean and dividing by the L2 norm. It is ignored when fit_intercept = False. This parameter was deprecated in scikit-learn 1.0 and removed in 1.2. |
5 | copy_X − Boolean, optional, default = True. By default X is copied. If set to False, X may be overwritten. |
6 | max_iter − int, optional, default = 1000. The maximum number of iterations of the coordinate descent solver. |
7 | precompute − True, False or array-like, default = False. Decides whether a precomputed Gram matrix is used to speed up the calculation. |
8 | warm_start − bool, optional, default = False. If set to True, the solution of the previous call to fit is reused as initialization. With the default, False, the previous solution is erased. |
9 | random_state − int, RandomState instance or None, optional, default = None. The seed of the pseudo-random number generator that selects which feature to update. It is used only when selection = 'random'. |
10 | selection − str, default = 'cyclic'. Either 'cyclic' (loop over the features in order) or 'random' (update a randomly chosen coefficient at each step). |
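As a sketch of how several of these parameters are passed together, the snippet below builds a Lasso estimator with non-default settings. The toy data and the specific values are assumptions chosen for illustration, and `normalize` is deliberately left out because it is not accepted by recent scikit-learn releases. Since the two feature columns are identical, the way the weight is split between them is not unique, so only the sum of the coefficients is checked.

```python
from sklearn.linear_model import Lasso

# Hypothetical toy data: two identical feature columns.
X = [[0, 0], [1, 1], [2, 2]]
y = [0, 1, 2]

model = Lasso(
    alpha=0.1,           # weaker penalty than the default of 1.0
    fit_intercept=True,  # estimate a bias term
    max_iter=5000,       # upper bound on coordinate descent iterations
    tol=1e-6,            # tighter stopping tolerance
    selection="random",  # update a randomly chosen coefficient each step
    random_state=0,      # seed used by selection="random"
)
model.fit(X, y)

coef_sum = float(model.coef_.sum())
intercept = float(model.intercept_)
print(model.coef_, intercept)
```

Fixing `random_state` makes the random selection order reproducible between runs.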
Attributes
The following table lists the attributes exposed by the Lasso module −
Sr.No | Attributes & Description |
---|---|
1 | coef_ − array, shape (n_features,) or (n_targets, n_features). The weight vector(s). |
2 | intercept_ − float or array, shape (n_targets,). The independent term in the decision function. |
3 | n_iter_ − int or array-like, shape (n_targets,). The number of iterations run by the coordinate descent solver to reach the specified tolerance. |
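The two shapes listed for coef_ and intercept_ depend on whether the target is one-dimensional or has several columns. The snippet below is a small sketch on made-up data that fits both cases and inspects the resulting shapes; the variable names `single` and `multi` are illustrative.

```python
from sklearn.linear_model import Lasso

X = [[0, 0], [1, 1], [2, 2]]

# One target: y is a flat list of n_samples values.
single = Lasso(alpha=0.5).fit(X, [0, 1, 2])
print(single.coef_.shape)       # shape (n_features,)

# Two targets: each row of Y holds one value per target.
Y = [[0, 0], [1, 2], [2, 4]]
multi = Lasso(alpha=0.5).fit(X, Y)
print(multi.coef_.shape)        # shape (n_targets, n_features)
print(multi.intercept_.shape)   # shape (n_targets,)
```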
Implementation Example
The following Python script fits a Lasso model, which uses coordinate descent as the algorithm to fit the coefficients −
from sklearn import linear_model
# alpha = 0.5 sets the strength of the L1 penalty
Lreg = linear_model.Lasso(alpha = 0.5)
# fit on three samples with two features each
Lreg.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])
Output
Lasso(alpha = 0.5, copy_X = True, fit_intercept = True, max_iter = 1000, normalize = False, positive = False, precompute = False, random_state = None, selection = 'cyclic', tol = 0.0001, warm_start = False)
Example
Now, once fitted, the model can predict new values as follows −
Lreg.predict([[0,1]])
Output
array([0.75])
Example
For the above example, we can get the weight vector with the following Python script −
Lreg.coef_
Output
array([0.25, 0. ])
Example
Similarly, we can get the value of the intercept with the following Python script −
Lreg.intercept_
Output
0.75
Example
We can get the number of iterations the solver ran to reach the specified tolerance with the following Python script −
Lreg.n_iter_
Output
2
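To make the coordinate descent idea concrete, here is a minimal pure-Python sketch of cyclic coordinate descent with soft-thresholding. It assumes the objective (1 / (2 * n_samples)) * squared error + alpha * L1 norm, centres the data to handle the intercept, and stops when the largest coefficient update in a sweep falls below tol. The function names are hypothetical, and the stopping rule is a simplification: it does not check the dual gap, so this is not scikit-learn's actual implementation.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold, returning 0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, max_iter=1000, tol=1e-4):
    n_samples, n_features = len(X), len(X[0])
    # Centre X and y so the intercept can be recovered after the fit.
    x_means = [sum(row[j] for row in X) / n_samples for j in range(n_features)]
    y_mean = sum(y) / n_samples
    Xc = [[row[j] - x_means[j] for j in range(n_features)] for row in X]
    residual = [value - y_mean for value in y]  # residual for w = 0
    col_sq_norms = [sum(row[j] ** 2 for row in Xc) for j in range(n_features)]
    w = [0.0] * n_features
    sweeps = 0
    for sweeps in range(1, max_iter + 1):
        max_update = 0.0
        for j in range(n_features):
            if col_sq_norms[j] == 0.0:
                continue  # constant column carries no information
            old = w[j]
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(Xc[i][j] * residual[i] for i in range(n_samples))
            rho += old * col_sq_norms[j]
            new = soft_threshold(rho, alpha * n_samples) / col_sq_norms[j]
            if new != old:
                for i in range(n_samples):
                    residual[i] -= (new - old) * Xc[i][j]
                w[j] = new
            max_update = max(max_update, abs(new - old))
        if max_update < tol:
            break  # simplified stopping rule (no dual gap check)
    intercept = y_mean - sum(w_j * m for w_j, m in zip(w, x_means))
    return w, intercept, sweeps


weights, bias, n_sweeps = lasso_coordinate_descent(
    [[0, 0], [1, 1], [2, 2]], [0, 1, 2], alpha=0.5
)
print(weights, bias, n_sweeps)  # [0.25, 0.0] 0.75 2
```

On this toy data the sketch gives the same coefficients and intercept as the fitted model above. The second feature is an exact copy of the first, so once the first coefficient absorbs the signal, the soft-threshold step sets the second one to exactly zero.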
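We can change the parameter values to tune the model. For example, increasing alpha strengthens the penalty and drives more coefficients to exactly zero, while a smaller alpha moves the fit closer to ordinary least squares.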
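A quick way to see the effect of alpha is to refit the same toy data with several values and compare the coefficients. The snippet below is a sketch under that assumption; the alpha values and the `results` dictionary are chosen for illustration only.

```python
from sklearn.linear_model import Lasso

X = [[0, 0], [1, 1], [2, 2]]
y = [0, 1, 2]

results = {}
for alpha in (0.1, 0.5, 1.0):
    model = Lasso(alpha=alpha).fit(X, y)
    # Store plain Python values so they are easy to compare and print.
    results[alpha] = (model.coef_.tolist(), float(model.intercept_))
    print(alpha, results[alpha])
```

With the largest alpha in this sweep the penalty dominates, every coefficient is zero, and the model predicts the mean of y for any input.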