Scikit Learn - Bayesian Ridge Regression

Bayesian regression provides a natural mechanism for coping with insufficient or poorly distributed data by formulating linear regression with probability distributions rather than point estimates. The output or response ‘y’ is assumed to be drawn from a probability distribution rather than estimated as a single value.

Mathematically, to obtain a fully probabilistic model, the response y is assumed to be Gaussian distributed around $Xw$, with α the precision (inverse variance) of the noise, as follows −

$$p\left(y \mid X,w,\alpha\right)=\mathcal{N}\left(y \mid Xw,\alpha^{-1}\right)$$

One of the most useful types of Bayesian regression is Bayesian Ridge regression, which estimates a probabilistic model of the regression problem. Here the prior for the coefficient vector w is a spherical Gaussian with precision λ, as follows −

$$p\left(w \mid \lambda\right)=\mathcal{N}\left(w \mid 0,\lambda^{-1}I_{p}\right)$$

The resulting model is called Bayesian Ridge Regression. In scikit-learn, it is implemented by the sklearn.linear_model.BayesianRidge class.
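For fixed values of α and λ, a Gaussian likelihood with noise precision α combined with the Gaussian prior above gives a Gaussian posterior over w. The following is the standard conjugate result, given as a sketch in this section's notation; the symbols $\mu$ and $\Sigma$ are introduced here for illustration, and the intercept is ignored for simplicity.

$$p\left(w \mid y,X,\alpha,\lambda\right)=\mathcal{N}\left(w \mid \mu,\Sigma\right),\qquad \Sigma=\left(\lambda I_{p}+\alpha X^{T}X\right)^{-1},\qquad \mu=\alpha\,\Sigma X^{T}y$$

A large λ pulls $\mu$ toward zero (stronger regularization), while a large α lets the data term $\alpha X^{T}X$ dominate.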

Parameters

The following table lists the parameters accepted by BayesianRidge −

Sr.No Parameter & Description
1

n_iter − int, optional

It represents the maximum number of iterations. The default value is 300, and a user-defined value must be greater than or equal to 1. From scikit-learn 1.3 onward this parameter is named max_iter, and n_iter was later removed.

2

fit_intercept − Boolean, optional, default=True

It decides whether to calculate the intercept for this model. If set to false, no intercept is used in the calculations.

3

tol − float, optional, default=1.e-3

It is the convergence tolerance: the algorithm stops once the change in w between two iterations falls below this value.

4

alpha_1 − float, optional, default=1.e-6

It is the 1st hyperparameter which is a shape parameter for the Gamma distribution prior over the alpha parameter.

5

alpha_2 − float, optional, default=1.e-6

It is the 2nd hyperparameter which is an inverse scale parameter for the Gamma distribution prior over the alpha parameter.

6

lambda_1 − float, optional, default=1.e-6

It is the 1st hyperparameter which is a shape parameter for the Gamma distribution prior over the lambda parameter.

7

lambda_2 − float, optional, default=1.e-6

It is the 2nd hyperparameter which is an inverse scale parameter for the Gamma distribution prior over the lambda parameter.

8

copy_X − Boolean, optional, default = True

By default, it is true which means X will be copied. But if it is set to false, X may be overwritten.

9

compute_score − boolean, optional, default=False

If set to true, it computes the log marginal likelihood at each iteration of the optimization.

10

verbose − Boolean, optional, default=False

By default, it is false. If set to true, verbose mode is enabled while fitting the model.
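As a minimal sketch of how these parameters are passed, the estimator can be constructed with explicit values. The numbers below are arbitrary illustrations, not recommendations, and the iteration-limit parameter is left out because its name differs between scikit-learn versions.

```python
from sklearn.linear_model import BayesianRidge

# Hyperparameter values here are illustrative assumptions only.
model = BayesianRidge(
    tol=1e-4,            # stricter convergence tolerance than the default 1e-3
    alpha_1=1e-5,        # shape of the Gamma prior over alpha (noise precision)
    alpha_2=1e-5,        # inverse scale of the Gamma prior over alpha
    lambda_1=1e-5,       # shape of the Gamma prior over lambda (weight precision)
    lambda_2=1e-5,       # inverse scale of the Gamma prior over lambda
    fit_intercept=True,  # estimate an intercept term
    compute_score=True,  # record the log marginal likelihood while fitting
)

# get_params() returns the settings the estimator will use when fit() is called.
params = model.get_params()
print(params["tol"], params["compute_score"])
```

Nothing is estimated at this point; the values only take effect when fit is called.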

Attributes

The following table lists the attributes of a fitted BayesianRidge model −

Sr.No Attributes & Description
1

coef_ − array, shape = (n_features,)

This attribute provides the coefficients (weights) of the regression model.

2

intercept_ − float

It represents the independent term (intercept) in the decision function.

3

alpha_ − float

This attribute provides the estimated precision of the noise.

4

lambda_ − float

This attribute provides the estimated precision of the weights.

5

n_iter_ − int

It provides the actual number of iterations taken by the algorithm to reach the stopping criterion.

6

sigma_ − array, shape = (n_features, n_features)

It provides the estimated variance-covariance matrix of the weights.

7

scores_ − array, shape = (n_iter_ + 1,)

If compute_score is true, it provides the value of the log marginal likelihood at each iteration of the optimisation. The array starts with the value obtained for the initial values of $\alpha$ and $\lambda$, and ends with the value obtained for the estimated $\alpha$ and $\lambda$.
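The attributes listed above can be inspected after fitting. The sketch below uses synthetic data, which is an assumption made purely for illustration: two features, true weights 2 and -1, an intercept of 1, and Gaussian noise with standard deviation 0.1.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

# Synthetic data (illustrative assumption): y = 1 + 2*x0 - 1*x1 + noise
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 2))
true_w = np.array([2.0, -1.0])
y = 1.0 + X @ true_w + rng.normal(scale=0.1, size=100)

# compute_score=True is needed for scores_ to be populated
reg = BayesianRidge(compute_score=True)
reg.fit(X, y)

print("coef_      :", reg.coef_)         # estimated weights
print("intercept_ :", reg.intercept_)    # estimated intercept
print("alpha_     :", reg.alpha_)        # estimated noise precision
print("lambda_    :", reg.lambda_)       # estimated weight precision
print("n_iter_    :", reg.n_iter_)       # iterations actually run
print("sigma_     :", reg.sigma_.shape)  # covariance matrix of the weights
print("scores_    :", len(reg.scores_))  # log marginal likelihood values
```

Because alpha_ is a precision, a noise standard deviation of 0.1 corresponds to a true precision of 1 / 0.1² = 100, which gives a rough reference point for the estimate.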

Implementation Example

The following Python script provides a simple example of fitting a Bayesian Ridge Regression model using the sklearn BayesianRidge class.

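from sklearn import linear_model
# Toy training data: four samples, each with two identical features
X = [[0, 0], [1, 1], [2, 2], [3, 3]]
Y = [0, 1, 2, 3]
# Create the estimator with default hyperparameters and fit it
BayReg = linear_model.BayesianRidge()
BayReg.fit(X, Y)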


Output

BayesianRidge(alpha_1 = 1e-06, alpha_2 = 1e-06, compute_score = False, copy_X = True,
fit_intercept = True, lambda_1 = 1e-06, lambda_2 = 1e-06, n_iter = 300,
normalize = False, tol=0.001, verbose = False)


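From the above output, we can check the model’s parameters used in the calculation. The exact representation depends on the scikit-learn version; recent releases print only the parameters that differ from their defaults, so the same call may display simply BayesianRidge().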

Example

Now, once fitted, the model can predict new values as follows −

BayReg.predict([[1,1]])


Output

array([1.00000007])
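Because the model is probabilistic, predict can also report the uncertainty of each prediction through its return_std argument. The sketch below refits the same toy data so that it runs on its own; the query points are arbitrary choices for illustration.

```python
import numpy as np
from sklearn import linear_model

# Same toy data as in the example above
X = [[0, 0], [1, 1], [2, 2], [3, 3]]
Y = [0, 1, 2, 3]
BayReg = linear_model.BayesianRidge()
BayReg.fit(X, Y)

# Query points chosen for illustration: one inside the training range,
# one far outside it
X_new = np.array([[1, 1], [10, 10]])

# return_std=True also returns the standard deviation of the
# predictive distribution at each query point
y_mean, y_std = BayReg.predict(X_new, return_std=True)
print(y_mean)
print(y_std)
```

The standard deviations are small here because the toy data is fitted almost perfectly; on noisy data they give a rough measure of how far each prediction can be trusted.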


Example

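Similarly, we can access the coefficients w of the model as follows. Because the two columns of X are identical, the model splits the weight evenly between them, so each coefficient is close to 0.5 −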

BayReg.coef_


Output

array([0.49999993, 0.49999993])