Why Is Logistic Regression a Classification Algorithm in Machine Learning?


Introduction

Logistic Regression is a classification algorithm commonly used in machine learning for binary classification. Although the term "Regression" is in its name, it is in fact used as a classification algorithm. It models the log odds of the positive class and uses log loss (cross-entropy loss) as the cost function.

In this article, let us see why Logistic Regression is, by nature, a classification algorithm.

Logistic Regression as a Classification Algorithm

A Linear Regression model can be represented by a linear equation. A univariate regression model with intercept $\mathrm{\alpha_{0}}$ and slope $\mathrm{\alpha_{1}}$ can be written as

$$\mathrm{y=\alpha_{0}+\alpha_{1}x}$$

[Figure: line of best fit for linear regression]

However, Logistic Regression outputs values only between 0 and 1, whereas Linear Regression outputs continuous values that extend beyond 0 and 1 along the line of best fit.
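To make this concrete, here is a minimal sketch that fits the line $\mathrm{y=\alpha_{0}+\alpha_{1}x}$ to 0/1 labels by ordinary least squares and evaluates it at a few points. The toy data and the helper name `fit_line` are assumptions made for illustration only.

```python
# Toy data (assumed for illustration): one feature and a 0/1 label.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 0, 1, 1, 1]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a0 + a1 * x (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a1 = sxy / sxx               # slope
    a0 = mean_y - a1 * mean_x    # intercept
    return a0, a1


a0, a1 = fit_line(xs, ys)

# Predictions of the straight line are not confined to [0, 1].
for x in [-2.0, 3.5, 10.0]:
    print(x, a0 + a1 * x)
```

For inputs far from the training data, the fitted line returns values below 0 or above 1, so its output cannot be read directly as a probability.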

This shows that the linear regression equation has to be transformed so that its output falls in the range [0, 1]. Logistic regression accomplishes this with the sigmoid function, which squashes any real value into a value between 0 and 1.

$$\mathrm{p(x)=\frac{1}{1+e^{-(\alpha_{0}+\alpha_{1}x)}}}$$
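The transformation can be sketched in a few lines of Python. The coefficient values and the function names `predict_proba` and `predict_class` are hypothetical, chosen only to illustrate the formula; the 0.5 threshold is a common default rather than a requirement.

```python
import math


def sigmoid(t):
    """Logistic function 1 / (1 + e^(-t)), written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def predict_proba(x, a0, a1):
    """p(x): the linear score a0 + a1 * x passed through the sigmoid."""
    return sigmoid(a0 + a1 * x)


def predict_class(x, a0, a1, threshold=0.5):
    """Turn the probability into a class label using a threshold (assumed 0.5)."""
    return 1 if predict_proba(x, a0, a1) >= threshold else 0


# Hypothetical coefficients; the decision boundary sits where a0 + a1 * x = 0.
a0, a1 = -7.0, 2.0
for x in [-5.0, 0.0, 3.5, 10.0]:
    print(x, predict_proba(x, a0, a1), predict_class(x, a0, a1))
```

With these assumed coefficients the linear score is zero at x = 3.5, so the probability there is 0.5 and inputs on either side are assigned to different classes.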

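Hence, by treating p(x) as the probability of one class and comparing it against a threshold (commonly 0.5), we can model two classes using logistic regression.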

The logistic function, or sigmoid function, is $\mathrm{\frac{1}{1+e^{-t}}}$. The log odds (logit) function is its inverse. The linear regression equation can output any real value from negative to positive infinity, so its output is not confined to the range [0, 1]. Once the linear equation is passed through the sigmoid function, the output lies between 0 and 1 and can be read as a probability. This is why logistic regression is used for classification rather than for predicting continuous targets. In other words, Logistic Regression classifies the output of a linear equation by comparing the resulting probability against a decision boundary.
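The inverse relationship between the log odds and the logistic function can be checked numerically. The following is a small sketch; the function names `sigmoid` and `logit` are chosen here for illustration.

```python
import math


def sigmoid(t):
    """Logistic function: maps any real t to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-t))


def logit(p):
    """Log odds: maps a probability p in (0, 1) back to a real value."""
    return math.log(p / (1.0 - p))


# Applying logit after sigmoid recovers the original value (up to rounding).
for t in [-3.0, 0.0, 2.5]:
    p = sigmoid(t)
    print(t, p, logit(p))
```

Because the two functions undo each other, saying that the log odds are linear in x is equivalent to saying that the probability is the sigmoid of a linear function of x.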

Multiclass Classification with Logistic Regression

Logistic regression can also be used for multiclass classification. The central idea is the same as in the binary case, but for multiclass problems we use one-vs-all classification: a separate binary logistic regression model is trained for each class against all the others, and the class whose model gives the highest probability is chosen. Each model can also use multiple independent variables, as in the equation below.

$$\mathrm{\log\frac{p}{1-p}=\alpha_{0}+\alpha_{1}x_{1}+\alpha_{2}x_{2}+\alpha_{3}x_{3}+\cdots+\alpha_{n}x_{n}}$$

In the above equation, we see that the log odds depend on n independent variables.
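A one-vs-all prediction step might look like the sketch below. It assumes the per-class models have already been trained; the class labels, coefficient values, and the names `log_odds` and `predict_one_vs_all` are all hypothetical and serve only to show the mechanics.

```python
import math


def sigmoid(t):
    """Logistic function 1 / (1 + e^(-t))."""
    return 1.0 / (1.0 + math.exp(-t))


def log_odds(features, intercept, weights):
    """alpha_0 + alpha_1 * x_1 + ... + alpha_n * x_n for one binary model."""
    return intercept + sum(w * x for w, x in zip(weights, features))


# Hypothetical, already-trained coefficients: one binary model per class,
# stored as (intercept, [weight per feature]).
models = {
    "A": (0.5, [1.0, -1.0]),
    "B": (-0.5, [-1.0, 1.0]),
    "C": (-1.0, [0.5, 0.5]),
}


def predict_one_vs_all(features, models):
    """Score the input with every binary model and pick the highest probability."""
    scores = {
        label: sigmoid(log_odds(features, intercept, weights))
        for label, (intercept, weights) in models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


label, scores = predict_one_vs_all([2.0, 0.0], models)
print(label, scores)
```

Note that the per-class probabilities come from independent binary models, so they need not sum to 1; only their ranking is used to choose the class.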

Conclusion

Logistic Regression is a classification algorithm: it outputs probability values between 0 and 1 instead of unbounded continuous values. This is due to the sigmoid function transformation applied to the linear regression equation. Logistic regression can be used for both binary and multiclass classification.

Updated on: 27-Aug-2023
