How to calculate the prediction accuracy of logistic regression?
Logistic regression is a statistical method for modeling the relationship between a dependent variable and one or more independent variables. It is a form of regression analysis frequently used for classification tasks in which the dependent variable is binary (i.e., takes only two values). Its aim is to estimate how the independent variables affect the probability that the dependent variable takes a particular value.
Because it lets us estimate the probability of an event from the values of the independent variables, logistic regression is an important tool in data analysis and machine learning. It is widely used in fields where predicting outcomes is essential, including healthcare, finance, and marketing.
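To make the idea of "probability of an event" concrete, here is a minimal sketch of how a logistic regression model turns feature values into a probability and then into a class label. The weights and bias below are hypothetical values chosen only for illustration, not coefficients fitted to any dataset, and the 0.5 cut-off is the conventional default rather than a requirement.

```python
import math

def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    """Estimated probability of the positive class for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical coefficients for illustration only (not fitted to data)
weights = [0.8, -0.5]
bias = 0.1

p = predict_proba([2.0, 1.0], weights, bias)
label = 1 if p >= 0.5 else 0  # conventional default threshold of 0.5
print("Probability:", round(p, 3), "Predicted class:", label)
```

The predicted class, not the probability itself, is what gets compared with the true label when accuracy is calculated.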
Prediction accuracy is a key measure of a logistic regression model's performance. The accuracy score is the proportion of predictions that were correct out of the total number of predictions made. A higher score means the model classified more samples correctly; a lower score means it made more errors. In this post, we'll look at how to calculate the prediction accuracy of logistic regression.
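As a small illustration of that definition, the sketch below computes accuracy by hand as the number of correct predictions divided by the total number of predictions. The helper name accuracy and the two label lists are made up for this example.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Made-up labels for illustration: 8 of the 10 predictions match
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]

acc = accuracy(y_true, y_pred)
print("Accuracy:", acc)  # 8 correct out of 10 -> 0.8
```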
Calculating Prediction Accuracy of Logistic Regression
The following Python example uses the scikit-learn library to calculate the prediction accuracy of a logistic regression model on a real dataset. It follows these steps −
Import the necessary modules from sklearn.
Load the dataset.
Split the data into training and testing sets.
Create a logistic regression model and fit it to the training data.
Make predictions on the test set and calculate their accuracy.
In this example, we first load the breast cancer dataset with scikit-learn's load_breast_cancer function. We then use the train_test_split function to divide the dataset into training and testing sets, holding out 30% of the samples for testing. Next, we create a model with the LogisticRegression class and fit it to the training data using the fit method. We generate predictions for the testing data with the predict method and pass them, together with the true test labels, to the accuracy_score function to obtain the prediction accuracy. Lastly, we print the prediction accuracy to the console.
Example
# Import necessary libraries
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load the breast cancer dataset
data = load_breast_cancer()

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.3, random_state=42)

# Create a logistic regression model
lr = LogisticRegression()

# Fit the model on the training data
lr.fit(X_train, y_train)

# Make predictions on the testing data
y_pred = lr.predict(X_test)

# Calculate the prediction accuracy
accuracy = accuracy_score(y_test, y_pred)

# Print the prediction accuracy
print("Prediction Accuracy:", accuracy)
Output
Prediction Accuracy: 0.9707602339181286
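As a cross-check, the same accuracy can be obtained in two other ways: by averaging the element-wise comparison of predictions and true labels with NumPy, or by calling the model's score method, which returns mean accuracy for scikit-learn classifiers. The sketch below repeats the setup so it runs on its own. Setting max_iter=10000 is an assumption made here to give the solver more room to converge on the unscaled features; it is not part of the example above, so the printed value may differ slightly from the output shown.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Same dataset and split as in the example above
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42
)

# Assumption: a larger max_iter than the default, to help the solver converge
lr = LogisticRegression(max_iter=10000)
lr.fit(X_train, y_train)
y_pred = lr.predict(X_test)

# Three equivalent ways to compute accuracy on the test set
acc_metric = accuracy_score(y_test, y_pred)   # metrics helper
acc_manual = np.mean(y_pred == y_test)        # correct / total by hand
acc_score = lr.score(X_test, y_test)          # classifier's built-in score

print("accuracy_score:", acc_metric)
print("manual mean:   ", acc_manual)
print("lr.score:      ", acc_score)
```

All three values should agree, since each one is the fraction of test samples whose predicted label matches the true label.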
Conclusion
In conclusion, prediction accuracy is a key measure of how well a logistic regression model performs. The accuracy score is the proportion of the model's predictions that were correct, so a higher score indicates more correct predictions and a lower score indicates fewer.