What is the difference between Bagging and Boosting?


Bagging

Bagging is also known as bootstrap aggregation. It is an ensemble learning method that is generally used to reduce variance within a noisy dataset. In bagging, random samples of the training data are selected with replacement, meaning that individual data points can be selected more than once.
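The following is a minimal sketch of the bootstrap sampling step described above, using NumPy on a hypothetical toy dataset of ten points. Note how some points can appear more than once in the sample while others may be left out.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
data = np.arange(10)  # hypothetical toy training set of 10 data points

# Draw a bootstrap sample: same size as the original, sampled with
# replacement, so individual points may appear more than once.
bootstrap_sample = rng.choice(data, size=len(data), replace=True)
print(bootstrap_sample)
```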

After several data samples are generated, a weak model is trained independently on each sample. The predictions are then combined according to the task: for regression, averaging the predictions yields a more accurate estimate, while for classification the models typically vote on the final label. A sketch of this workflow is shown below.
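As a sketch of this workflow, the example below uses scikit-learn's BaggingClassifier on hypothetical synthetic data; by default it uses decision trees as the weak learners, and the exact dataset and parameters here are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

# Hypothetical toy classification data.
X, y = make_classification(n_samples=500, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Each of the 50 trees is trained on a bootstrap sample drawn with
# replacement; their predictions are combined by majority vote.
bagging = BaggingClassifier(n_estimators=50, random_state=42)
bagging.fit(X_train, y_train)
print("Bagging accuracy:", bagging.score(X_test, y_test))
```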

Random Forest is an extension of bagging. It takes one more step: in addition to training each tree on a random subset of records, it also uses a random selection of features at each split instead of considering all features. A collection of such randomized trees is known as a Random Forest, as sketched below.
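The sketch below shows this idea with scikit-learn's RandomForestClassifier on the same kind of hypothetical data; max_features="sqrt" is the assumed setting that limits each split to a random subset of features.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical toy classification data.
X, y = make_classification(n_samples=500, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Besides bootstrapping the rows, each split considers only a random subset
# of features (max_features="sqrt"), which is what distinguishes a Random
# Forest from plain bagging of trees.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                random_state=42)
forest.fit(X_train, y_train)
print("Random Forest accuracy:", forest.score(X_test, y_test))
```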

Bagging has also been leveraged with deep learning models in the financial sector, automating critical functions such as fraud detection, credit risk computation, and option pricing.

Research has demonstrated how bagging across several machine learning techniques can be used to estimate loan default risk, and how it helps minimize risk by preventing credit card fraud in banking and financial institutions.

Boosting

Boosting is another ensemble process for creating a set of predictors. In other words, it fits trees sequentially, and at every stage the objective is to correct the net error of the previous trees.

Boosting is generally used to reduce bias and variance in supervised learning. It describes a family of algorithms that convert weak learners (base learners) into strong learners. A weak learner is a classifier that agrees with the actual classification only slightly better than chance, while a strong learner is a classifier that is well correlated with the actual classification. A sketch of this sequential reweighting is shown below.
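As a sketch of boosting, the example below uses scikit-learn's AdaBoostClassifier on hypothetical synthetic data; the dataset and parameter choices are illustrative assumptions rather than a prescribed setup.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Hypothetical toy classification data.
X, y = make_classification(n_samples=500, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# AdaBoost fits weak learners sequentially; after each round it increases the
# weight of misclassified samples so the next learner focuses on correcting
# the errors of the previous ones.
boosting = AdaBoostClassifier(n_estimators=50, random_state=42)
boosting.fit(X_train, y_train)
print("Boosting accuracy:", boosting.score(X_test, y_test))
```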

Let us see the comparison between Bagging and Boosting.

Bagging | Boosting
The objective is to decrease variance, not bias. | The objective is to decrease bias, not variance.
Each model is built independently. | Each new model is influenced by the performance of the previously built models.
It is the simplest way of combining predictions of the same type. | It is a way of combining predictions that belong to different types.
Bagging tries to tackle the over-fitting problem. | Boosting tries to reduce bias.
Several training subsets are drawn at random, with replacement, from the whole training dataset. | Each new subset emphasizes the examples that were misclassified by previous models.
Bagging can help solve the over-fitting problem. | Boosting can aggravate the over-fitting problem.
