# What are the approaches to Tree Pruning?

Pruning is a procedure that reduces the size of a decision tree. It lowers the risk of overfitting by limiting the size of the tree or by removing sections of the tree that provide little predictive power. Pruning helps by trimming branches that reflect anomalies in the training data caused by noise or outliers, and it simplifies the original tree in a way that improves its generalization performance.

Pruning methods generally use statistical measures to remove the least reliable branches, frequently resulting in faster classification and an improvement in the tree's ability to correctly classify independent test data.

There are two approaches to tree pruning which are as follows −

## Pre-pruning Approach

In the pre-pruning approach, a tree is “pruned” by halting its construction early (e.g., by deciding not to further split or partition the subset of training samples at a given node). Upon halting, the node becomes a leaf. The leaf may hold the most common class among the subset samples, or the probability distribution of those samples.

When building a tree, measures such as statistical significance, the chi-squared (χ²) statistic, information gain, etc., can be used to assess the goodness of a split. If partitioning the samples at a node would result in a split that falls below a pre-specified threshold, then further partitioning of the given subset is halted. There are problems in selecting an appropriate threshold: high thresholds can result in oversimplified trees, while low thresholds can result in very little simplification.
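As a minimal sketch, pre-pruning can be expressed with scikit-learn's `DecisionTreeClassifier`, whose `max_depth`, `min_samples_split`, and `min_impurity_decrease` parameters halt growth early; the dataset and threshold values below are illustrative assumptions, not prescriptions.

```python
# Pre-pruning sketch: growth is halted early by hyperparameters rather
# than cutting branches from a finished tree. Dataset and thresholds
# are illustrative.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Splits that do not decrease impurity by at least 0.01 are not made,
# nodes with fewer than 10 samples are not split, and no node is
# created below depth 3 -- each is a pre-pruning (early-halting) rule.
pruned = DecisionTreeClassifier(
    max_depth=3,
    min_samples_split=10,
    min_impurity_decrease=0.01,
    random_state=0,
)
pruned.fit(X_train, y_train)
print("depth:", pruned.get_depth())
print("test accuracy:", pruned.score(X_test, y_test))
```

Raising `min_impurity_decrease` corresponds to the "high threshold" case above and yields a smaller, possibly oversimplified tree; lowering it yields little simplification.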

## Post-pruning Approach

The post-pruning approach removes branches from a “fully grown” tree. A tree node is pruned by removing its branches. The cost complexity pruning algorithm is an instance of the post-pruning approach. The pruned node becomes a leaf and is labeled with the most common class among its former branches.

For each non-leaf node in the tree, the algorithm computes the expected error rate that would occur if the subtree at that node were pruned. Next, the expected error rate if the node were not pruned is computed from the error rates of its branches, combined by weighting according to the proportion of observations along each branch. If pruning the node leads to a higher expected error rate, the subtree is kept; otherwise, it is pruned.

After generating a set of progressively pruned trees, an independent test set can be used to estimate the accuracy of each tree. The decision tree that minimizes the expected error rate is preferred.
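The procedure above can be sketched with scikit-learn's cost complexity pruning support: `cost_complexity_pruning_path` enumerates the effective `ccp_alpha` values that produce the sequence of increasingly pruned trees, and a held-out test set selects among them. The dataset and selection loop are illustrative assumptions.

```python
# Post-pruning sketch (cost complexity pruning): grow a full tree,
# enumerate candidate pruning levels, and keep the pruned tree that
# performs best on independent test data.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Effective alphas for the sequence of increasingly pruned trees.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(
    X_train, y_train
)

# Fit one tree per alpha; the last alpha prunes down to a single node,
# so it is skipped. trees[0] (alpha = 0) is the fully grown tree.
trees = [
    DecisionTreeClassifier(random_state=0, ccp_alpha=a).fit(X_train, y_train)
    for a in path.ccp_alphas[:-1]
]

# Prefer the tree that minimizes error on the independent test set.
best = max(trees, key=lambda t: t.score(X_test, y_test))
print("nodes in best pruned tree:", best.tree_.node_count)
print("test accuracy:", best.score(X_test, y_test))
```

Larger `ccp_alpha` values penalize tree size more heavily, so the candidate trees range from the complete tree down to a single leaf.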