
CatBoost - Decision Trees
Decision trees are a core component of machine learning, used mainly for classification and regression. A decision tree works by partitioning the feature space into smaller regions according to a set of rules, producing a tree-like structure in which each internal node applies a feature-based decision and each leaf node yields an output label or value.
Boosting, on the other hand, is an ensemble learning technique that combines multiple weak learners, typically decision trees, into a single strong learner. Each new model is trained to correct the errors of the previous ones, which improves overall predictive performance.
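As a toy illustration of the boosting idea described above, the sketch below fits plain gradient boosting for squared error by hand: each shallow tree is trained on the residual errors of the ensemble built so far. The data is synthetic and the loop is a deliberate simplification, not CatBoost's exact algorithm.

```python
# Toy gradient boosting: each new tree fits the residuals of the ensemble so far.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
prediction = np.zeros_like(y)
for _ in range(50):
    residual = y - prediction                      # errors of the current ensemble
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    prediction += learning_rate * tree.predict(X)  # weak learner corrects the errors

print("Mean squared error:", np.mean((y - prediction) ** 2))
```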

Let us see how the tree grows in CatBoost −
Depth-Wise Tree Growth
Depth-wise tree growth is also known as level-wise or breadth-first growth. As the name suggests, it grows trees horizontally, level by level, until they reach a specified maximum depth. At each level, the algorithm examines all nodes in the tree and splits them to create the nodes of the next level.
Characteristics of Depth-wise Tree
Here are the characteristics of Depth-Wise tree −
Balanced Tree Growth: To keep the trees balanced, the algorithm grows them by splitting at each level (depth by depth). Each level of the tree is fully established before the next one is started.
Leaf Splitting: At each depth, CatBoost evaluates all candidate splits before choosing the best one. Unlike methods that split a single leaf at a time, it splits every leaf on the current level in the same iteration.
Efficient Handling of High Dimensions: Because the depth is capped, the number of candidate splits examined per tree stays bounded, which keeps training manageable even for datasets with many features.
Control Over Depth: The maximum depth is set explicitly (the depth parameter), giving direct control over model complexity and helping to prevent overfitting.
Parallelization: CatBoost's depth-wise tree structure allows efficient parallel processing, which speeds up computation, especially on large datasets. The sketch after this list shows how to select this growth policy.
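The sketch below shows how depth-wise growth can be selected when training a CatBoost model. The dataset is synthetic and the parameter values are illustrative only; it assumes the catboost and scikit-learn packages are installed.

```python
from catboost import CatBoostClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# grow_policy='Depthwise' grows each tree level by level; the default
# 'SymmetricTree' is a stricter variant that applies the same split to
# every node of a level. 'depth' caps the number of levels per tree.
model = CatBoostClassifier(
    iterations=200,
    depth=6,
    grow_policy='Depthwise',
    verbose=False
)
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))
```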
Leaf-Wise Tree Growth
Leaf-wise tree growth is also known as best-first or greedy growth. It expands the tree by splitting, at every step, the leaf and feature that give the best improvement. Because it always takes the single best split among all candidates, the result is a tree with deeper branches than depth-wise growth produces.
Characteristics of Leaf-Wise Tree
Here are some characteristics given for leaf-wise tree −
Instead of growing the tree level by level, the algorithm chooses the leaf with the highest error to split next. This method focuses the algorithm on areas with low accuracy and improves them first.
Unlike depth-wise growth, leaf-wise growth can lead to unbalanced trees. Some branches can go deeper than others, since the algorithm splits leaves based on where the most loss occurs.
By splitting the leaves that contribute most to the error, the method rapidly decreases the overall prediction error. Leaf-wise trees are therefore effective at fitting complex patterns in the data.
Because some branches can grow very deep, the tree may overfit the training set. Regularization techniques and hyperparameters such as max_depth or l2_leaf_reg can be used to limit tree depth and leaf values, thereby reducing overfitting; both appear in the sketch after this list.
The growth of the tree is not as balanced or predictable as with depth-wise trees, but this can make the model more flexible in some cases.
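CatBoost exposes leaf-wise growth through its 'Lossguide' policy. The sketch below is a minimal example on synthetic data with illustrative parameter values, including the max_depth and l2_leaf_reg controls mentioned above (max_leaves is only valid with the Lossguide policy).

```python
from catboost import CatBoostRegressor
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=1000, n_features=10, noise=0.3, random_state=0)

# 'Lossguide' always splits the leaf with the largest loss reduction, so
# some branches grow deep; max_depth and l2_leaf_reg keep that in check.
model = CatBoostRegressor(
    iterations=200,
    grow_policy='Lossguide',
    max_leaves=31,       # cap on the number of leaves per tree
    max_depth=8,         # cap on how deep any single branch may go
    l2_leaf_reg=3.0,     # L2 regularization on leaf values
    verbose=False
)
model.fit(X, y)
print("R^2 on training data:", model.score(X, y))
```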
Role of Decision Trees in CatBoost
Decision trees are a key component of many machine learning algorithms, including CatBoost. They are predictive models that use a tree-like graph to map decisions to possible outcomes. In the CatBoost framework, decision trees serve as the base learners, providing the structure on which the boosting process operates; by default CatBoost builds symmetric (oblivious) trees, in which the same split condition is applied across an entire level.
Using a technique called gradient boosting, CatBoost builds decision trees one after another, each new tree correcting the errors made by the previous ones. The "Cat" in CatBoost stands for "categorical": it is a gradient boosting implementation that handles categorical features better than earlier methods. The sketch below shows both ideas in a minimal example.
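The minimal sketch below brings both points together: a CatBoost classifier trained on a small made-up dataset with one categorical column, which is passed through the cat_features argument so that CatBoost encodes it internally instead of requiring manual one-hot encoding.

```python
import pandas as pd
from catboost import CatBoostClassifier

# A made-up dataset with one categorical and one numeric feature.
df = pd.DataFrame({
    'color': ['red', 'blue', 'green', 'blue', 'red', 'green'] * 50,
    'size':  [1.2, 3.4, 2.2, 0.5, 4.1, 2.8] * 50,
    'label': [0, 1, 0, 1, 0, 1] * 50
})

# cat_features marks 'color' as categorical; CatBoost handles the
# encoding itself as part of the boosting process.
model = CatBoostClassifier(iterations=100, depth=4, verbose=False)
model.fit(df[['color', 'size']], df['label'], cat_features=['color'])
print(model.predict(df[['color', 'size']].head()))
```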