
XGBoost - Hyperparameters
In this chapter we discuss the subset of hyperparameters that are required or commonly used with the XGBoost algorithm. These parameters have been selected because they simplify the process of fitting model parameters to data. They are listed category-wise, covering both the hyperparameters that must be configured and those that are optional.
XGBoost Hyperparameters Categories
The hyperparameters have been divided into three main categories by the XGBoost creators −
General Parameters
Booster Parameters
Learning Task Parameters
Let us discuss these three categories of hyperparameters in the sections below −
General Parameters
The general parameters define the overall functionality and working of the XGBoost model. Here is the list of parameters that come under this category −
booster [default=gbtree]: Selects the type of model to run at each iteration. It offers two options - gbtree: tree-based models and gblinear: linear models.
silent [default=0]: Sets the model in silent mode. When set to 1, no running messages are printed. It is usually better to keep it at 0, because the messages help in understanding the model. (Newer XGBoost releases replace silent with a verbosity parameter.)
nthread [default is the maximum number of threads available]: Controls parallel processing; set it to the number of cores you want to use. To run on all cores, leave it unset and the algorithm will detect the core count automatically.
There are two other parameters that XGBoost sets automatically, so you do not need to worry about them.
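As a quick illustration, these general parameters are passed to xgb.train inside a single dictionary. The snippet below is a minimal sketch on synthetic data; the dataset and the parameter values are assumptions for illustration, not recommendations.

```python
import numpy as np
import xgboost as xgb

# Synthetic binary-classification data (illustrative only)
X = np.random.rand(100, 5)
y = np.random.randint(0, 2, size=100)
dtrain = xgb.DMatrix(X, label=y)

params = {
    "booster": "gbtree",   # tree-based model; use "gblinear" for a linear model
    "nthread": 4,          # number of cores to use; omit to let XGBoost detect it
    "verbosity": 1,        # newer releases use verbosity in place of silent
    "objective": "binary:logistic",
}

model = xgb.train(params, dtrain, num_boost_round=10)
```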
Booster Parameters
Although there are two types of boosters, here we will only discuss the tree booster, because it is used far more often than the linear booster and consistently performs better. The most important tree-booster parameters are listed in the table below, followed by a short usage sketch.
| Parameter | Description | Typical Values |
| --- | --- | --- |
| eta | The learning rate; controls how much the model changes after each boosting step. | 0.01-0.2 |
| min_child_weight | The minimum sum of instance weights required in a child node. | Tune with cross-validation |
| max_depth | The maximum depth of a tree; deeper trees are more prone to overfitting. | 3-10 |
| max_leaf_nodes | The maximum number of terminal nodes (leaves) a tree can have. | |
| gamma | The minimum loss reduction required to split a node. | Tune based on loss function |
| max_delta_step | Limits how much each tree's weight estimate can change. | Usually not needed |
| subsample | The fraction of observations sampled to grow each tree. | 0.5-1 |
| colsample_bytree | The fraction of columns (features) randomly sampled for each tree. | 0.5-1 |
| colsample_bylevel | The fraction of columns sampled for each split at every tree level. | Usually not used |
| lambda | L2 regularization on weights (analogous to Ridge regression); helps reduce overfitting. | Tune to reduce overfitting |
| alpha | L1 regularization on weights (analogous to Lasso regression); useful for models with many features. | Good for high-dimensional data |
| scale_pos_weight | Balances the classes for imbalanced data, which helps the model converge faster. | > 0 (for imbalanced data) |
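The tree-booster parameters above go into the same params dictionary. The snippet below is a minimal sketch; the values shown are common starting points, not tuned settings.

```python
import numpy as np
import xgboost as xgb

# Synthetic data, used only to make the example runnable
X = np.random.rand(200, 8)
y = np.random.randint(0, 2, size=200)
dtrain = xgb.DMatrix(X, label=y)

params = {
    "objective": "binary:logistic",
    "eta": 0.1,                # learning rate
    "max_depth": 6,            # maximum depth of a tree
    "min_child_weight": 1,     # minimum sum of instance weight needed in a child
    "gamma": 0,                # minimum loss reduction required to split
    "subsample": 0.8,          # fraction of rows sampled per tree
    "colsample_bytree": 0.8,   # fraction of columns sampled per tree
    "lambda": 1,               # L2 regularization
    "alpha": 0,                # L1 regularization
    "scale_pos_weight": 1,     # increase for imbalanced classes
}

model = xgb.train(params, dtrain, num_boost_round=50)
```

In practice these values are usually tuned with cross-validation rather than set by hand.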
Learning Task Parameters
The learning task parameters define the goal of optimization and the metric to be calculated at each step.
objective [default=reg:linear] (renamed to reg:squarederror in newer XGBoost releases)
It defines the loss function to be minimized. The most commonly used values are as follows −
binary:logistic - Used for binary classification, where there are two classes. It returns the predicted probability rather than the class itself.
multi:softmax - Used for multiclass classification. It returns the predicted class rather than the probabilities. You also need to set the additional option num_class to tell the model how many unique classes there are.
multi:softprob - Same as softmax, except that it returns the probability of each possible class for a data point instead of only the predicted class.
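For example, a multiclass objective must be accompanied by num_class. The sketch below uses the Iris dataset from scikit-learn purely for illustration.

```python
import xgboost as xgb
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)        # 3 classes
dtrain = xgb.DMatrix(X, label=y)

params = {
    "objective": "multi:softmax",        # returns the predicted class
    "num_class": 3,                      # required for multiclass objectives
}
model = xgb.train(params, dtrain, num_boost_round=20)
pred_classes = model.predict(dtrain)     # class labels 0, 1 or 2

# multi:softprob returns one probability per class instead
params["objective"] = "multi:softprob"
model = xgb.train(params, dtrain, num_boost_round=20)
pred_probs = model.predict(dtrain)       # shape: (n_samples, num_class)
```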
eval_metric [default according to objective]
It is the metric used to evaluate the validation data. The default is rmse for regression and error for classification (see the sketch at the end of this section).
The typical values are as follows −
rmse: root mean square error
mae: mean absolute error
logloss: negative log-likelihood
error: Binary classification error rate (0.5 threshold)
merror: Multiclass classification error rate
mlogloss: Multiclass logloss
auc: Area under the curve
seed [default=0]
It is the random number seed, used to generate reproducible results and for parameter tuning.
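A minimal sketch of how eval_metric and seed are used together: the chosen metric is reported for every dataset passed through the evals argument of xgb.train. The synthetic data and the train/validation split are assumptions for illustration.

```python
import numpy as np
import xgboost as xgb

X = np.random.rand(300, 6)
y = np.random.randint(0, 2, size=300)
dtrain = xgb.DMatrix(X[:200], label=y[:200])
dvalid = xgb.DMatrix(X[200:], label=y[200:])

params = {
    "objective": "binary:logistic",
    "eval_metric": "auc",   # overrides the default metric for this objective
    "seed": 0,              # fixed seed for reproducible results
}

# The metric is printed for each dataset in evals at every boosting round
model = xgb.train(
    params,
    dtrain,
    num_boost_round=20,
    evals=[(dtrain, "train"), (dvalid, "valid")],
)
```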
If you have been using Scikit-Learn until now, these parameter names may not look familiar. However, the Python xgboost package provides a sklearn wrapper, XGBClassifier, which follows the sklearn naming conventions. The parameter names that change are:
eta -> learning_rate
lambda -> reg_lambda
alpha -> reg_alpha
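The same settings can then be expressed through the sklearn-style wrapper. A minimal sketch, with illustrative values only:

```python
from xgboost import XGBClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# eta, lambda and alpha become learning_rate, reg_lambda and reg_alpha
clf = XGBClassifier(
    learning_rate=0.1,
    reg_lambda=1.0,
    reg_alpha=0.0,
    max_depth=6,
    n_estimators=100,
)
clf.fit(X_train, y_train)
print("Accuracy:", clf.score(X_test, y_test))
```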