
XGBoost - Tuning with Hyperparameters
In this chapter, we will discuss the important task of tuning the hyperparameters of an XGBoost model. Hyperparameters are settings that control how an algorithm learns. As we have already seen in the previous chapter, XGBoost offers a wide range of hyperparameters, and by adjusting them we can get the best possible performance out of the model. XGBoost is known for its ability to automatically fit thousands of learnable parameters in order to find patterns and regularities in the data.
In tree-based models like XGBoost, the learnable parameters are the decision variables selected at each node, so a larger number of design decisions leads to a larger number of parameters. Hyperparameters, in contrast, are the settings the algorithm is trained with; they are chosen before training and remain fixed while the algorithm learns.
Hyperparameters in tree-based models include the maximum tree depth, the number of trees to grow, the number of variables considered when building each tree, the minimum number of samples in a leaf, and the fraction of observations used to build each tree. Although the focus of this chapter is tuning XGBoost hyperparameters, the techniques covered here apply to any other advanced ML method as well.
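To make the distinction concrete, here is a minimal sketch (values chosen purely for illustration, not taken from this tutorial) of how such hyperparameters are passed to XGBoost's scikit-learn wrapper. They are fixed before training starts, while the split decisions inside each tree are learned from the data.
# Illustrative only: common XGBoost hyperparameters set explicitly
import xgboost as xgb

model = xgb.XGBClassifier(
   n_estimators=100,       # number of trees to grow
   max_depth=6,            # maximum tree depth
   subsample=0.8,          # fraction of observations used to build each tree
   colsample_bytree=0.8,   # fraction of variables considered per tree
   min_child_weight=1,     # minimum weight required in a leaf (controls leaf size)
   learning_rate=0.1       # shrinkage applied to each tree's contribution
)
# These settings stay fixed during training; only the split decisions
# inside each tree are learned from the data.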
Tuning XGBoost with Hyperparameters
Now we will see how we can tune our XGBoost model with the help of hyperparameters −
1. Import libraries
First, import all the necessary libraries as shown in the code below −
# Import pandas for handling data
import pandas as pd

# Import numpy for scientific calculations
import numpy as np

# Import XGBoost for machine learning
import xgboost as xgb
from sklearn.metrics import accuracy_score

# Import libraries for tuning hyperparameters
from hyperopt import STATUS_OK, Trials, fmin, hp, tpe
2. Read dataset
Now we will read our dataset. Here we are using the Wholesale customers data.csv dataset.
data = '/Python/Wholesale customers data.csv'
df = pd.read_csv(data)
3. Declare feature vector and target variables
Next, declare the feature vector and the target variable −
X = df.drop('Channel', axis=1)
y = df['Channel']
Now let us take a look at the feature vector (X) and the target variable (y).
X.head()
y.head()
Output
Here is the outcome of the above step −
0    2
1    2
2    2
3    1
4    2
Name: Channel, dtype: int64
We can see that the y label takes the values 1 and 2. We will have to convert it into 0 and 1 for further analysis. So we will do it as follows −
# Change labels into binary values
y[y == 2] = 0
y[y == 1] = 1
And again we will check the y label −
# Now again see the y label
y.head()
Here is the outcome of the above section −
0    0
1    0
2    0
3    1
4    0
Name: Channel, dtype: int64
As you can see, our target variable (y) has been converted into 0 and 1.
4. Split data into separate training and test sets
Now we are going to split the above data into separate training and testing sets. Do it as follows −
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 0)
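Before tuning, it can be useful to record a baseline score with XGBoost's default hyperparameters, so the tuned model has a reference point. The short sketch below is optional and not part of the original steps; it reuses xgb and accuracy_score imported earlier.
# Optional baseline: default hyperparameters, used only for comparison
baseline = xgb.XGBClassifier()
baseline.fit(X_train, y_train)
baseline_pred = baseline.predict(X_test)
print("Baseline accuracy:", accuracy_score(y_test, baseline_pred))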
Bayesian Optimization with HYPEROPT
Bayesian optimization is a process for finding the optimal hyperparameters of a machine learning or deep learning algorithm. Optimization here means finding the hyperparameter values that minimize the cost function, resulting in better overall model performance on both the training and test sets.
In this approach, we train the model over a range of parameter values until we find the best fit. Hyperparameter tuning finds the optimal parameter values and returns the best-fitting model, and it is a recommended step when building any ML or DL algorithm.
This chapter discusses one of the most precise and successful hyperparameter tuning methods, Bayesian Optimization with HYPEROPT.
What is HYPEROPT?
HYPEROPT is an advanced Python package that searches over a hyperparameter space of values to find the best options that minimize the loss function.
Our Bayesian optimization approach uses Hyperopt to tune the model hyperparameters.
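If you have not used Hyperopt before, the toy example below (independent of our dataset, with names like toy_objective chosen only for this illustration) shows its three core ingredients on a simple quadratic function: a search space, an objective that returns a loss, and fmin driving the search with the TPE algorithm.
# Toy example: find the x that minimizes (x - 3)^2 with Hyperopt
from hyperopt import fmin, tpe, hp, Trials

toy_space = hp.uniform('x', -10, 10)    # search x in the range [-10, 10]

def toy_objective(x):
   return (x - 3) ** 2                  # loss to minimize

toy_trials = Trials()
toy_best = fmin(fn=toy_objective, space=toy_space,
                algo=tpe.suggest, max_evals=100, trials=toy_trials)
print(toy_best)    # expected to be close to {'x': 3.0}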
Bayesian Optimization Implementation
The optimization process has four parts: initializing the domain space, defining the objective function, running the optimization algorithm, and printing the results. Let us discuss these parts one by one −
1. Initialize domain space
The domain space refers to the range of input values we want to search over. Here is the code −
# Set up hyperparameters for tuning using Hyperopt
space = {
   'max_depth': hp.quniform('max_depth', 3, 10, 1),
   'learning_rate': hp.uniform('learning_rate', 0.01, 0.2),
   'n_estimators': hp.quniform('n_estimators', 50, 300, 50),
   'subsample': hp.uniform('subsample', 0.5, 1),
   'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 1),
   'gamma': hp.uniform('gamma', 0, 0.5),
   'lambda': hp.uniform('lambda', 0, 1),
   'alpha': hp.uniform('alpha', 0, 1)
}
2. Define objective function
The objective function is any function that returns a real value we want to minimize. In this case, we focus on reducing the validation error of an ML model with respect to its hyperparameters. Since accuracy is a metric we want to maximize, the function returns its negative value so that minimizing the loss maximizes accuracy.
# Define objective function for hyperparameter tuning
def objective(space):
   clf = xgb.XGBClassifier(
      n_estimators = int(space['n_estimators']),
      max_depth = int(space['max_depth']),
      learning_rate = space['learning_rate'],
      subsample = space['subsample'],
      colsample_bytree = space['colsample_bytree'],
      gamma = space['gamma'],
      reg_alpha = space['alpha'],
      reg_lambda = space['lambda']
   )

   evaluation = [(X_train, y_train), (X_test, y_test)]

   # Note: in recent XGBoost releases (2.0+), eval_metric and
   # early_stopping_rounds are passed to the XGBClassifier constructor
   # instead of fit()
   clf.fit(X_train, y_train,
           eval_set=evaluation, eval_metric="auc",
           early_stopping_rounds=10, verbose=False)

   pred = clf.predict(X_test)
   accuracy = accuracy_score(y_test, pred)
   print("SCORE:", accuracy)

   # Return negative accuracy because Hyperopt minimizes the loss
   return {'loss': -accuracy, 'status': STATUS_OK}
3. Optimization algorithm
This is the procedure for building the surrogate of the objective function and selecting the next values to evaluate; here we use Hyperopt's Tree-structured Parzen Estimator (tpe.suggest).
# Run Hyperopt to find the best hyperparameters
trials = Trials()
best = fmin(
   fn=objective,
   space=space,
   algo=tpe.suggest,
   max_evals=50,
   trials=trials
)
4. Print Results
The result is the set of hyperparameter name/value pairs that gave the best score, which can then be used to build the final model.
# Print the best hyperparameters
print("Best Hyperparameters:", best)
Output
Best Hyperparameters: {'alpha': 0.221612523499914, 'colsample_bytree': 0.7560822278126258, 'gamma': 0.05019667254058424, 'lambda': 0.3047164013099425, 'learning_rate': 0.019578072539274467, 'max_depth': 9.0, 'n_estimators': 150.0, 'subsample': 0.7674996723810256}
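Once the best hyperparameters are found, a natural follow-up (not shown in the output above) is to fit a final model with them. Note that hp.quniform returns floats, so integer-valued parameters such as max_depth and n_estimators should be cast back to int. A minimal sketch, assuming the training and test sets from the earlier steps are still available:
# Fit a final model using the best hyperparameters found by Hyperopt
final_clf = xgb.XGBClassifier(
   n_estimators = int(best['n_estimators']),
   max_depth = int(best['max_depth']),
   learning_rate = best['learning_rate'],
   subsample = best['subsample'],
   colsample_bytree = best['colsample_bytree'],
   gamma = best['gamma'],
   reg_alpha = best['alpha'],
   reg_lambda = best['lambda']
)
final_clf.fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, final_clf.predict(X_test)))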