
XGBoost - Tuning with Hyperparameters
In this chapter, we will discuss the important task of tuning the hyperparameters of an XGBoost model. Hyperparameters are settings that control how an algorithm learns. As we have already seen in the previous chapter, XGBoost offers a wide range of hyperparameters, and by adjusting them we can get the best possible performance out of the model. XGBoost is known for its ability to automatically fit thousands of learnable parameters in order to find patterns and regularities in the data.
In tree-based models like XGBoost, the learnable parameters are the decision variables selected at each node, so a larger number of design decisions leads to a larger number of parameters. Hyperparameters, in contrast, are the settings the algorithm is trained with; they are chosen before training and remain fixed while the algorithm learns.
Hyperparameters in tree-based models include the maximum tree depth, the number of trees to grow, the number of variables considered when building each tree, the minimum number of samples in a leaf, and the fraction of observations used to build each tree. Although the focus of this chapter is tuning XGBoost hyperparameters, the techniques covered here apply to any other advanced ML method as well.
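To make the distinction concrete, here is a minimal sketch (values chosen purely for illustration, not taken from this tutorial) of how such hyperparameters are passed to XGBoost's scikit-learn wrapper. They are fixed before training starts, while the split decisions inside each tree are learned from the data.
# Illustrative only: common XGBoost hyperparameters set explicitly
import xgboost as xgb

model = xgb.XGBClassifier(
   n_estimators=100,       # number of trees to grow
   max_depth=6,            # maximum tree depth
   subsample=0.8,          # fraction of observations used to build each tree
   colsample_bytree=0.8,   # fraction of variables considered per tree
   min_child_weight=1,     # minimum weight required in a leaf (controls leaf size)
   learning_rate=0.1       # shrinkage applied to each tree's contribution
)
# These settings stay fixed during training; only the split decisions
# inside each tree are learned from the data.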
Tuning XGBoost with Hyperparameters
Now we will see how we can tune our XGBoost model with the help of hyperparameters −
1. Import libraries
First, import all the necessary libraries as shown in the code below −
# Import pandas for handling data
import pandas as pd

# Import numpy for scientific calculations
import numpy as np

# Import XGBoost for machine learning
import xgboost as xgb
from sklearn.metrics import accuracy_score

# Import libraries for tuning hyperparameters
from hyperopt import STATUS_OK, Trials, fmin, hp, tpe
2. Read dataset
Now we will read our dataset. Here we are using the Wholesale customers data.csv dataset.
data = '/Python/Wholesale customers data.csv'
df = pd.read_csv(data)
3. Declare feature vector and target variables
Next, declare the feature vector and the target variable −
X = df.drop('Channel', axis=1)
y = df['Channel']
Now let us take a look at the feature vector (X) and the target variable (y).
X.head()
y.head()
Output
Here is the outcome of the above step −
0    2
1    2
2    2
3    1
4    2
Name: Channel, dtype: int64
We can see that the y label takes the values 1 and 2. We will have to convert it into 0 and 1 for further analysis. So we will do it as follows −
# Change labels into binary values
y[y == 2] = 0
y[y == 1] = 1
And again we will check the y label −
# Now again see the y label
y.head()
Here is the outcome of the above section −
0    0
1    0
2    0
3    1
4    0
Name: Channel, dtype: int64
As you can see, our target variable (y) has been converted into 0 and 1.
4. Split data into separate training and test sets
Now we are going to split the above data into separate training and testing sets. Do it as follows −
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 0)
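Before tuning, it can be useful to record a baseline score with XGBoost's default hyperparameters, so the tuned model has a reference point. The short sketch below is optional and not part of the original steps; it reuses xgb and accuracy_score imported earlier.
# Optional baseline: default hyperparameters, used only for comparison
baseline = xgb.XGBClassifier()
baseline.fit(X_train, y_train)
baseline_pred = baseline.predict(X_test)
print("Baseline accuracy:", accuracy_score(y_test, baseline_pred))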
Bayesian Optimization with HYPEROPT
Bayesian optimization is a process for finding the optimal hyperparameters of a machine learning or deep learning algorithm. Optimization here means finding the hyperparameter values that minimize the cost function, resulting in better overall model performance on both the training and test sets.
In this approach, we train the model over a range of parameter values until we find the best fit. Hyperparameter tuning finds the optimal parameter values and returns the best-fitting model, and it is a recommended step when building any ML or DL algorithm.
This chapter discusses one of the most precise and successful hyperparameter tuning methods, Bayesian Optimization with HYPEROPT.
What is HYPEROPT?
HYPEROPT is an advanced Python package that searches over a hyperparameter space of values to find the best options that minimize the loss function.
Our Bayesian optimization approach uses Hyperopt to tune the model hyperparameters.
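If you have not used Hyperopt before, the toy example below (independent of our dataset, with names like toy_objective chosen only for this illustration) shows its three core ingredients on a simple quadratic function: a search space, an objective that returns a loss, and fmin driving the search with the TPE algorithm.
# Toy example: find the x that minimizes (x - 3)^2 with Hyperopt
from hyperopt import fmin, tpe, hp, Trials

toy_space = hp.uniform('x', -10, 10)    # search x in the range [-10, 10]

def toy_objective(x):
   return (x - 3) ** 2                  # loss to minimize

toy_trials = Trials()
toy_best = fmin(fn=toy_objective, space=toy_space,
                algo=tpe.suggest, max_evals=100, trials=toy_trials)
print(toy_best)    # expected to be close to {'x': 3.0}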
Bayesian Optimization Implementation
The optimization process has four parts: initializing the domain space, defining the objective function, running the optimization algorithm, and printing the results. Let us discuss these parts one by one −
1. Initialize domain space
The domain space refers to the range of input values we want to search over. Here is the code −
# Set up hyperparameters for tuning using Hyperopt
space = {
   'max_depth': hp.quniform('max_depth', 3, 10, 1),
   'learning_rate': hp.uniform('learning_rate', 0.01, 0.2),
   'n_estimators': hp.quniform('n_estimators', 50, 300, 50),
   'subsample': hp.uniform('subsample', 0.5, 1),
   'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 1),
   'gamma': hp.uniform('gamma', 0, 0.5),
   'lambda': hp.uniform('lambda', 0, 1),
   'alpha': hp.uniform('alpha', 0, 1)
}
2. Define objective function
The objective function is any function that returns a real value we want to minimize. In this case, we focus on reducing the validation error of an ML model with respect to its hyperparameters. Since accuracy is a metric we want to maximize, the function returns its negative value so that minimizing the loss maximizes accuracy.
# Define objective function for hyperparameter tuning
def objective(space):
   clf = xgb.XGBClassifier(
      n_estimators = int(space['n_estimators']),
      max_depth = int(space['max_depth']),
      learning_rate = space['learning_rate'],
      subsample = space['subsample'],
      colsample_bytree = space['colsample_bytree'],
      gamma = space['gamma'],
      reg_alpha = space['alpha'],
      reg_lambda = space['lambda']
   )

   evaluation = [(X_train, y_train), (X_test, y_test)]

   # Note: in recent XGBoost releases (2.0+), eval_metric and
   # early_stopping_rounds are passed to the XGBClassifier constructor
   # instead of fit()
   clf.fit(X_train, y_train,
           eval_set=evaluation, eval_metric="auc",
           early_stopping_rounds=10, verbose=False)

   pred = clf.predict(X_test)
   accuracy = accuracy_score(y_test, pred)
   print("SCORE:", accuracy)

   # Return negative accuracy because Hyperopt minimizes the loss
   return {'loss': -accuracy, 'status': STATUS_OK}
3. Optimization algorithm
This is the procedure for building the surrogate of the objective function and selecting the next values to evaluate; here we use Hyperopt's Tree-structured Parzen Estimator (tpe.suggest).
# Run Hyperopt to find the best hyperparameters
trials = Trials()
best = fmin(
   fn=objective,
   space=space,
   algo=tpe.suggest,
   max_evals=50,
   trials=trials
)
4. Print Results
The result is the set of hyperparameter name/value pairs that gave the best score, which can then be used to build the final model.
# Print the best hyperparameters
print("Best Hyperparameters:", best)
Output
Best Hyperparameters: {'alpha': 0.221612523499914, 'colsample_bytree': 0.7560822278126258, 'gamma': 0.05019667254058424, 'lambda': 0.3047164013099425, 'learning_rate': 0.019578072539274467, 'max_depth': 9.0, 'n_estimators': 150.0, 'subsample': 0.7674996723810256}
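Once the best hyperparameters are found, a natural follow-up (not shown in the output above) is to fit a final model with them. Note that hp.quniform returns floats, so integer-valued parameters such as max_depth and n_estimators should be cast back to int. A minimal sketch, assuming the training and test sets from the earlier steps are still available:
# Fit a final model using the best hyperparameters found by Hyperopt
final_clf = xgb.XGBClassifier(
   n_estimators = int(best['n_estimators']),
   max_depth = int(best['max_depth']),
   learning_rate = best['learning_rate'],
   subsample = best['subsample'],
   colsample_bytree = best['colsample_bytree'],
   gamma = best['gamma'],
   reg_alpha = best['alpha'],
   reg_lambda = best['lambda']
)
final_clf.fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, final_clf.predict(X_test)))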