LightGBM - Early Stopping Training



Early stopping is a technique that halts training when the evaluation metric, measured on a held-out evaluation dataset, stops improving for a given number of consecutive rounds. In older versions of LightGBM, both the native train() function and the fit() method of the scikit-learn-style estimators accept an early_stopping_rounds parameter; newer releases removed this parameter in favor of the early_stopping() callback covered later in this section. Either way, you supply an integer: training stops if the evaluation metric has not improved after that many rounds.

Keep in mind that early stopping requires an evaluation dataset, because the decision to stop is based on metric results computed against that dataset.

Example

We will first import the necessary libraries and load the Boston housing dataset. As of version 1.2 this dataset is no longer shipped with Scikit-Learn, so sklearn.datasets.load_boston() only works on older versions (scikit-learn < 1.2).

import lightgbm as lgb
from sklearn.datasets import load_boston  # requires scikit-learn < 1.2
from sklearn.model_selection import train_test_split

boston = load_boston()

X_train, X_test, Y_train, Y_test = train_test_split(boston.data, boston.target)

print("Sizes of Train or Test Datasets : ", X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

train_dataset = lgb.Dataset(X_train, Y_train, feature_name=boston.feature_names.tolist())
test_dataset = lgb.Dataset(X_test, Y_test, feature_name=boston.feature_names.tolist())

booster = lgb.train({"objective": "regression", "verbosity": -1, "metric": "rmse"},
                    train_set=train_dataset, valid_sets=(test_dataset,),
                    early_stopping_rounds=5,  # removed in newer LightGBM releases; use the early_stopping() callback there
                    num_boost_round=100)

from sklearn.metrics import r2_score

test_preds = booster.predict(X_test)
train_preds = booster.predict(X_train)

# Display the R2 scores in the console
print("\nR2 Score on Test Set : %.2f"%r2_score(Y_test, test_preds))
print("R2 Score on Train Set : %.2f"%r2_score(Y_train, train_preds))

Output

This will produce the following result:

Sizes of Train or Test Datasets:  (404, 13) (102, 13) (404,) (102,)
[1]	valid_0's rmse: 9.10722
Training until validation scores don't improve for 5 rounds
[2]	valid_0's rmse: 8.46389
[3]	valid_0's rmse: 7.93394
[4]	valid_0's rmse: 7.43812
[5]	valid_0's rmse: 7.01845
[6]	valid_0's rmse: 6.68186
[7]	valid_0's rmse: 6.43834
[8]	valid_0's rmse: 6.17357
[9]	valid_0's rmse: 5.96725
[10]	valid_0's rmse: 5.74169
[11]	valid_0's rmse: 5.55389
[12]	valid_0's rmse: 5.38595
[13]	valid_0's rmse: 5.24832
[14]	valid_0's rmse: 5.13373
[15]	valid_0's rmse: 5.0457
[16]	valid_0's rmse: 4.96688
[17]	valid_0's rmse: 4.87874
[18]	valid_0's rmse: 4.8246
[19]	valid_0's rmse: 4.75342
[20]	valid_0's rmse: 4.69854
Did not meet early stopping. Best iteration is:
[20]	valid_0's rmse: 4.69854

R2 Score on Test Set: 0.81
R2 Score on Train Set: 0.97

This program divides the breast cancer dataset into training and testing sets, then trains a LightGBM binary classifier to predict whether a tumor is malignant or benign, stopping early if validation performance fails to improve. Finally, it predicts on both the test and training sets and computes the model's accuracy.

import lightgbm as lgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

breast_cancer = load_breast_cancer()

X_train, X_test, Y_train, Y_test = train_test_split(breast_cancer.data, breast_cancer.target)

print("Sizes of Train or Test Datasets : ", X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

booster = lgb.LGBMModel(objective="binary", n_estimators=100, metric="auc")

booster.fit(X_train, Y_train,
            eval_set=[(X_test, Y_test),],
            early_stopping_rounds=3)

from sklearn.metrics import accuracy_score

test_preds = booster.predict(X_test)
train_preds = booster.predict(X_train)

test_preds = [1 if pred > 0.5 else 0 for pred in test_preds]
train_preds = [1 if pred > 0.5 else 0 for pred in train_preds]

# Display the accuracy results
print("\nAccuracy Score on Test Set : %.2f"%accuracy_score(Y_test, test_preds))
print("Accuracy Score on Train Set : %.2f"%accuracy_score(Y_train, train_preds))

Output

This will lead to the following outcome:

Sizes of Train or Test Datasets :  (426, 30) (143, 30) (426,) (143,)
[1]	valid_0's auc: 0.986129
Training until validation scores don't improve for 3 rounds
[2]	valid_0's auc: 0.989355
[3]	valid_0's auc: 0.988925
[4]	valid_0's auc: 0.987097
[5]	valid_0's auc: 0.990108
[6]	valid_0's auc: 0.993011
[7]	valid_0's auc: 0.993011
[8]	valid_0's auc: 0.993441
[9]	valid_0's auc: 0.993441
[10]	valid_0's auc: 0.994194
[11]	valid_0's auc: 0.994194
[12]	valid_0's auc: 0.994194
[13]	valid_0's auc: 0.994409
[14]	valid_0's auc: 0.995914
[15]	valid_0's auc: 0.996129
[16]	valid_0's auc: 0.996989
[17]	valid_0's auc: 0.996989
[18]	valid_0's auc: 0.996344
[19]	valid_0's auc: 0.997204
[20]	valid_0's auc: 0.997419
[21]	valid_0's auc: 0.997849
[22]	valid_0's auc: 0.998065
[23]	valid_0's auc: 0.997849
[24]	valid_0's auc: 0.998065
[25]	valid_0's auc: 0.997634
Early stopping, best iteration is:
[22]	valid_0's auc: 0.998065

Accuracy Score on Test Set : 0.97
Accuracy Score on Train Set : 0.98

How to stop training early using the early_stopping() callback

LightGBM also supports early stopping through its early_stopping() callback. We pass the number of rounds to lgb.early_stopping() and supply the resulting callback through the callbacks argument of the train()/fit() method. Usage is shown below −

import lightgbm as lgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

breast_cancer = load_breast_cancer()

X_train, X_test, Y_train, Y_test = train_test_split(breast_cancer.data, breast_cancer.target)

print("Sizes of Train or Test Datasets : ", X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

booster = lgb.LGBMModel(objective="binary", n_estimators=100, metric="auc")

booster.fit(X_train, Y_train,
            eval_set=[(X_test, Y_test),],
            callbacks=[lgb.early_stopping(3)]
            )

from sklearn.metrics import accuracy_score

test_preds = booster.predict(X_test)
train_preds = booster.predict(X_train)

test_preds = [1 if pred > 0.5 else 0 for pred in test_preds]
train_preds = [1 if pred > 0.5 else 0 for pred in train_preds]

print("\nAccuracy Score on Test Set : %.2f"%accuracy_score(Y_test, test_preds))
print("Accuracy Score on Train Set : %.2f"%accuracy_score(Y_train, train_preds))

Output

This will generate the below result:

Sizes of Train or Test Datasets :  (426, 30) (143, 30) (426,) (143,)
[1]	valid_0's auc: 0.954328
Training until validation scores don't improve for 3 rounds
[2]	valid_0's auc: 0.959322
[3]	valid_0's auc: 0.982938
[4]	valid_0's auc: 0.988244
[5]	valid_0's auc: 0.987203
[6]	valid_0's auc: 0.98762
[7]	valid_0's auc: 0.98814
Early stopping, best iteration is:
[4]	valid_0's auc: 0.988244

Accuracy Score on Test Set : 0.94
Accuracy Score on Train Set : 0.95