LightGBM - Feature Interaction Constraints



Once LightGBM has finished training an ensemble of trees on a dataset, each internal node of a tree holds a condition on a single feature. To make a prediction with an individual tree, we start at the root node and compare the condition at each node with the corresponding feature value in our sample. The outcome of each comparison determines which branch we follow, so the sample traces a specific path from the root down to a leaf, and the leaf provides the prediction. By default, there is no restriction on which features can appear together along such a path.
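To make the traversal concrete, below is a minimal, hypothetical sketch of how a single tree produces a prediction. The Node class and predict_one function are illustrative only, not LightGBM's actual implementation.

# A hypothetical sketch of single-tree prediction: each internal node
# tests one feature against a threshold and routes the sample left or
# right until a leaf value is reached.
class Node:
    def __init__(self, feature=None, threshold=None, left=None, right=None, value=None):
        self.feature = feature      # index of the feature tested at this node
        self.threshold = threshold  # split threshold for that feature
        self.left = left            # subtree taken when feature value <= threshold
        self.right = right          # subtree taken when feature value > threshold
        self.value = value          # prediction stored at a leaf node

def predict_one(node, sample):
    # Walk from the root to a leaf, choosing a branch at every node.
    while node.value is None:
        node = node.left if sample[node.feature] <= node.threshold else node.right
    return node.value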

Because each node on this path is reached only after the condition at the previous node has been evaluated, the features that appear along the same path influence the prediction jointly; this is known as feature interaction. LightGBM allows us to decide which features may interact with each other: we can define sets of feature indices, and only features within the same set are allowed to appear together in a branch. Features from different sets cannot interact, and this restriction is enforced while the trees are built during the training phase.

Below we show how to enforce feature interaction constraints on an estimator in LightGBM. LightGBM estimators accept a parameter called interaction_constraints, which takes a list of lists, each inner list containing the indices of features that are allowed to interact with one another.
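For instance, with a five-feature dataset, a parameter dictionary like the one below (the group indices are chosen purely for illustration) allows features 0 and 1 to appear together in a branch, and features 2, 3 and 4 together, but never across the two groups.

# Illustrative constraint groups: features 0 and 1 may interact with
# each other, features 2, 3 and 4 may interact with each other, and no
# split may combine features from different groups.
params = {
    "objective": "regression",
    "interaction_constraints": [[0, 1], [2, 3, 4]],
}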

Example 1

Here is an example of how we can enforce feature interaction constraints on an estimator in LightGBM.

The load_boston function from sklearn.datasets is deprecated and has been removed in recent versions of scikit-learn (1.2 and later). If an error occurs, then you can load the dataset from an external source or use an alternative dataset.
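If load_boston is unavailable in your scikit-learn version, one workaround is to fetch the raw data directly. The sketch below follows the approach suggested in scikit-learn's deprecation notice; data_url points at the original CMU StatLib copy of the dataset.

import numpy as np
import pandas as pd

# Fetch the original Boston housing data from CMU StatLib and rebuild
# the (data, target) arrays that load_boston used to return.
data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
target = raw_df.values[1::2, 2]

# The feature names are no longer bundled, so list them manually.
feature_names = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]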

# Import necessary libraries
import lightgbm as lgb
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Load the Boston housing dataset
boston = load_boston()

# Split the data into training and testing sets
X_train, X_test, Y_train, Y_test = train_test_split(boston.data, boston.target, train_size=0.90, random_state=42)

# Print the size of the training and testing sets
print("Sizes of Train or Test Datasets : ", X_train.shape, X_test.shape, Y_train.shape, Y_test.shape, "\n")

# Create LightGBM datasets
train_dataset = lgb.Dataset(X_train, Y_train, feature_name=boston.feature_names.tolist())
test_dataset = lgb.Dataset(X_test, Y_test, feature_name=boston.feature_names.tolist())

# Train the LightGBM model
booster = lgb.train({
    "objective": "regression",
    "verbosity": -1,
    "metric": "rmse",
    "interaction_constraints": [[0, 1, 2, 11, 12], [3, 4], [6, 10], [5, 9], [7, 8]]
    },
    train_set=train_dataset,
    valid_sets=(test_dataset,),
    num_boost_round=10
)

# Make predictions
test_preds = booster.predict(X_test)
train_preds = booster.predict(X_train)

# Calculate and print R2 scores
print("\nR2 Test Score : %.2f" % r2_score(Y_test, test_preds))
print("R2 Train Score : %.2f" % r2_score(Y_train, train_preds))

Output

This will generate the following result:

Sizes of Train and Test Datasets :  (455, 13) (51, 13) (455,) (51,)

[1]	valid_0's rmse: 7.50225
[2]	valid_0's rmse: 7.01989
[3]	valid_0's rmse: 6.58246
[4]	valid_0's rmse: 6.18581
[5]	valid_0's rmse: 5.83873
[6]	valid_0's rmse: 5.47166
[7]	valid_0's rmse: 5.19667
[8]	valid_0's rmse: 4.96259
[9]	valid_0's rmse: 4.69168
[10]	valid_0's rmse: 4.51653

R2 Test Score : 0.67
R2 Train Score : 0.69
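
To check that the constraints were respected, we can inspect the trained trees. The short sketch below uses LightGBM's trees_to_dataframe method to list which features each tree split on; since the constraints apply per branch, this per-tree view is only a coarse sanity check.

# Coarse sanity check: list the features each tree split on and compare
# them against the constraint groups passed to lgb.train above.
tree_df = booster.trees_to_dataframe()
for tree_idx, nodes in tree_df.groupby("tree_index"):
    used = sorted(nodes["split_feature"].dropna().unique())
    print("Tree %d splits on: %s" % (tree_idx, used))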

Example 2

The code below again trains a LightGBM model to predict housing prices on the Boston dataset, this time using the scikit-learn style LGBMModel API instead of lgb.train. After training, it reports how well the model performs on both the training and test data using the R2 score.

# Import necessary libraries
import lightgbm as lgb
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Load the Boston housing dataset
boston = load_boston()

# Split the dataset into training and testing sets
X_train, X_test, Y_train, Y_test = train_test_split(boston.data, boston.target, test_size=0.2, random_state=42)

# Print the size of the training and testing sets
print("Sizes of Training and Testing Datasets : ", X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)

# Create a LightGBM model with interaction constraints and 10 estimators
booster = lgb.LGBMModel(objective="regression", n_estimators=10,
                        interaction_constraints=[[0,1,2,11,12], [3, 4], [6,10], [5,9], [7,8]])

# Train the model on the training set and validate it on the test set
booster.fit(X_train, Y_train,
            eval_set=[(X_test, Y_test)],
            eval_metric="rmse")

# Make predictions on both the test and training sets
test_preds = booster.predict(X_test)
train_preds = booster.predict(X_train)

# Calculate and print the R2 score for the test and training sets
print("\nR2 Test Score : %.2f" % r2_score(Y_test, test_preds))
print("R2 Train Score : %.2f" % r2_score(Y_train, train_preds))

Output

This will create the following result:

Sizes of Training and Testing Datasets :  (379, 13) (127, 13) (379,) (127,)
[1]	valid_0's rmse: 8.97871	valid_0's l2: 80.6173
[2]	valid_0's rmse: 8.35545	valid_0's l2: 69.8135
[3]	valid_0's rmse: 7.93432	valid_0's l2: 62.9535
[4]	valid_0's rmse: 7.61104	valid_0's l2: 57.9279
[5]	valid_0's rmse: 7.16832	valid_0's l2: 51.3849
[6]	valid_0's rmse: 6.93182	valid_0's l2: 48.0501
[7]	valid_0's rmse: 6.57728	valid_0's l2: 43.2606
[8]	valid_0's rmse: 6.41497	valid_0's l2: 41.1518
[9]	valid_0's rmse: 6.13983	valid_0's l2: 37.6976
[10]	valid_0's rmse: 5.9864	valid_0's l2: 35.837

R2 Test Score : 0.60
R2 Train Score : 0.69
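
To gauge the effect of the constraints, you could train the same model without interaction_constraints and compare the scores. The sketch below reuses the variables from Example 2 and is only meant as a starting point.

# Baseline without interaction constraints, using the same settings as
# Example 2, so the R2 scores can be compared directly.
baseline = lgb.LGBMModel(objective="regression", n_estimators=10)
baseline.fit(X_train, Y_train, eval_set=[(X_test, Y_test)], eval_metric="rmse")

print("Unconstrained R2 Test Score : %.2f" % r2_score(Y_test, baseline.predict(X_test)))
print("Unconstrained R2 Train Score : %.2f" % r2_score(Y_train, baseline.predict(X_train)))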