How to plot xgboost.XGBClassifier.feature_importances_ with Matplotlib?

The XGBClassifier from XGBoost provides feature importance scores through the feature_importances_ attribute. We can visualize these importance scores using Matplotlib to understand which features contribute most to the model's predictions.

Understanding Feature Importances

Feature importance in XGBoost measures how useful each feature was for building the boosted trees, so higher values indicate features that contribute more to the model's predictions. Note that XGBoost supports several importance types (such as 'weight', 'gain', and 'cover'), and the feature_importances_ attribute reflects the model's configured importance_type.

Basic Feature Importance Plot

Here's how to create a feature importance plot using synthetic data −

import numpy as np
from xgboost import XGBClassifier
import matplotlib.pyplot as plt

# Create synthetic dataset
np.random.seed(42)
X = np.random.rand(1000, 8)
y = (X[:, 0] + X[:, 1] > 1).astype(int)  # Simple classification rule

# Train XGBoost model
model = XGBClassifier(random_state=42, eval_metric='logloss')
model.fit(X, y)

# Get feature importances
importances = model.feature_importances_
feature_names = [f'Feature_{i}' for i in range(len(importances))]

# Create bar plot
plt.figure(figsize=(10, 6))
plt.bar(range(len(importances)), importances)
plt.xlabel('Features')
plt.ylabel('Importance Score')
plt.title('XGBoost Feature Importances')
plt.xticks(range(len(importances)), feature_names, rotation=45)
plt.tight_layout()
plt.show()

print("Feature Importances:", importances)

Output:

Feature Importances: [0.49586776 0.4958678  0.00413223 0.00413221 0.         0.
 0.         0.        ]

Enhanced Visualization with Feature Names

For better readability, we can sort features by importance and add proper labels −

import numpy as np
from xgboost import XGBClassifier
import matplotlib.pyplot as plt

# Create synthetic dataset
np.random.seed(42)
X = np.random.rand(500, 6)
y = (X[:, 0] * 2 + X[:, 2] * 0.5 > 1).astype(int)

# Train model
model = XGBClassifier(random_state=42, eval_metric='logloss')
model.fit(X, y)

# Get importances and feature names
importances = model.feature_importances_
features = ['Age', 'Income', 'Score', 'Experience', 'Rating', 'Hours']

# Sort by importance
indices = np.argsort(importances)[::-1]
sorted_features = [features[i] for i in indices]
sorted_importances = importances[indices]

# Create horizontal bar plot
plt.figure(figsize=(10, 6))
colors = plt.cm.viridis(np.linspace(0, 1, len(sorted_importances)))
bars = plt.barh(sorted_features, sorted_importances, color=colors)

plt.xlabel('Importance Score')
plt.title('XGBoost Feature Importances (Sorted)')
plt.gca().invert_yaxis()  # Highest importance at top

# Add value labels on bars
for bar, importance in zip(bars, sorted_importances):
    plt.text(bar.get_width() + 0.01, bar.get_y() + bar.get_height()/2, 
             f'{importance:.3f}', va='center')

plt.tight_layout()
plt.show()
(Horizontal bar chart showing sorted feature importances)

Comparison Table

Plot Type       | Best For                    | Advantages
----------------|-----------------------------|-----------------------------
Vertical Bar    | Few features (<10)          | Compact, easy comparison
Horizontal Bar  | Many features or long names | Better label readability
Sorted Plot     | Identifying top features    | Clear ranking visualization

Key Parameters

Important considerations when plotting feature importances:

  • figsize − Controls plot dimensions (passed to plt.figure)
  • rotation − Rotates x-axis tick labels for readability
  • tight_layout() − Prevents label cutoff
  • eval_metric − Setting it explicitly avoids a default-metric warning in some XGBoost versions

Conclusion

XGBoost feature importances help identify which features contribute most to model predictions. Use horizontal bar plots for better readability with many features, and always sort by importance to highlight the most influential variables.

Updated on: 2026-03-25T23:45:55+05:30
