How to plot xgboost.XGBClassifier.feature_importances_ with Matplotlib?
The XGBClassifier from XGBoost provides feature importance scores through the feature_importances_ attribute. We can visualize these importance scores using Matplotlib to understand which features contribute most to the model's predictions.
Understanding Feature Importances
Feature importance in XGBoost measures how useful each feature is for making accurate predictions across the trees in the ensemble. Higher values indicate features that play a larger role in the model's decision-making.
Basic Feature Importance Plot
Here's how to create a feature importance plot using synthetic data:
```python
import numpy as np
from xgboost import XGBClassifier
import matplotlib.pyplot as plt

# Create synthetic dataset
np.random.seed(42)
X = np.random.rand(1000, 8)
y = (X[:, 0] + X[:, 1] > 1).astype(int)  # Simple classification rule

# Train XGBoost model
model = XGBClassifier(random_state=42, eval_metric='logloss')
model.fit(X, y)

# Get feature importances
importances = model.feature_importances_
feature_names = [f'Feature_{i}' for i in range(len(importances))]

# Create bar plot
plt.figure(figsize=(10, 6))
plt.bar(range(len(importances)), importances)
plt.xlabel('Features')
plt.ylabel('Importance Score')
plt.title('XGBoost Feature Importances')
plt.xticks(range(len(importances)), feature_names, rotation=45)
plt.tight_layout()
plt.show()

print("Feature Importances:", importances)
```

Output:

```
Feature Importances: [0.49586776 0.4958678  0.00413223 0.00413221 0. 0. 0. 0. ]
```
Enhanced Visualization with Feature Names
For better readability, we can sort features by importance and add proper labels:
```python
import numpy as np
from xgboost import XGBClassifier
import matplotlib.pyplot as plt

# Create synthetic dataset
np.random.seed(42)
X = np.random.rand(500, 6)
y = (X[:, 0] * 2 + X[:, 2] * 0.5 > 1).astype(int)

# Train model
model = XGBClassifier(random_state=42, eval_metric='logloss')
model.fit(X, y)

# Get importances and feature names
importances = model.feature_importances_
features = ['Age', 'Income', 'Score', 'Experience', 'Rating', 'Hours']

# Sort by importance (descending)
indices = np.argsort(importances)[::-1]
sorted_features = [features[i] for i in indices]
sorted_importances = importances[indices]

# Create horizontal bar plot
plt.figure(figsize=(10, 6))
colors = plt.cm.viridis(np.linspace(0, 1, len(sorted_importances)))
bars = plt.barh(sorted_features, sorted_importances, color=colors)
plt.xlabel('Importance Score')
plt.title('XGBoost Feature Importances (Sorted)')
plt.gca().invert_yaxis()  # Highest importance at top

# Add value labels on bars
for bar, importance in zip(bars, sorted_importances):
    plt.text(bar.get_width() + 0.01, bar.get_y() + bar.get_height() / 2,
             f'{importance:.3f}', va='center')

plt.tight_layout()
plt.show()
```
(Horizontal bar chart showing sorted feature importances)
Comparison Table
| Plot Type | Best For | Advantages |
|---|---|---|
| Vertical Bar | Few features (<10) | Compact, easy comparison |
| Horizontal Bar | Many features or long names | Better label readability |
| Sorted Plot | Identifying top features | Clear ranking visualization |
Key Parameters
Important considerations when plotting feature importances:
- figsize − controls the plot dimensions (passed to plt.figure)
- rotation − rotates x-axis labels for readability
- tight_layout() − prevents label cutoff
- eval_metric − suppresses XGBoost warnings about the default metric
Conclusion
XGBoost feature importances help identify which features contribute most to model predictions. Use horizontal bar plots for better readability with many features, and always sort by importance to highlight the most influential variables.
