How to plot the loss values from MLPClassifier's loss_curve_ attribute? (Matplotlib)
The MLPClassifier from scikit-learn provides a loss_curve_ attribute that tracks training loss at each iteration. Plotting these values helps visualize training convergence across different hyperparameters and datasets.
Understanding MLPClassifier Loss Curves
The loss_curve_ attribute stores the loss function value after each iteration during training. By plotting these values, we can compare how different solvers and learning rates affect convergence behavior.
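Before building the full comparison, a minimal sketch (using the iris dataset and an arbitrarily small network, chosen only for illustration) shows that loss_curve_ is simply a Python list you can inspect after calling fit:

```python
import warnings
from sklearn.datasets import load_iris
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)

# Suppress the warning raised if training stops before converging
with warnings.catch_warnings():
    warnings.filterwarnings("ignore", category=ConvergenceWarning)
    mlp = MLPClassifier(hidden_layer_sizes=(10,), max_iter=50,
                        random_state=0).fit(X, y)

# One loss value per completed training iteration
print(len(mlp.loss_curve_), mlp.n_iter_)
print(mlp.loss_curve_[0], mlp.loss_curve_[-1])
```

The list has exactly n_iter_ entries, so plotting it directly gives loss versus iteration number.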
Complete Example
Here's how to plot loss curves for different MLPClassifier configurations across multiple datasets:
import warnings
import matplotlib.pyplot as plt
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import MinMaxScaler
from sklearn import datasets
from sklearn.exceptions import ConvergenceWarning
plt.rcParams["figure.figsize"] = [12, 8]
plt.rcParams["figure.autolayout"] = True
# Define different hyperparameter configurations
params = [
{'solver': 'sgd', 'learning_rate': 'constant', 'momentum': 0, 'learning_rate_init': 0.2},
{'solver': 'sgd', 'learning_rate': 'constant', 'momentum': .9, 'nesterovs_momentum': False, 'learning_rate_init': 0.2},
{'solver': 'sgd', 'learning_rate': 'constant', 'momentum': .9, 'nesterovs_momentum': True, 'learning_rate_init': 0.2},
{'solver': 'sgd', 'learning_rate': 'invscaling', 'momentum': 0, 'learning_rate_init': 0.2},
{'solver': 'adam', 'learning_rate_init': 0.01}
]
labels = [
"constant learning-rate",
"constant with momentum",
"constant with Nesterov's momentum",
"inv-scaling learning-rate",
"adam"
]
plot_args = [
{'c': 'red', 'linestyle': '-'},
{'c': 'green', 'linestyle': '-'},
{'c': 'blue', 'linestyle': '-'},
{'c': 'orange', 'linestyle': '--'},
{'c': 'black', 'linestyle': '-'}
]
def plot_on_dataset(X, y, ax, name):
    ax.set_title(f'Loss Curves - {name.title()} Dataset')
    ax.set_xlabel('Iterations')
    ax.set_ylabel('Loss')
    # Scale features for better convergence
    X = MinMaxScaler().fit_transform(X)
    # Adjust iterations based on dataset complexity
    max_iter = 15 if name == "digits" else 200
    mlps = []
    for label, param in zip(labels, params):
        mlp = MLPClassifier(random_state=0, max_iter=max_iter, **param)
        # Suppress warnings for runs that stop at max_iter without converging
        with warnings.catch_warnings():
            warnings.filterwarnings("ignore", category=ConvergenceWarning, module="sklearn")
            mlp.fit(X, y)
        mlps.append(mlp)
    # Plot one loss curve per configuration
    for mlp, label, args in zip(mlps, labels, plot_args):
        ax.plot(mlp.loss_curve_, label=label, **args)
    ax.legend()
    ax.grid(True, alpha=0.3)
# Create subplots for different datasets
fig, axes = plt.subplots(2, 2, figsize=(12, 8))
# Load datasets
iris = datasets.load_iris()
X_digits, y_digits = datasets.load_digits(return_X_y=True)
data_sets = [
(iris.data, iris.target),
(X_digits, y_digits),
datasets.make_circles(noise=0.2, factor=0.5, random_state=1),
datasets.make_moons(noise=0.3, random_state=0)
]
dataset_names = ['iris', 'digits', 'circles', 'moons']
# Plot loss curves for each dataset
for ax, data, name in zip(axes.ravel(), data_sets, dataset_names):
    plot_on_dataset(*data, ax=ax, name=name)
plt.tight_layout()
plt.show()
[Displays a 2x2 grid of plots showing loss curves for different MLPClassifier configurations across iris, digits, circles, and moons datasets]
Key Components
| Component | Purpose | Description |
|---|---|---|
| loss_curve_ | Training monitoring | List of loss values per iteration |
| MinMaxScaler | Feature scaling | Normalizes features for better convergence |
| max_iter | Training control | Maximum iterations before stopping |
| Different solvers | Optimization | SGD vs Adam optimization algorithms |
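The scaling step is easy to verify in isolation on a toy array: MinMaxScaler rescales each column independently to the [0, 1] range, which keeps features with very different magnitudes from dominating gradient updates.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Two columns with very different ranges
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])
X_scaled = MinMaxScaler().fit_transform(X)

# Each column is mapped independently onto [0, 1]
print(X_scaled)
# [[0.  0. ]
#  [0.5 0.5]
#  [1.  1. ]]
```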
Analyzing the Results
The loss curves reveal important training characteristics:
- Adam solver typically shows smooth, consistent convergence
- SGD with momentum can converge faster but may be more unstable
- Learning rate schedules like "invscaling" show gradual loss reduction
- Dataset complexity affects convergence speed and final loss values
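These observations can also be checked numerically rather than visually. As a sketch (reusing the moons dataset from the example above, with a learning rate of 0.01 chosen only for illustration), compare how many iterations each solver ran and where its loss ended up:

```python
import warnings
from sklearn.datasets import make_moons
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

X, y = make_moons(noise=0.3, random_state=0)

results = {}
for solver in ("sgd", "adam"):
    mlp = MLPClassifier(solver=solver, learning_rate_init=0.01,
                        max_iter=500, random_state=0)
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", category=ConvergenceWarning)
        mlp.fit(X, y)
    # n_iter_ reflects early stopping on tol (or hitting max_iter)
    results[solver] = (mlp.n_iter_, mlp.loss_curve_[-1])

for solver, (n_iter, final_loss) in results.items():
    print(f"{solver}: {n_iter} iterations, final loss {final_loss:.4f}")
```

Comparing n_iter_ and the last loss_curve_ entry is a quick way to quantify what the plots show qualitatively.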
Conclusion
Plotting loss_curve_ from MLPClassifier helps visualize training progress and compare different hyperparameter configurations. Use this technique to select optimal solver and learning rate combinations for your specific dataset.
