Plotting profile histograms in Python Matplotlib
A profile histogram displays the mean value of y for each bin of x values, making it useful for visualizing how one variable depends on another. This article covers two approaches in Python: Seaborn's regplot(), and SciPy's binned_statistic() combined with Matplotlib.
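To make the definition concrete, here is a minimal NumPy-only sketch of the underlying computation: split x into equal-width bins and average the y values that fall in each one. The helper name profile_means and the tiny data set are illustrative assumptions, not part of any library.

```python
import numpy as np

def profile_means(x, y, bins=10):
    """Return bin centers and the mean of y in each equal-width bin of x.

    Hypothetical helper written for illustration only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.linspace(x.min(), x.max(), bins + 1)
    # np.digitize gives 1..bins for values inside the range; clip so the
    # largest x falls into the last bin rather than an overflow bin.
    idx = np.clip(np.digitize(x, edges) - 1, 0, bins - 1)
    sums = np.bincount(idx, weights=y, minlength=bins)
    counts = np.bincount(idx, minlength=bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        means = sums / counts  # NaN where a bin is empty
    centers = (edges[:-1] + edges[1:]) / 2
    return centers, means

x = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 4.0, 5.0, 7.0, 6.0, 8.0])
centers, means = profile_means(x, y, bins=2)
print(centers)  # [1. 3.]
print(means)    # [2.5 6.5]
```

With two bins, the first four points average to 2.5 and the last four to 6.5. The library functions used in the rest of the article perform this kind of per-bin averaging for you.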
Using Seaborn regplot()
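The regplot() function from Seaborn can produce a profile-style plot. Setting x_bins groups the x values into bins and plots the mean of y in each bin, with a confidence interval drawn as an error bar. The binning only affects how the points are drawn; the regression line is still fitted to the original, unbinned data: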
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# Set figure size
plt.rcParams["figure.figsize"] = [7.00, 3.50]
plt.rcParams["figure.autolayout"] = True
# Generate sample data
x = np.random.uniform(-5, 5, 1000)
y = np.random.normal(x**2, np.abs(x) + 1)
# Create profile histogram
sns.regplot(x=x, y=y, x_bins=20, marker='o', fit_reg=True)
plt.title('Profile Histogram using Seaborn')
plt.xlabel('X Values')
plt.ylabel('Mean Y Values')
plt.show()
Using SciPy binned_statistic() with Matplotlib
For more control, use scipy.stats.binned_statistic() to calculate the bin means yourself, then plot them with Matplotlib:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binned_statistic
# Generate sample data
x = np.random.uniform(-3, 3, 500)
y = x**2 + np.random.normal(0, 0.5, 500)
# Calculate bin means
bin_means, bin_edges, _ = binned_statistic(x, y, statistic='mean', bins=15)
bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
# Plot profile histogram
plt.figure(figsize=(8, 4))
plt.plot(bin_centers, bin_means, 'bo-', linewidth=2, markersize=6)
plt.title('Profile Histogram using binned_statistic()')
plt.xlabel('X Values')
plt.ylabel('Mean Y Values')
plt.grid(True, alpha=0.3)
plt.show()
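Profile histograms are often drawn with an error bar on each bin mean. The following is one possible sketch, assuming the standard error of the mean (per-bin standard deviation divided by the square root of the bin count) is the uncertainty you want; the fixed seed and variable names are illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binned_statistic

# Fixed seed (an arbitrary choice) so the figure is reproducible
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 500)
y = x**2 + rng.normal(0, 0.5, 500)

# Compute the mean first, then reuse its edges so all statistics share bins
means, edges, _ = binned_statistic(x, y, statistic='mean', bins=15)
stds, _, _ = binned_statistic(x, y, statistic='std', bins=edges)
counts, _, _ = binned_statistic(x, y, statistic='count', bins=edges)
centers = (edges[:-1] + edges[1:]) / 2

# Standard error of the mean; assumes no bin is empty
sem = stds / np.sqrt(counts)

plt.figure(figsize=(8, 4))
plt.errorbar(centers, means, yerr=sem, fmt='bo', capsize=3)
plt.title('Profile Histogram with Error Bars')
plt.xlabel('X Values')
plt.ylabel('Mean Y Values')
plt.grid(True, alpha=0.3)
plt.show()
```

If some bins can be empty, their mean and error come out as NaN, and it may be worth masking them before plotting.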
Parameters
| Parameter | Description | Example |
|---|---|---|
| x_bins | regplot(): number of bins for x values, or explicit bin positions | x_bins=20 |
| fit_reg | regplot(): show the regression line | fit_reg=True |
| statistic | binned_statistic(): aggregation function applied in each bin | statistic='mean' |
Key Differences
regplot() automatically handles binning and provides regression fitting, while binned_statistic() offers more control over the binning process and visualization style.
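As an illustration of that extra control, the sketch below passes explicit bin edges and a different statistic to binned_statistic(). The quantile-based edges and the choice of the median are assumptions made for the example, not recommendations from the original text.

```python
import numpy as np
from scipy.stats import binned_statistic

# Fixed seed (an arbitrary choice) for reproducibility
rng = np.random.default_rng(1)
x = rng.normal(0, 1, 2000)
y = x**2 + rng.normal(0, 0.5, 2000)

# Quantile-based edges give bins with roughly equal numbers of points,
# which keeps sparsely populated tails from producing noisy bins
edges = np.quantile(x, np.linspace(0, 1, 11))

# Any supported statistic can be used; here the per-bin median
medians, _, _ = binned_statistic(x, y, statistic='median', bins=edges)
counts, _, _ = binned_statistic(x, y, statistic='count', bins=edges)

# Use the mean x position in each bin, since the bins have unequal widths
x_pos, _, _ = binned_statistic(x, x, statistic='mean', bins=edges)

print(counts)
print(medians)
```

The resulting x_pos and medians arrays can be passed to plt.plot() or plt.errorbar() in the same way as the bin centers and means in the earlier example.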
Conclusion
Use sns.regplot() for quick profile histograms with regression lines. Use binned_statistic() for custom binning control and detailed visualization options.
