How to plot hexbin histogram in Matplotlib?
A hexbin histogram in Matplotlib displays the distribution of two-dimensional data using hexagonal bins. This visualization is particularly useful for large datasets where traditional scatter plots become cluttered with overlapping points.
Basic Hexbin Plot
The hexbin() method creates a hexagonal binning plot in which the color of each hexagon encodes the number of data points that fall inside it:
import numpy as np
import matplotlib.pyplot as plt
# Generate sample data
x = 2 * np.random.randn(5000)
y = x + np.random.randn(5000)
# Create hexbin plot (x[::10] keeps every 10th point, 500 of the 5000 samples)
fig, ax = plt.subplots(figsize=(8, 6))
hb = ax.hexbin(x[::10], y[::10], gridsize=20, cmap='plasma')
ax.set_title('Hexbin Histogram')
ax.set_xlabel('X values')
ax.set_ylabel('Y values')
# Add colorbar to show density scale
plt.colorbar(hb, ax=ax, label='Count')
plt.show()
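The object returned by hexbin() can also be inspected directly, which helps when you want the numbers behind the colors. The following is a minimal sketch, assuming the default settings (no C argument and no mincnt), in which case get_array() holds one count per hexagon and get_offsets() holds the hexagon centers. The Agg backend line is an assumption so the script runs without a display; remove it for interactive use.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import numpy as np
import matplotlib.pyplot as plt

# Same kind of correlated sample data as in the basic example
rng = np.random.default_rng(0)
x = 2 * rng.standard_normal(5000)
y = x + rng.standard_normal(5000)

fig, ax = plt.subplots()
hb = ax.hexbin(x, y, gridsize=20, cmap='plasma')

counts = hb.get_array()     # number of points in each hexagon
centers = hb.get_offsets()  # (x, y) center of each hexagon

total = int(counts.sum())
busiest = centers[np.argmax(counts)]
print(f"{len(counts)} hexagons, {total} points binned")
print(f"Densest hexagon is centered near ({busiest[0]:.2f}, {busiest[1]:.2f})")
```

Because no extent is given here, the binning region is taken from the data, so every sample should land in some hexagon and the counts should add up to the number of points.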
Customizing Hexbin Parameters
You can customize the appearance by adjusting gridsize and the colormap. The example below varies both across three panels:
import numpy as np
import matplotlib.pyplot as plt
# Generate correlated data
np.random.seed(42)
x = np.random.normal(0, 1, 10000)
y = 0.5 * x + np.random.normal(0, 0.8, 10000)
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
# Different grid sizes
grid_sizes = [10, 20, 30]
cmaps = ['Blues', 'viridis', 'plasma']
for i, (gridsize, cmap) in enumerate(zip(grid_sizes, cmaps)):
    hb = axes[i].hexbin(x, y, gridsize=gridsize, cmap=cmap)
    axes[i].set_title(f'Gridsize: {gridsize}')
    plt.colorbar(hb, ax=axes[i])
plt.tight_layout()
plt.show()
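To see how gridsize changes the resolution in numbers rather than pictures, you can count the hexagons each setting produces. This is a rough sketch: the (30, 10) tuple, which sets the number of hexagons in the x and y directions separately, is an extra case that does not appear in the example above, and the Agg backend line is only an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
x = rng.normal(0, 1, 10000)
y = 0.5 * x + rng.normal(0, 0.8, 10000)

fig, ax = plt.subplots()
hexagon_counts = {}
for gridsize in [10, 20, 30, (30, 10)]:
    hb = ax.hexbin(x, y, gridsize=gridsize)
    hexagon_counts[gridsize] = len(hb.get_array())
    hb.remove()  # keep only the numbers, not the drawing

for gridsize, n_hex in hexagon_counts.items():
    print(f"gridsize={gridsize}: {n_hex} hexagons")
```

A larger gridsize gives more, smaller hexagons. That reveals finer structure but leaves fewer points per hexagon, so the counts become noisier.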
Key Parameters
| Parameter | Description | Example Values |
|---|---|---|
| gridsize | Number of hexagons in x-direction | 10, 20, 50 |
| cmap | Colormap for density visualization | 'viridis', 'plasma', 'Blues' |
| extent | Limits of the binning region | [xmin, xmax, ymin, ymax] |
| mincnt | Minimum count threshold | 1, 5, 10 |
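The extent and mincnt parameters are not used in the examples so far, so here is a small sketch of their effect. It assumes standard normal sample data and a threshold of 5, both chosen only for illustration, and it sets the Agg backend so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
x = rng.normal(0, 1, 5000)
y = rng.normal(0, 1, 5000)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
extent = [-4, 4, -4, 4]  # fix the binning region instead of deriving it from the data

# Left: every hexagon in the region is drawn, including empty ones
hb_all = axes[0].hexbin(x, y, gridsize=20, extent=extent, cmap='viridis')
axes[0].set_title('All hexagons')

# Right: only hexagons holding at least 5 points are drawn
hb_min = axes[1].hexbin(x, y, gridsize=20, extent=extent, cmap='viridis', mincnt=5)
axes[1].set_title('mincnt=5')

n_all = len(hb_all.get_array())
n_min = len(hb_min.get_array())
lowest = hb_min.get_array().min()
print(f"{n_all} hexagons without mincnt, {n_min} with mincnt=5")
print(f"Smallest count still drawn: {lowest:.0f}")
```

Setting mincnt is a simple way to leave sparse regions blank so the background does not take on the lowest color of the colormap.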
Advanced Example with Multiple Datasets
Compare different distributions by plotting them with the same gridsize and extent, so the hexagons cover identical regions in both panels:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(123)
# Create two different datasets
n_points = 8000
x1 = np.random.normal(0, 1, n_points)
y1 = np.random.normal(0, 1, n_points)
x2 = np.random.normal(2, 0.8, n_points)
y2 = 0.7 * x2 + np.random.normal(0, 0.5, n_points)
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
# First dataset
hb1 = axes[0].hexbin(x1, y1, gridsize=25, cmap='Blues', extent=[-4, 6, -4, 6])
axes[0].set_title('Uncorrelated Distribution')
plt.colorbar(hb1, ax=axes[0])
# Second dataset
hb2 = axes[1].hexbin(x2, y2, gridsize=25, cmap='Reds', extent=[-4, 6, -4, 6])
axes[1].set_title('Correlated Distribution')
plt.colorbar(hb2, ax=axes[1])
for ax in axes:
    ax.set_xlabel('X values')
    ax.set_ylabel('Y values')
plt.tight_layout()
plt.show()
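In the example above each panel has its own colorbar, so the same color can stand for different counts in the two plots. If you want the colors to be directly comparable, one option is to give both plots the same color limits and a single colorbar. The sketch below is one way to do that, not the only one: it uses set_clim() on the returned collections, switches both panels to one colormap ('viridis', an arbitrary choice), and sets the Agg backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(123)
n_points = 8000
x1 = rng.normal(0, 1, n_points)
y1 = rng.normal(0, 1, n_points)
x2 = rng.normal(2, 0.8, n_points)
y2 = 0.7 * x2 + rng.normal(0, 0.5, n_points)

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# Identical binning and colormap for both panels
common = dict(gridsize=25, cmap='viridis', extent=[-4, 6, -4, 6])
hb1 = axes[0].hexbin(x1, y1, **common)
hb2 = axes[1].hexbin(x2, y2, **common)

# Use the larger of the two maximum counts as the shared upper color limit
shared_max = max(hb1.get_array().max(), hb2.get_array().max())
for hb in (hb1, hb2):
    hb.set_clim(0, shared_max)

# One colorbar now describes both panels
fig.colorbar(hb2, ax=axes, label='Count')
print(f"Shared color range: 0 to {shared_max:.0f}")
```

To view or save the figure, call plt.show() in an interactive session or fig.savefig() with a file name of your choice.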
Conclusion
Hexbin histograms are ideal for visualizing large 2D datasets where point density matters. Use gridsize to control resolution and cmap to highlight density patterns effectively.
