Plotting histograms against classes in Pandas / Matplotlib

To plot histograms against classes in Pandas/Matplotlib, we can use the DataFrame.hist() method, which draws one histogram per numeric column. When each column holds the values of one class, this shows the distribution of every class side by side, which makes the distributions easy to compare.

Basic Histogram Plotting

Here's how to create histograms for multiple columns in a DataFrame:

import matplotlib.pyplot as plt
import pandas as pd

# Set figure size for better visualization
plt.rcParams["figure.figsize"] = [10, 6]
plt.rcParams["figure.autolayout"] = True

# Create a sample DataFrame with different classes
df = pd.DataFrame({
    'Class_A': [1, 2, 2, 3, 4, 2, 3, 1, 4, 2],
    'Class_B': [2, 3, 1, 4, 2, 3, 1, 4, 2, 3],
    'Class_C': [1, 1, 3, 3, 4, 4, 2, 2, 3, 1],
    'Class_D': [3, 2, 4, 1, 3, 2, 4, 1, 2, 3]
})

# Plot histograms for all columns
df.hist(bins=4, alpha=0.7)
plt.suptitle('Histograms for Different Classes')
plt.show()
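
For a side-by-side comparison it can help to give every panel the same axis ranges. The sketch below reuses the sample DataFrame from above and passes the layout, sharex, and sharey arguments of DataFrame.hist(); the 1x4 layout and the figure size are illustrative choices, not requirements.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Same sample data as in the basic example
df = pd.DataFrame({
    'Class_A': [1, 2, 2, 3, 4, 2, 3, 1, 4, 2],
    'Class_B': [2, 3, 1, 4, 2, 3, 1, 4, 2, 3],
    'Class_C': [1, 1, 3, 3, 4, 4, 2, 2, 3, 1],
    'Class_D': [3, 2, 4, 1, 3, 2, 4, 1, 2, 3]
})

# One row of four panels; shared axes make bar heights directly comparable
axes = df.hist(bins=4, alpha=0.7, layout=(1, 4),
               sharex=True, sharey=True, figsize=(12, 3))
plt.suptitle('Histograms for Different Classes (shared axes)')
```

Call plt.show() afterwards to display the figure. With shared y-axes, a taller bar in one panel always means a higher count than a shorter bar in another panel.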

Customizing Histogram Appearance

You can customize the histogram's appearance with parameters such as bins, figsize, and color:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Create sample data with more variation
np.random.seed(42)
df = pd.DataFrame({
    'Score_Math': np.random.normal(75, 15, 100),
    'Score_Science': np.random.normal(80, 12, 100),
    'Score_English': np.random.normal(70, 18, 100)
})

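# Plot one histogram per subject, each in its own color.
# Note: Axes.hist expects one color per dataset, so passing a list of three
# colors to a single df.hist() call raises a ValueError. Looping over the
# columns lets each subplot receive exactly one color.
colors = ['skyblue', 'lightgreen', 'lightcoral']
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, column, color in zip(axes, df.columns, colors):
    df[column].hist(bins=15, ax=ax, color=color)
    ax.set_title(column)
fig.suptitle('Student Scores Distribution by Subject', fontsize=16)
plt.tight_layout()
plt.show()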

Plotting Histograms by Categorical Classes

When the classes are stored as labels in a categorical column, you can filter the DataFrame by each label and plot one histogram per group:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Create DataFrame with categorical classes
np.random.seed(123)
data = {
    'values': np.concatenate([
        np.random.normal(50, 10, 50),  # Group A
        np.random.normal(70, 15, 50),  # Group B
        np.random.normal(60, 12, 50)   # Group C
    ]),
    'category': ['A']*50 + ['B']*50 + ['C']*50
}
df = pd.DataFrame(data)

# Plot histogram for each category
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

for i, category in enumerate(['A', 'B', 'C']):
    subset = df[df['category'] == category]
    axes[i].hist(subset['values'], bins=10, alpha=0.7, 
                color=['red', 'green', 'blue'][i])
    axes[i].set_title(f'Category {category}')
    axes[i].set_xlabel('Values')
    axes[i].set_ylabel('Frequency')

plt.tight_layout()
plt.show()
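
The same per-category grid can also be produced without an explicit loop by using the column and by arguments of DataFrame.hist(). The following is a minimal sketch on similar synthetic data; the use of np.random.default_rng and the shared axes are choices made for this illustration, not part of the example above.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data: 50 values for each of three categories
rng = np.random.default_rng(123)
df = pd.DataFrame({
    'values': np.concatenate([
        rng.normal(50, 10, 50),  # Group A
        rng.normal(70, 15, 50),  # Group B
        rng.normal(60, 12, 50)   # Group C
    ]),
    'category': ['A'] * 50 + ['B'] * 50 + ['C'] * 50
})

# Let pandas split 'values' by 'category' and draw one panel per group
axes = df.hist(column='values', by='category', bins=10, alpha=0.7,
               layout=(1, 3), figsize=(15, 5), sharex=True, sharey=True)
```

Call plt.show() afterwards to display the figure. Each panel is titled with its category label. The explicit loop remains the more flexible option when every panel needs its own color or axis labels.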

Key Parameters

Parameter    Description                                    Example
bins         Number of bins (or a sequence of bin edges)    bins=10
alpha        Transparency level (0-1)                       alpha=0.7
figsize      Figure dimensions (width, height) in inches    figsize=(10, 6)
color        Fill color; one color per plotted dataset      color='skyblue'
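
The alpha parameter matters most when several histograms are drawn on the same axes, because the bars overlap. As a hedged illustration, the sketch below overlays one histogram per category using groupby; the synthetic data, the choice of 15 bins, and the use of np.histogram_bin_edges to compute common bin edges are assumptions made for this example.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data: 50 values for each of three categories
rng = np.random.default_rng(123)
df = pd.DataFrame({
    'values': np.concatenate([
        rng.normal(50, 10, 50),
        rng.normal(70, 15, 50),
        rng.normal(60, 12, 50)
    ]),
    'category': ['A'] * 50 + ['B'] * 50 + ['C'] * 50
})

# Compute bin edges once from all values so the groups share the same bins
edges = np.histogram_bin_edges(df['values'], bins=15)

fig, ax = plt.subplots(figsize=(10, 6))
for name, group in df.groupby('category'):
    # alpha < 1 keeps overlapping bars from hiding each other
    ax.hist(group['values'], bins=edges, alpha=0.5, label=f'Category {name}')
ax.set_xlabel('Values')
ax.set_ylabel('Frequency')
ax.legend()
```

Call plt.show() afterwards to display the figure. Using the same bin edges for every group keeps the bars aligned, so differences between the classes are not an artifact of different bin widths.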

Conclusion

Use df.hist() to quickly create a histogram for every numeric column of a DataFrame. When the classes are labels in a categorical column, split the data by label first and plot a separate histogram for each group. Adjust parameters such as bins, alpha, and color to make the distributions easier to read and compare.

---
Updated on: 2026-03-25T21:42:55+05:30
