Python Pandas - Draw a boxplot and display the datapoints on top of boxes by plotting Swarm plot with Seaborn

A box plot summarizes a distribution through its quartiles (median, interquartile range, and whiskers), while a swarm plot draws every individual data point, shifting points sideways so they do not overlap. Overlaying the two gives a single figure that shows both the statistical summary and the actual data behind it.

Required Libraries

First, import the necessary libraries for data manipulation and visualization:

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

Creating Sample Data

Let's create sample cricket player data to demonstrate the visualization:

# Create sample cricket data
np.random.seed(42)

roles = ['Batsman'] * 15 + ['Bowler'] * 12 + ['All-rounder'] * 10
# astype(int) truncates the fractional part, giving whole-year ages
ages_batsman = np.random.normal(28, 4, 15).astype(int)
ages_bowler = np.random.normal(26, 3, 12).astype(int)
ages_allrounder = np.random.normal(30, 5, 10).astype(int)

ages = np.concatenate([ages_batsman, ages_bowler, ages_allrounder])

# Create DataFrame
data = pd.DataFrame({
    'Role': roles,
    'Age': ages
})

print(data.head(10))
      Role  Age
0  Batsman   29
1  Batsman   27
2  Batsman   30
3  Batsman   34
4  Batsman   27
5  Batsman   27
6  Batsman   34
7  Batsman   31
8  Batsman   26
9  Batsman   30
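
To see the numbers a box plot encodes, the quartiles can be computed directly with pandas. The sketch below rebuilds the sample data so it runs on its own; the column names Q1, Median, Q3 and IQR are labels chosen here for illustration and do not come from seaborn.

```python
import numpy as np
import pandas as pd

# Recreate the sample data from this section so the block is self-contained
np.random.seed(42)
roles = ['Batsman'] * 15 + ['Bowler'] * 12 + ['All-rounder'] * 10
ages = np.concatenate([
    np.random.normal(28, 4, 15).astype(int),
    np.random.normal(26, 3, 12).astype(int),
    np.random.normal(30, 5, 10).astype(int),
])
data = pd.DataFrame({'Role': roles, 'Age': ages})

# Quartiles per role: Q1 and Q3 bound the box, the median is the line inside it
quartiles = (
    data.groupby('Role')['Age']
        .quantile([0.25, 0.5, 0.75])
        .unstack()
)
quartiles.columns = ['Q1', 'Median', 'Q3']

# Interquartile range: the height of each box
quartiles['IQR'] = quartiles['Q3'] - quartiles['Q1']

print(quartiles)
```

Comparing this table with the plot is a quick way to check that the boxes are being read correctly.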

Creating Box Plot with Swarm Plot Overlay

Plot the box plot first, then overlay the swarm plot with the same x and y parameters:

# Set up the plot
plt.figure(figsize=(10, 6))

# Create box plot
sns.boxplot(x='Role', y='Age', data=data, palette='Set2')

# Overlay swarm plot on top of box plot
sns.swarmplot(x='Role', y='Age', data=data, color='black', alpha=0.7, size=4)

# Customize the plot
plt.title('Age Distribution by Cricket Player Role', fontsize=16, fontweight='bold')
plt.xlabel('Player Role', fontsize=12)
plt.ylabel('Age (years)', fontsize=12)
plt.grid(True, alpha=0.3)

# Display the plot
plt.tight_layout()
plt.show()
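
One detail worth checking in the overlay: the box plot draws its own markers for outliers, and the swarm plot draws every point, so an outlier can appear twice. The following sketch is one way to avoid that, assuming the showfliers option that seaborn forwards to matplotlib. It also draws on an explicit Axes object instead of the implicit current figure; the gray box color is only an illustrative choice.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Recreate the sample data from this section so the block is self-contained
np.random.seed(42)
roles = ['Batsman'] * 15 + ['Bowler'] * 12 + ['All-rounder'] * 10
ages = np.concatenate([
    np.random.normal(28, 4, 15).astype(int),
    np.random.normal(26, 3, 12).astype(int),
    np.random.normal(30, 5, 10).astype(int),
])
data = pd.DataFrame({'Role': roles, 'Age': ages})

fig, ax = plt.subplots(figsize=(10, 6))

# showfliers=False hides the box plot's outlier markers;
# the swarm plot below already draws those points
sns.boxplot(x='Role', y='Age', data=data, color='lightgray',
            showfliers=False, ax=ax)

# Every observation, drawn on the same Axes so it sits on top of the boxes
sns.swarmplot(x='Role', y='Age', data=data, color='black', size=4, ax=ax)

ax.set_title('Age Distribution by Cricket Player Role')
ax.set_xlabel('Player Role')
ax.set_ylabel('Age (years)')
fig.tight_layout()
```

Call plt.show() afterwards to display the figure, or fig.savefig() to write it to a file.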

Enhanced Visualization with Colors

For a more colorful version, apply the same palette to both the boxes and the swarm points so that each role has its own color:

# Create enhanced visualization
plt.figure(figsize=(12, 7))

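# Create box plot with palette; box transparency is set through boxprops,
# because boxplot() does not accept a top-level alpha argument
sns.boxplot(x='Role', y='Age', data=data, palette='viridis', boxprops=dict(alpha=0.7))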

# Overlay swarm plot with matching colors
sns.swarmplot(x='Role', y='Age', data=data, palette='viridis', size=5, alpha=0.8)

# Customize appearance
plt.title('Cricket Player Age Distribution by Role\n(Box Plot with Individual Data Points)', 
          fontsize=16, fontweight='bold', pad=20)
plt.xlabel('Player Role', fontsize=14, fontweight='bold')
plt.ylabel('Age (years)', fontsize=14, fontweight='bold')

# Add grid for better readability
plt.grid(True, alpha=0.3, linestyle='--')

# Improve layout
plt.tight_layout()
plt.show()
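
Recent seaborn releases (0.13 and later) emit a FutureWarning when palette is passed without hue. A possible adjustment, sketched below under that assumption, maps Role to hue as well, keeps the boxes centered with dodge=False, and removes any legend seaborn adds, since it would only repeat the axis labels.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Recreate the sample data from this section so the block is self-contained
np.random.seed(42)
roles = ['Batsman'] * 15 + ['Bowler'] * 12 + ['All-rounder'] * 10
ages = np.concatenate([
    np.random.normal(28, 4, 15).astype(int),
    np.random.normal(26, 3, 12).astype(int),
    np.random.normal(30, 5, 10).astype(int),
])
data = pd.DataFrame({'Role': roles, 'Age': ages})

fig, ax = plt.subplots(figsize=(12, 7))

# Assigning the x variable to hue lets the palette apply per role;
# dodge=False keeps each box centered on its category
sns.boxplot(x='Role', y='Age', hue='Role', data=data, palette='viridis',
            dodge=False, boxprops=dict(alpha=0.7), ax=ax)

# Same mapping for the points so their colors match the boxes
sns.swarmplot(x='Role', y='Age', hue='Role', data=data, palette='viridis',
              dodge=False, size=5, alpha=0.8, ax=ax)

# Some seaborn versions add a legend for hue; it is redundant here
legend = ax.get_legend()
if legend is not None:
    legend.remove()

ax.set_xlabel('Player Role')
ax.set_ylabel('Age (years)')
fig.tight_layout()
```

Because the points share the box colors, a thin marker outline (for example edgecolor='black' with linewidth=0.5 in the swarmplot call) can help them stand out.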

Key Benefits of Combined Visualization

Visualization | Information Provided                  | Best For
Box Plot      | Quartiles, median, outliers           | Statistical summary
Swarm Plot    | Individual data points                | Data distribution pattern
Combined      | Both statistical summary and raw data | Comprehensive analysis

Customization Options

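Both functions forward extra keyword arguments to matplotlib, so the parts of the box (box, whiskers, caps, median line) and the swarm markers can be styled separately: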

# Customized version with different styling
plt.figure(figsize=(10, 6))

# Box plot with custom styling
box_plot = sns.boxplot(x='Role', y='Age', data=data, 
                       palette='pastel', 
                       boxprops=dict(alpha=0.7),
                       whiskerprops=dict(color='gray'),
                       capprops=dict(color='gray'),
                       medianprops=dict(color='red', linewidth=2))

# Swarm plot with custom styling
swarm_plot = sns.swarmplot(x='Role', y='Age', data=data, 
                          color='darkblue', 
                          size=3, 
                          alpha=0.6)

plt.title('Customized Box Plot with Swarm Plot Overlay')
plt.show()
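
If the same pairing is needed for several columns or datasets, the two calls can be wrapped in a small function. The helper below, box_with_swarm, is a hypothetical name introduced here for illustration and is not part of seaborn; its default colors and the tiny demo dataset are likewise made up.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def box_with_swarm(data, x, y, ax=None, box_kws=None, swarm_kws=None):
    """Draw a box plot and overlay a swarm plot on the same Axes.

    Hypothetical helper for illustration; not a seaborn function.
    """
    if ax is None:
        _, ax = plt.subplots(figsize=(10, 6))

    # Illustrative defaults; the caller's dictionaries take precedence
    box_defaults = dict(color='lightblue', showfliers=False)
    swarm_defaults = dict(color='darkblue', size=3, alpha=0.6)
    box_defaults.update(box_kws or {})
    swarm_defaults.update(swarm_kws or {})

    # Boxes first, then points, so the points are drawn on top
    sns.boxplot(x=x, y=y, data=data, ax=ax, **box_defaults)
    sns.swarmplot(x=x, y=y, data=data, ax=ax, **swarm_defaults)
    return ax


# Small made-up dataset, used only to exercise the helper
demo = pd.DataFrame({
    'Role': ['Batsman'] * 4 + ['Bowler'] * 4,
    'Age': [24, 27, 29, 33, 22, 25, 26, 31],
})

ax = box_with_swarm(
    demo, x='Role', y='Age',
    box_kws=dict(medianprops=dict(color='red', linewidth=2)),
)
ax.set_title('Customized Box Plot with Swarm Plot Overlay')
```

Returning the Axes keeps the helper composable: titles, labels, and grids can be added by the caller, and several plots can share one figure by passing in Axes from plt.subplots().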

Conclusion

Combining a box plot with a swarm plot shows the statistical summary and every individual data point in one figure. Call sns.boxplot() first, then sns.swarmplot() with the same x, y, and data arguments, so the points are drawn on top of the matching boxes.

Updated on: 2026-03-26T13:29:28+05:30
