Python Pandas - Group the swarms by two categorical variables with Seaborn

A swarm plot in Seaborn is a categorical scatterplot in which the points are adjusted so that they do not overlap. It is drawn with the seaborn.swarmplot() function. To group the swarms by two categorical variables, pass one variable to x (or to y for a horizontal plot) and the other to hue, with the numeric variable on the remaining axis.

Sample Dataset

We'll create a sample cricket dataset to demonstrate grouping by two categorical variables:

import seaborn as sb
import pandas as pd
import matplotlib.pyplot as plt

# Create sample cricket data
data = {
    'Role': ['Batsman', 'Batsman', 'Bowler', 'Bowler', 'All-rounder', 'All-rounder',
             'Batsman', 'Bowler', 'All-rounder', 'Batsman', 'Bowler', 'All-rounder'],
    'Matches': [45, 32, 28, 41, 38, 29, 52, 35, 44, 39, 31, 47],
    'Academy': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
}

dataFrame = pd.DataFrame(data)
print(dataFrame)

This will produce the following output:

           Role  Matches Academy
0       Batsman       45       A
1       Batsman       32       B
2        Bowler       28       A
3        Bowler       41       B
4   All-rounder       38       A
5   All-rounder       29       B
6       Batsman       52       A
7        Bowler       35       B
8   All-rounder       44       A
9       Batsman       39       B
10       Bowler       31       A
11  All-rounder       47       B
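
Before plotting, it can help to check how many players fall into each Role and Academy combination, since each combination becomes one colored group in the swarm plot. The following is a minimal sketch using pandas groupby(); the variable name summary is only illustrative.

```python
import pandas as pd

# Same sample cricket data as above
data = {
    'Role': ['Batsman', 'Batsman', 'Bowler', 'Bowler', 'All-rounder', 'All-rounder',
             'Batsman', 'Bowler', 'All-rounder', 'Batsman', 'Bowler', 'All-rounder'],
    'Matches': [45, 32, 28, 41, 38, 29, 52, 35, 44, 39, 31, 47],
    'Academy': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
}

dataFrame = pd.DataFrame(data)

# Count the players and average the matches for every Role/Academy pair
summary = dataFrame.groupby(["Role", "Academy"])["Matches"].agg(["count", "mean"])
print(summary)
```

With this sample data every Role and Academy combination holds two players, so each colored group in the plots contains two points.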

Basic Swarm Plot with Two Categorical Variables

Pass one categorical variable to x and the other to hue, with the numeric variable on y:

import seaborn as sb
import pandas as pd
import matplotlib.pyplot as plt

# Create sample data
data = {
    'Role': ['Batsman', 'Batsman', 'Bowler', 'Bowler', 'All-rounder', 'All-rounder',
             'Batsman', 'Bowler', 'All-rounder', 'Batsman', 'Bowler', 'All-rounder'],
    'Matches': [45, 32, 28, 41, 38, 29, 52, 35, 44, 39, 31, 47],
    'Academy': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
}

dataFrame = pd.DataFrame(data)

# Set the theme
sb.set_theme(style="whitegrid")

# Create swarm plot grouped by Role (x-axis) and Academy (hue)
plt.figure(figsize=(8, 6))
sb.swarmplot(x="Role", y="Matches", hue="Academy", data=dataFrame)
plt.title("Cricket Matches by Player Role and Academy")
plt.show()
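
By default, hue only colors the points, so both academies share a single swarm for each role. Assuming you want a separate swarm for each academy, a sketch using the dodge parameter of swarmplot() could look like this:

```python
import seaborn as sb
import pandas as pd
import matplotlib.pyplot as plt

# Same sample data as above
data = {
    'Role': ['Batsman', 'Batsman', 'Bowler', 'Bowler', 'All-rounder', 'All-rounder',
             'Batsman', 'Bowler', 'All-rounder', 'Batsman', 'Bowler', 'All-rounder'],
    'Matches': [45, 32, 28, 41, 38, 29, 52, 35, 44, 39, 31, 47],
    'Academy': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
}

dataFrame = pd.DataFrame(data)

sb.set_theme(style="whitegrid")

# dodge=True places each Academy in its own swarm within every Role
plt.figure(figsize=(8, 6))
ax = sb.swarmplot(x="Role", y="Matches", hue="Academy", data=dataFrame, dodge=True)
ax.set_title("Cricket Matches by Player Role and Academy (dodged)")
plt.show()
```

Dodging tends to be most useful when the groups are large enough that the colors alone are hard to tell apart within one swarm.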

Customizing the Swarm Plot

You can customize the colors, point size, and other properties of the swarm plot:

import seaborn as sb
import pandas as pd
import matplotlib.pyplot as plt

# Create sample data
data = {
    'Role': ['Batsman', 'Batsman', 'Bowler', 'Bowler', 'All-rounder', 'All-rounder',
             'Batsman', 'Bowler', 'All-rounder', 'Batsman', 'Bowler', 'All-rounder'],
    'Matches': [45, 32, 28, 41, 38, 29, 52, 35, 44, 39, 31, 47],
    'Academy': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
}

dataFrame = pd.DataFrame(data)

# Create customized swarm plot
plt.figure(figsize=(10, 6))
sb.swarmplot(x="Role", y="Matches", hue="Academy", data=dataFrame, 
             palette="Set2", size=8, alpha=0.8)

plt.title("Player Performance by Role and Academy")
plt.xlabel("Player Role")
plt.ylabel("Number of Matches")
plt.legend(title="Academy")
plt.show()
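
The order in which the categories appear can be controlled as well. As a rough sketch, the order and hue_order parameters fix the position of the roles and academies, and palette accepts a dictionary that maps each hue level to a color. The specific order and colors chosen here are arbitrary.

```python
import seaborn as sb
import pandas as pd
import matplotlib.pyplot as plt

# Same sample data as above
data = {
    'Role': ['Batsman', 'Batsman', 'Bowler', 'Bowler', 'All-rounder', 'All-rounder',
             'Batsman', 'Bowler', 'All-rounder', 'Batsman', 'Bowler', 'All-rounder'],
    'Matches': [45, 32, 28, 41, 38, 29, 52, 35, 44, 39, 31, 47],
    'Academy': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
}

dataFrame = pd.DataFrame(data)

plt.figure(figsize=(10, 6))
ax = sb.swarmplot(
    x="Role", y="Matches", hue="Academy", data=dataFrame,
    order=["Batsman", "All-rounder", "Bowler"],     # left-to-right order on the x-axis
    hue_order=["A", "B"],                           # order of the hue levels in the legend
    palette={"A": "tab:blue", "B": "tab:orange"},   # one explicit color per Academy
    size=8,
)
ax.set_title("Player Performance by Role and Academy")
plt.show()
```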

Alternative Grouping Approach

You can also swap the two categorical variables to view the data from a different perspective:

import seaborn as sb
import pandas as pd
import matplotlib.pyplot as plt

# Create sample data
data = {
    'Role': ['Batsman', 'Batsman', 'Bowler', 'Bowler', 'All-rounder', 'All-rounder',
             'Batsman', 'Bowler', 'All-rounder', 'Batsman', 'Bowler', 'All-rounder'],
    'Matches': [45, 32, 28, 41, 38, 29, 52, 35, 44, 39, 31, 47],
    'Academy': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
}

dataFrame = pd.DataFrame(data)

# Group by Academy (x-axis) and Role (hue)
plt.figure(figsize=(8, 6))
sb.swarmplot(x="Academy", y="Matches", hue="Role", data=dataFrame)
plt.title("Matches by Academy and Player Role")
plt.show()
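
Another way to separate the second categorical variable is to give each of its levels its own panel. The sketch below assumes the figure-level catplot() function with kind="swarm" and the col parameter; it is an alternative to hue, not something the examples above require.

```python
import seaborn as sb
import pandas as pd
import matplotlib.pyplot as plt

# Same sample data as above
data = {
    'Role': ['Batsman', 'Batsman', 'Bowler', 'Bowler', 'All-rounder', 'All-rounder',
             'Batsman', 'Bowler', 'All-rounder', 'Batsman', 'Bowler', 'All-rounder'],
    'Matches': [45, 32, 28, 41, 38, 29, 52, 35, 44, 39, 31, 47],
    'Academy': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
}

dataFrame = pd.DataFrame(data)

# One panel (column) per Academy, with a swarm per Role inside each panel
g = sb.catplot(x="Role", y="Matches", col="Academy", kind="swarm", data=dataFrame)
g.set_axis_labels("Player Role", "Number of Matches")
plt.show()
```

Because catplot() returns a FacetGrid rather than a single Axes, titles and labels are set through the grid object (g here) instead of plt.title().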

Key Parameters

Parameter   Description                   Example
x           First categorical variable    "Role"
hue         Second categorical variable   "Academy"
palette     Color scheme                  "Set1", "viridis"
size        Point size                    5, 8, 10
Conclusion

Seaborn's swarmplot() effectively groups data by two categorical variables using x and hue parameters. This creates clear visualizations showing distributions across multiple categories without overlapping points.

Updated on: 2026-03-26T13:23:45+05:30
