How to make a grouped boxplot graph in matplotlib?
A grouped boxplot displays the distribution of a continuous variable across the categories of one variable, with each category subdivided by a second categorical variable. The simplest way to draw one on matplotlib axes is with Seaborn, which is built on top of matplotlib and provides a high-level interface for statistical visualizations.
Basic Grouped Boxplot
Here's how to create a grouped boxplot using the tips dataset:
import seaborn as sns
import matplotlib.pyplot as plt
# Set the figure size
plt.rcParams["figure.figsize"] = [7.00, 3.50]
plt.rcParams["figure.autolayout"] = True
# Load the tips dataset (fetched from Seaborn's online data repository on first use)
data = sns.load_dataset('tips')
# Create a grouped boxplot
sns.boxplot(x='day', y='total_bill', hue='sex', data=data)
# Add title and labels
plt.title('Total Bill Distribution by Day and Gender')
plt.xlabel('Day of Week')
plt.ylabel('Total Bill ($)')
plt.show()
Customizing Grouped Boxplots
You can customize the appearance with different colors and styling:
import seaborn as sns
import matplotlib.pyplot as plt
# Set the plot style and figure size (the palette is passed to boxplot below)
sns.set_style("whitegrid")
plt.rcParams["figure.figsize"] = [8.00, 4.50]
# Load dataset
data = sns.load_dataset('tips')
# Create customized grouped boxplot
sns.boxplot(x='day', y='total_bill', hue='sex',
            data=data, palette='Set2', linewidth=1.5)
# Customize the plot
plt.title('Restaurant Tips Analysis', fontsize=14, fontweight='bold')
plt.xlabel('Day of Week', fontsize=12)
plt.ylabel('Total Bill ($)', fontsize=12)
plt.legend(title='Gender', loc='upper right')
plt.tight_layout()
plt.show()
Multiple Grouped Variables
You can create more complex groupings by using different categorical variables:
import seaborn as sns
import matplotlib.pyplot as plt
# Load dataset
data = sns.load_dataset('tips')
# Create subplots for multiple groupings
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# First plot: Day vs Total Bill grouped by Sex
sns.boxplot(x='day', y='total_bill', hue='sex', data=data, ax=ax1)
ax1.set_title('By Gender')
ax1.set_xlabel('Day')
ax1.set_ylabel('Total Bill ($)')
# Second plot: Day vs Total Bill grouped by Time
sns.boxplot(x='day', y='total_bill', hue='time', data=data, ax=ax2)
ax2.set_title('By Meal Time')
ax2.set_xlabel('Day')
ax2.set_ylabel('Total Bill ($)')
plt.tight_layout()
plt.show()
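When a third categorical variable is involved, one option is to let Seaborn build the panels itself with sns.catplot(kind="box"), which splits the figure into columns via its col parameter. The following is a minimal sketch, not the only way to do it. It uses a small synthetic DataFrame (an assumption made so the example runs without downloading the tips dataset), and the column names simply mirror those used above.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic stand-in for the tips dataset (assumption for illustration only)
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "day": rng.choice(["Thur", "Fri", "Sat", "Sun"], size=n),
    "sex": rng.choice(["Male", "Female"], size=n),
    "time": rng.choice(["Lunch", "Dinner"], size=n),
    "total_bill": rng.gamma(shape=4.0, scale=5.0, size=n),
})

# One panel per meal time; within each panel, boxes are grouped by sex
g = sns.catplot(
    data=df, x="day", y="total_bill", hue="sex",
    col="time", col_order=["Lunch", "Dinner"],
    order=["Thur", "Fri", "Sat", "Sun"],
    kind="box", height=4, aspect=1.2,
)
g.set_axis_labels("Day", "Total Bill ($)")
g.set_titles(col_template="{col_name}")

plt.show()
```

Because catplot() creates its own figure, it returns a FacetGrid rather than accepting an ax argument; use the subplot approach above when the boxplots need to sit inside an existing figure.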
Key Parameters
Important parameters for sns.boxplot() include:
- x, y − Variables for x and y axes
- hue − Categorical variable that splits each x category into side-by-side, color-coded boxes
- data − DataFrame containing the data
- palette − Color palette for groups
- order, hue_order − Lists that fix the order of the x categories and of the hue levels
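As a rough illustration of how order, hue_order, and palette work together, the sketch below fixes the category order on the x axis, the order of the hue levels, and the color assigned to each level. The DataFrame is synthetic (an assumption, so the snippet runs offline), and the specific colors are arbitrary choices.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic data with the same column names as the tips examples (assumption)
rng = np.random.default_rng(1)
n = 120
df = pd.DataFrame({
    "day": rng.choice(["Thur", "Fri", "Sat", "Sun"], size=n),
    "time": rng.choice(["Lunch", "Dinner"], size=n),
    "total_bill": rng.gamma(shape=4.0, scale=5.0, size=n),
})

day_order = ["Sun", "Sat", "Fri", "Thur"]   # left-to-right order on the x axis
time_order = ["Dinner", "Lunch"]            # order of boxes within each day

ax = sns.boxplot(
    x="day", y="total_bill", hue="time", data=df,
    order=day_order, hue_order=time_order,
    # A dict palette maps each hue level to an explicit color
    palette={"Dinner": "tab:blue", "Lunch": "tab:orange"},
)
ax.set_title("Explicit category and hue order")

plt.show()
```

Without order and hue_order, Seaborn infers the order from the data, so setting them explicitly keeps several plots of the same data consistent with one another.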
Comparison of Approaches
| Method | Best For | Complexity |
|---|---|---|
| Seaborn boxplot | Quick statistical visualization | Low |
| Matplotlib boxplot | Custom control over appearance | High |
| Pandas groupby + plot | Data preprocessing integration | Medium |
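For comparison, here is one way the plain matplotlib row of the table might look in practice. It is a sketch under stated assumptions: the data is randomly generated, and the group names, colors, and box width are arbitrary. The idea is to call ax.boxplot() once per group and shift each group's boxes left or right of the category center using the positions argument.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data (assumption): 3 categories, 2 groups per category
rng = np.random.default_rng(0)
categories = ["A", "B", "C"]
groups = {
    "Group 1": [rng.normal(10 + i, 2, 50) for i in range(len(categories))],
    "Group 2": [rng.normal(12 + i, 3, 50) for i in range(len(categories))],
}
colors = ["#66c2a5", "#fc8d62"]

fig, ax = plt.subplots(figsize=(7, 3.5))
width = 0.35                               # horizontal space given to each box
centers = np.arange(len(categories))       # one center per category
handles = []

for k, (label, samples) in enumerate(groups.items()):
    # Offset each group symmetrically around the category center
    offset = (k - (len(groups) - 1) / 2) * width
    bp = ax.boxplot(
        samples,
        positions=centers + offset,
        widths=width * 0.9,
        patch_artist=True,     # draw boxes as patches so they can be filled
        manage_ticks=False,    # ticks are set once, after the loop
    )
    for box in bp["boxes"]:
        box.set_facecolor(colors[k])
    handles.append(bp["boxes"][0])         # one legend handle per group

ax.set_xticks(centers)
ax.set_xticklabels(categories)
ax.set_xlabel("Category")
ax.set_ylabel("Value")
ax.legend(handles, list(groups), title="Group")

plt.show()
```

This takes noticeably more code than sns.boxplot(), but every box position, width, and color is under direct control, which is the trade-off the table describes.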
Conclusion
Grouped boxplots make it easy to compare distributions across two categorical variables at once. Use sns.boxplot() with the hue parameter to create the groups, and use palette, order, and hue_order to control the colors and the order in which categories appear.
