Python Pandas - Draw a violin plot and set quartiles as horizontal lines with Seaborn
A violin plot combines a box plot with a kernel density estimate to show the distribution of data. In Seaborn, passing inner="quartile" to violinplot() draws the quartiles as horizontal lines inside each violin.
What is a Violin Plot?
A violin plot shows the estimated probability density of the data at different values. It resembles a box plot, but with a rotated kernel density curve mirrored on each side. The quartile lines mark the 25th percentile, the median, and the 75th percentile, so the interquartile range can be read directly off the distribution.
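To make the quartiles concrete, the following minimal sketch computes the three values per category with pandas. The data is hypothetical (randomly generated ages for two made-up roles), and the column labels Q1, Median, Q3 and IQR are names chosen here for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: ages for two roles (illustrative only)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Role": ["Batsman"] * 50 + ["Bowler"] * 50,
    "Age": np.concatenate([rng.normal(28, 4, 50), rng.normal(26, 3, 50)]),
})

# 25th, 50th and 75th percentiles of Age within each Role
quartiles = df.groupby("Role")["Age"].quantile([0.25, 0.5, 0.75]).unstack()
quartiles.columns = ["Q1", "Median", "Q3"]

# Interquartile range: the spread of the middle half of the data
quartiles["IQR"] = quartiles["Q3"] - quartiles["Q1"]
print(quartiles.round(2))
```

These are the values that the quartile lines inside a violin are meant to represent for each category.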
Basic Violin Plot with Sample Data
First, create the sample data that the violin plot will use:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Create sample data
np.random.seed(42)
data = {
    'Role': ['Batsman'] * 50 + ['Bowler'] * 50,
    'Age': np.concatenate([
        np.random.normal(28, 4, 50),  # Batsman ages
        np.random.normal(26, 3, 50)   # Bowler ages
    ])
}
df = pd.DataFrame(data)
print(df.head())
      Role        Age
0  Batsman  29.986857
1  Batsman  27.446943
2  Batsman  30.590754
3  Batsman  34.092119
4  Batsman  27.063387
Creating Violin Plot with Quartiles
Set inner="quartile" to draw the quartiles as horizontal lines across each violin:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Create sample data
np.random.seed(42)
data = {
    'Role': ['Batsman'] * 50 + ['Bowler'] * 50,
    'Age': np.concatenate([
        np.random.normal(28, 4, 50),  # Batsman ages
        np.random.normal(26, 3, 50)   # Bowler ages
    ])
}
df = pd.DataFrame(data)
# Create violin plot with quartiles
plt.figure(figsize=(8, 6))
sns.violinplot(x='Role', y='Age', data=df, inner="quartile", order=["Batsman", "Bowler"])
plt.title('Age Distribution by Role with Quartiles')
plt.show()
[A violin plot showing age distribution for Batsman and Bowler roles with horizontal quartile lines]
Parameters Explanation
| Parameter | Description | Values |
|---|---|---|
| x, y | Variables for the categorical and continuous axes | Column names |
| data | DataFrame containing the data | pandas DataFrame |
| inner | Representation inside the violin | "quartile", "box", "point", "stick" |
| order | Order of categorical levels | List of category names |
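As a rough illustration of the order parameter, the sketch below reverses the category order and reads the tick labels back from the axes. The data is hypothetical, and the Agg backend is selected only so the snippet also runs without a display; in a notebook that line can be dropped.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical sample data (illustrative only)
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "Role": ["Batsman"] * 40 + ["Bowler"] * 40,
    "Age": np.concatenate([rng.normal(28, 4, 40), rng.normal(26, 3, 40)]),
})

fig, ax = plt.subplots(figsize=(6, 4))

# order controls which category is drawn first along the x-axis
sns.violinplot(x="Role", y="Age", data=df, inner="quartile",
               order=["Bowler", "Batsman"], ax=ax)

# Render once so the tick labels are populated, then read them back
fig.canvas.draw()
labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(labels)
```

Without order, the categories would typically appear in the order they first occur in the data, so "Batsman" would come first here.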
Different Inner Representations
Compare the inner options to see the different ways of displaying data within the violins:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Create sample data
np.random.seed(42)
data = {
    'Role': ['Batsman'] * 30 + ['Bowler'] * 30,
    'Age': np.concatenate([
        np.random.normal(28, 4, 30),
        np.random.normal(26, 3, 30)
    ])
}
df = pd.DataFrame(data)
# Create subplots for different inner styles
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
fig.suptitle('Violin Plots with Different Inner Representations')
inner_styles = ["quartile", "box", "point", "stick"]
for i, style in enumerate(inner_styles):
    ax = axes[i // 2, i % 2]  # Pick the subplot in the 2x2 grid
    sns.violinplot(x='Role', y='Age', data=df, inner=style, ax=ax)
    ax.set_title(f'inner="{style}"')
plt.tight_layout()
plt.show()
[Four violin plots showing different inner representations: quartile lines, box plots, individual points, and stick representations]
Conclusion
Violin plots with inner="quartile" show both the shape of the distribution and its quartiles. Use the order parameter to control the category sequence, and combine it with other Seaborn styling options to refine the plot.
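One possible way to combine the quartile violin with styling options is sketched below. It assumes a Seaborn version that provides set_theme() (0.11 or later); the data, the color and the label text are hypothetical choices made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical sample data (illustrative only)
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "Role": ["Batsman"] * 40 + ["Bowler"] * 40,
    "Age": np.concatenate([rng.normal(28, 4, 40), rng.normal(26, 3, 40)]),
})

# Light grid background makes the quartile lines easier to read against the y-axis
sns.set_theme(style="whitegrid")

fig, ax = plt.subplots(figsize=(6, 4))
sns.violinplot(x="Role", y="Age", data=df, inner="quartile",
               order=["Batsman", "Bowler"], color="lightsteelblue", ax=ax)

# Descriptive axis labels and title
ax.set_xlabel("Player role")
ax.set_ylabel("Age (years)")
ax.set_title("Age distribution by role with quartiles")
fig.tight_layout()
```

Call plt.show() or fig.savefig() afterwards, depending on whether the figure should be displayed or written to a file.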
