Horizontal Stripplot with Jitter using Altair in Python
A horizontal stripplot with jitter is an effective visualization for displaying the distribution of continuous variables across different categories. Altair, a powerful Python library for declarative statistical visualization, makes creating these plots straightforward and customizable.
What are Stripplot and Jitter?
A stripplot draws every data point individually. In the horizontal form, the continuous variable runs along the x-axis and each category gets its own row on the y-axis, so we can see how the values are distributed within each category. However, all points in a category sit on the same row, so points with similar values overlap and become hard to tell apart.
Jitter is a technique that adds a small amount of random noise to the vertical position of each point, spreading them out and reducing overlap for better visibility.
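To make the idea concrete, here is a minimal sketch of jitter that uses only the Python standard library and is independent of Altair. The helper name add_jitter, the noise width of 0.2, and the fixed seed are assumptions chosen for illustration.

```python
import random


def add_jitter(row_positions, width=0.2, seed=0):
    """Shift each position by uniform noise drawn from [-width, width]."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    return [pos + rng.uniform(-width, width) for pos in row_positions]


# Five points that all sit on the same category row (y = 1)
rows = [1, 1, 1, 1, 1]
jittered = add_jitter(rows)
print(jittered)
```

Each point keeps its category (it stays within 0.2 of row 1) but no longer sits exactly on top of its neighbours.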
Prerequisites
Before starting, ensure you have Altair and Pandas installed in your Python environment:
pip install altair pandas
Creating a Horizontal Stripplot with Jitter
Basic Implementation
Start by loading the tips dataset and inspecting its first few rows:
import altair as alt
import pandas as pd
# Load the tips dataset
tips = pd.read_csv("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv")
# Display first few rows to understand the data
print(tips.head())
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
Creating the Stripplot
Now let's create a horizontal stripplot, without jitter for the moment, to visualize total bill amounts by day of the week:
import altair as alt
import pandas as pd
# Load the tips dataset
tips = pd.read_csv("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv")
# Create horizontal stripplot (no jitter yet)
chart = alt.Chart(tips).mark_circle(
    size=60,
    opacity=0.7
).encode(
    x=alt.X('total_bill:Q', title='Total Bill ($)'),
    y=alt.Y('day:O', title='Day of Week'),
    color=alt.Color('sex:N', legend=alt.Legend(title='Gender')),
    tooltip=['total_bill', 'day', 'sex', 'tip']
).properties(
    title='Total Bill Distribution by Day of Week',
    width=500,
    height=250
)
# Display the chart
chart.show()
Adding Jitter
To add a jitter effect that spreads overlapping points vertically, use Altair's yOffset encoding (available in Altair 5 and later) together with a random value calculated for each row:
import altair as alt
import pandas as pd
# Load the tips dataset
tips = pd.read_csv("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv")
# Create horizontal stripplot with jitter
chart = alt.Chart(tips).mark_circle(
    size=50,
    opacity=0.8
).encode(
    x=alt.X('total_bill:Q', title='Total Bill ($)'),
    y=alt.Y('day:O', title='Day of Week'),
    # yOffset has its own channel class, alt.YOffset (not alt.Y)
    yOffset=alt.YOffset('jitter:Q', scale=alt.Scale(range=[-20, 20])),
    color=alt.Color('sex:N', legend=alt.Legend(title='Gender')),
    tooltip=['total_bill', 'day', 'sex', 'tip']
).transform_calculate(
    # random() returns a uniform value between 0 and 1 for each row
    jitter='random()'
).properties(
    title='Total Bill Distribution with Jitter',
    width=500,
    height=250
)
chart.show()
Customization Options
You can customize the marker styling, axis scale, category order, color palette, and font sizes of your stripplot:
import altair as alt
import pandas as pd
# Load the tips dataset
tips = pd.read_csv("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv")
# Customized stripplot
chart = alt.Chart(tips).mark_circle(
    size=80,
    opacity=0.6,
    stroke='white',
    strokeWidth=1
).encode(
    x=alt.X('total_bill:Q',
            title='Total Bill ($)',
            scale=alt.Scale(domain=[0, 60])),
    y=alt.Y('day:O',
            title='Day of Week',
            sort=['Thu', 'Fri', 'Sat', 'Sun']),
    # yOffset has its own channel class, alt.YOffset (not alt.Y)
    yOffset=alt.YOffset('jitter:Q', scale=alt.Scale(range=[-25, 25])),
    color=alt.Color('time:N',
                    legend=alt.Legend(title='Meal Time'),
                    scale=alt.Scale(range=['#1f77b4', '#ff7f0e'])),
    tooltip=['total_bill', 'day', 'time', 'tip', 'size']
).transform_calculate(
    jitter='random()'
).properties(
    title='Restaurant Bills by Day and Meal Time',
    width=600,
    height=300
).configure_axis(
    labelFontSize=11,
    titleFontSize=13
).configure_legend(
    labelFontSize=11,
    titleFontSize=12
)
chart.show()
Key Features
| Feature | Purpose | Altair Parameter |
|---|---|---|
| Jitter | Reduce point overlap | yOffset with random values |
| Color Coding | Show additional categories | color encoding |
| Tooltips | Interactive data exploration | tooltip encoding |
| Opacity | Handle dense regions | opacity in mark |
Conclusion
Horizontal stripplots with jitter in Altair provide an excellent way to visualize the distribution of continuous variables across categories. The jitter technique effectively reduces visual overlap, while Altair's declarative syntax makes customization intuitive and powerful.
