How to Make a Density Plot in Python with Altair?
Altair is a declarative statistical visualization library for Python built on the Vega-Lite grammar. Density plots are useful for visualizing the shape of a distribution, comparing groups, and spotting unusual values in the tails. This article shows how to create density plots with Altair, using the Titanic dataset as a practical example.
What is a Density Plot?
A density plot shows the distribution of a continuous variable by estimating the probability density function. It's similar to a histogram but uses a smooth curve instead of bars.
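To make "estimating the probability density function" concrete, here is a minimal sketch of kernel density estimation in plain Python. It assumes a Gaussian kernel and a hand-picked bandwidth of 5; the `gaussian_kde` helper and the `ages` list are hypothetical and exist only for this illustration. It is not how Altair is implemented internally, only the general idea behind the smooth curve.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x from `samples`."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, centred on that observation.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Hypothetical ages, for illustration only (not taken from the Titanic data).
ages = [22.0, 38.0, 26.0, 35.0, 35.0, 54.0, 2.0, 27.0]
kde = gaussian_kde(ages, bandwidth=5.0)

# Evaluate the curve on a grid; these (x, density) pairs are what gets plotted.
step = 0.5
grid = [i * step for i in range(-80, 321)]  # -40 to 160 in steps of 0.5
curve = [(x, kde(x)) for x in grid]

# Riemann-sum check: the area under a density curve should be close to 1.
area = sum(d for _, d in curve) * step
print(round(area, 3))
```

Each observation contributes one small bell curve, and the density plot is their sum. A larger bandwidth gives a smoother curve; a smaller one follows the individual observations more closely.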
Required Libraries
First, let's import the necessary libraries:
import altair as alt
import pandas as pd
Loading Sample Data
We'll use the Titanic dataset to create a density plot showing age distribution by survival status:
import altair as alt
import pandas as pd
# Load Titanic dataset
titanic_data = pd.read_csv("https://web.stanford.edu/class/archive/cs/cs109/cs109.1166/stuff/titanic.csv")
# Display first few rows
print(titanic_data.head())
   Survived  Pclass  Name                                               Sex     Age   Siblings/Spouses Aboard  Parents/Children Aboard  Fare
0  0         3       Braund, Mr. Owen Harris                            male    22.0  1                        0                        7.2500
1  1         1       Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0  1                        0                        71.2833
2  1         3       Heikkinen, Miss. Laina                             female  26.0  0                        0                        7.9250
3  1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0  1                        0                        53.1000
4  0         3       Allen, Mr. William Henry                           male    35.0  0                        0                        8.0500
Creating a Density Plot
Now let's create a density plot showing age distribution grouped by survival status:
import altair as alt
import pandas as pd
# Load data
titanic_data = pd.read_csv("https://web.stanford.edu/class/archive/cs/cs109/cs109.1166/stuff/titanic.csv")
# Create density plot
chart = alt.Chart(titanic_data).transform_density(
    density='Age',
    as_=['Age', 'density'],
    groupby=['Survived'],
    extent=[0, 100]
).mark_area(
    opacity=0.6
).encode(
    x=alt.X('Age:Q', title='Age'),
    y=alt.Y('density:Q', title='Density'),
    color=alt.Color('Survived:N',
                    scale=alt.Scale(domain=[0, 1], range=['red', 'blue']),
                    title='Survived'),
    tooltip=['Survived:N', 'Age:Q', 'density:Q']
).properties(
    title='Age Distribution by Survival Status - Titanic Dataset',
    width=600,
    height=300
)

chart
Understanding the Code Components
transform_density() Parameters
The transform_density() method computes a kernel density estimate. Its key parameters are:
density: the field whose density is estimated (here 'Age')
as_: names of the two output fields, the sampled value and its estimated density (here ['Age', 'density'])
groupby: fields to group by; one curve is computed per group (here ['Survived'])
extent: the [min, max] range over which the density is evaluated (here [0, 100])
Encoding Properties
The encoding maps data to visual properties:
x-axis: Age values (quantitative)
y-axis: Density values
color: Survival status (red for died, blue for survived)
tooltip: Interactive information on hover
Customizing the Plot
You can customize colors, transparency, and styling:
import altair as alt
import pandas as pd
titanic_data = pd.read_csv("https://web.stanford.edu/class/archive/cs/cs109/cs109.1166/stuff/titanic.csv")
# Custom styled density plot
chart = alt.Chart(titanic_data).transform_density(
    density='Age',
    as_=['Age', 'density'],
    groupby=['Survived']
).mark_area(
    opacity=0.4,
    stroke='white',
    strokeWidth=2
).encode(
    x=alt.X('Age:Q', title='Age (years)', scale=alt.Scale(domain=[0, 80])),
    y=alt.Y('density:Q', title='Probability Density'),
    color=alt.Color('Survived:N',
                    scale=alt.Scale(domain=[0, 1], range=['#ff6b6b', '#4ecdc4']),
                    legend=alt.Legend(title="Survival Status",
                                      labelExpr="datum.value == 1 ? 'Survived' : 'Died'"))
).properties(
    title=alt.TitleParams(text='Titanic Passenger Age Distribution', fontSize=16),
    width=500,
    height=250
)

chart
Key Features
| Feature | Description | Benefit |
|---|---|---|
| Smooth curves | Continuous density estimation | Shows the overall shape without depending on a histogram's bin edges |
| Group comparison | Multiple distributions overlaid | Easy visual comparison between groups |
| Interactive tooltips | Hover for exact values | Detailed data exploration |
Conclusion
Altair makes creating density plots straightforward with transform_density(). These plots effectively visualize continuous data distributions and enable easy comparison between different groups in your dataset.
