How to Make a Density Plot in Python with Altair?


Altair is a declarative statistical visualization library for Python, built on the Vega and Vega-Lite JavaScript libraries. One of the charts it can produce is the density plot, which shows how the values of a variable are distributed. Density plots are useful for visualizing the shape of a distribution, detecting outliers, comparing the distribution of a variable across different groups, and informing model selection. This article demonstrates a density plot with the help of Python code, using a single example that is explained step by step below.

Example 1: Make a density plot using the Altair library in Python.

Code explanation with the necessary steps

Step 1: First of all, import the necessary libraries:

import altair as altr
import pandas as pd

Step 2: Load the data

In this case, the data is loaded into a DataFrame named ‘titanic_data’, which contains information about the passengers of the Titanic.
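A minimal sketch of this step is shown below. In the full example later in this article the data is read from a CSV file with pd.read_csv; here a small inline sample (hypothetical rows, not the real data) stands in for that file so that the snippet runs without a network connection. The check for the ‘Age’ and ‘Survived’ columns is an added illustration, not part of the original steps.

```python
import io
import pandas as pd

# Hypothetical three-row sample standing in for the real CSV file
sample_csv = io.StringIO(
    "Survived,Pclass,Sex,Age\n"
    "0,3,male,22\n"
    "1,1,female,38\n"
    "1,3,female,26\n"
)

# pd.read_csv accepts a URL, a local path, or a file-like object
titanic_data = pd.read_csv(sample_csv)

# Check that the columns needed for the density plot are present
required = {"Age", "Survived"}
missing = required - set(titanic_data.columns)
print("Missing columns:", missing)
print(titanic_data[["Age", "Survived"]].head())
```

To load the real data, the file-like object can be replaced by the URL or path of the CSV file.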

Step 3: Create a chart object

chart = altr.Chart(titanic_data)

Step 4: Transform the data.

The density estimate of a variable is computed using the ‘transform_density’ method.
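To give an idea of what a density estimate is, the sketch below computes a Gaussian kernel density estimate by hand in plain Python. This is only an illustration of the general technique, not the code that Altair or Vega-Lite runs internally. The function name gaussian_kde, the sample ages, and the fixed bandwidth of 5 are all assumptions made for this example; when no bandwidth is passed to ‘transform_density’, one is estimated automatically from the data.

```python
import math

def gaussian_kde(samples, bandwidth, grid):
    """Return the estimated density at each grid point (Gaussian kernel)."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    densities = []
    for x in grid:
        # Every sample contributes a small bell curve centred on its value
        total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        densities.append(norm * total)
    return densities

# Hypothetical ages, used only for illustration
ages = [22, 38, 26, 35, 35, 27, 54, 2, 27, 14]

# One grid point per unit, wide enough to cover the tails of every kernel
grid = [float(a) for a in range(-50, 151)]
density = gaussian_kde(ages, bandwidth=5.0, grid=grid)

# With a grid step of 1, the sum of the values approximates the area under the curve
area = sum(density)
print("Approximate area under the curve:", round(area, 4))
```

A smaller bandwidth gives a bumpier curve and a larger one gives a smoother curve, which is why the choice of bandwidth affects how a density plot looks.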

Step 5: Draw the area. The ‘mark_area’ method is used to render the density as an area plot.

Step 6: Encode the x-axis with the variable you want to plot. The variable is mapped to the x-axis using the ‘encode’ method.

Step 7: Encode the y-axis with the density you want to plot. The density variable is mapped to the y-axis using the ‘encode’ method.

Step 8: Encode the color with the grouping variable. The variable is mapped to the color of the plot using the ‘encode’ method, and the ‘scale’ argument maps the values 0 and 1 to specific colors.

Step 9: Include tooltip information. A tooltip is a pop-up box that appears when users hover over data points on a chart or graph.

Step 10: Set the properties of the chart. The ‘properties’ method is used to set the title and size of the chart.

Step 11: Display the chart. The ‘display’ method is used here to render the chart.

Example Code in Python using Jupyter Notebook

The code given below generates a density plot of the age distribution of Titanic travellers, grouped by their survival status, using Altair in Python.

Example

import altair as altr
import pandas as pd

# Load the Titanic passenger data
titanic_data = pd.read_csv("https://web.stanford.edu/class/archive/cs/cs109/cs109.1166/stuff/titanic.csv")

# Generate the density plot
chart = altr.Chart(titanic_data).transform_density(
    density='Age',             # variable whose density is estimated
    as_=['Age', 'density'],    # names of the output fields
    groupby=['Survived'],      # one density curve per survival status
    extent=[0, 100]            # evaluate the density for ages 0 to 100
).mark_area(opacity=0.5).encode(
    x=altr.X('Age:Q', title='Age'),
    y=altr.Y('density:Q', title='Density'),
    color=altr.Color('Survived:N',
                     scale=altr.Scale(domain=[0, 1], range=['red', 'blue']),
                     title='Survived'),
    tooltip=['Survived:N', 'Age:Q', 'density:Q']
).properties(title='Age Density Plot of Titanic Travellers', width=600, height=300)

# Render the chart in the notebook
chart.display()

The main methods used in this code are described below.

‘transform_density’: The ‘transform_density’ method is used here to compute the density estimate of the ‘Age’ variable, separately for each value of the ‘Survived’ variable.

The ‘transform_density’ method takes several arguments, such as:

  • ‘density’ is the variable for which the density estimate is computed. Here, it is ‘Age’.

  • ‘as_’ is a list of two strings that specify the names of the output variables (‘Age’, ‘density’).

  • ‘groupby’ is a list of the variables by which the data is grouped (here, ‘Survived’).

  • ‘extent’ is a list of two values that gives the range over which the density is evaluated. Here the range is [0, 100]; in other words, the density is computed for ages from 0 to 100 years.

After computing the density of ‘Age’ for every value of ‘Survived’, we can compare the age distributions of the two groups and observe any noticeable difference between them. The resulting density plot shows age on the x-axis and density on the y-axis, with a separate area for each group of travellers.

Viewing The Result - Example

To see the result, run the code in Jupyter Notebook or open the chart in the Vega editor in a browser. You will see the density plot of Titanic travellers. To read the value at a particular location, hover over that location and the value appears in a pop-up box.

Fig.2 Density Plot of Titanic Traveller data using Altair Library in Python

Fig.3 Shows the value of a particular point in the Density Plot of the Titanic Traveller

In this article, we used online data about Titanic travellers to make a density chart with the Altair library. Density plots are one way to visualize the distribution of data in Python.

Updated on: 28-Aug-2023
