What are heatmaps?

A heatmap is a graphical display of numerical data in which color is used to denote values. In a data mining context, heatmaps are especially useful for two purposes: visualizing correlation tables and visualizing missing values in the data. In both cases, the information is conveyed in a two-dimensional table.

More generally, a heatmap is a graphical representation of data that uses a system of color-coding to represent different values. Heatmaps are used in various forms of analytics but are most commonly used to show user behavior on specific web pages or webpage templates. They can show where users have clicked on a page, how far they have scrolled down a page, or the results of eye-tracking tests.
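As a rough illustration of the click case, the sketch below counts clicks per cell of a grid laid over a page. The function name `click_grid`, the page size, the grid size, and the click coordinates are all hypothetical, and the code assumes every click falls within the page bounds. A real tool would then color each cell by its count.

```python
def click_grid(clicks, page_width, page_height, cols, rows):
    """Count clicks per cell of a cols x rows grid laid over the page."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in clicks:
        # Scale pixel coordinates to a cell index; clamp clicks on the far edge.
        c = min(int(x / page_width * cols), cols - 1)
        r = min(int(y / page_height * rows), rows - 1)
        grid[r][c] += 1
    return grid

# Hypothetical click positions (x, y) in pixels on a 1000 x 800 page.
clicks = [(50, 40), (60, 45), (950, 790), (500, 400), (55, 30)]
grid = click_grid(clicks, 1000, 800, cols=4, rows=2)
for row in grid:
    print(row)
```

Cells with higher counts would be drawn in "hotter" colors, which is what makes the busy areas of a page stand out at a glance.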

A correlation table for p variables has p rows and p columns. A data table has p columns (variables) and n rows (observations). If the number of rows is large, a subset can be used. In both cases, it is simpler and faster to scan the color-coding than the values themselves.
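To make the p-by-p structure concrete, here is a minimal sketch that builds a correlation table from a few columns and maps each value to a coarse text "shade" in place of a color. The helper names (`pearson`, `correlation_table`, `shade`), the five-level scale, and the toy data are assumptions made for illustration, not part of any particular tool.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_table(columns):
    """Build the p x p correlation table from a dict of name -> values."""
    names = list(columns)
    table = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    return names, table

def shade(r):
    """Map a correlation in [-1, 1] to a sign plus a strength symbol."""
    levels = " .:*#"  # weak -> strong
    idx = min(int(abs(r) * len(levels)), len(levels) - 1)
    return ("+" if r >= 0 else "-") + levels[idx]

# Hypothetical toy data: p = 3 variables, n = 5 observations.
data = {
    "income": [10, 20, 30, 40, 50],
    "spend": [12, 18, 33, 41, 48],
    "debt": [9, 7, 6, 3, 1],
}
names, table = correlation_table(data)
for name, row in zip(names, table):
    print(f"{name:>6} " + " ".join(shade(r) for r in row))
```

In a graphical heatmap, the sign and strength symbols would typically be replaced by a diverging color scale, for example one hue for positive and another for negative correlations.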

Heatmaps are helpful when examining a large number of values, but they are not a replacement for more precise graphical displays, such as bar charts, because color differences cannot be perceived accurately.

In a missing value heatmap, rows correspond to records and columns to variables. It uses a binary coding of the original dataset, where 1 indicates a missing value and 0 otherwise. This new binary table is then colored so that only the missing-value cells (those with value 1) stand out.
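The binary coding described above can be sketched as follows. This example assumes missing values are represented as `None`; the helper names `missing_indicator` and `render` and the sample records are hypothetical. Missing cells are drawn as `#` and present cells as `.` in place of colors.

```python
def missing_indicator(rows):
    """Return a binary table: 1 where a cell is missing (None), else 0."""
    return [[1 if cell is None else 0 for cell in row] for row in rows]

def render(binary):
    """Draw missing cells as '#' and present cells as '.'."""
    return ["".join("#" if v else "." for v in row) for row in binary]

# Hypothetical records: each row is an observation, each column a variable.
records = [
    [1.2, None, 3.0, None],
    [0.7, 5.1, None, None],
    [2.4, 4.4, 1.9, None],
]
binary = missing_indicator(records)
for line in render(binary):
    print(line)
```

Even in this tiny table, the last column shows up as a solid vertical stripe, which is the kind of pattern a missing value heatmap is meant to expose.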

As an example, consider a dataset containing economic, social, political, and “well-being” information on multiple countries around the globe (each row is a country). The variables were merged from multiple sources, and for each source, information was not always available for every country.

The missing data heatmap helps visualize the level and amount of “missingness” in the combined data file. Some patterns of “missingness” emerge easily: variables that are missing for virtually all observations, and clusters of rows (countries) that are missing many values.

Variables with little missingness are also easy to spot. This information can be used for deciding how to handle the missingness (e.g., dropping some variables, dropping some records, imputing, etc.).
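The same binary table can also be summarized numerically to support these decisions. The sketch below computes the fraction of missing values per variable and per record and flags variables for dropping. The 0.8 threshold, the variable names, and the function names are illustrative assumptions; a suitable cutoff depends on the data and the analysis.

```python
def missing_fraction_by_column(binary):
    """Fraction of missing cells (value 1) in each column."""
    n = len(binary)
    return [sum(col) / n for col in zip(*binary)]

def missing_fraction_by_row(binary):
    """Fraction of missing cells (value 1) in each row."""
    return [sum(row) / len(row) for row in binary]

def columns_to_drop(binary, names, threshold=0.8):
    """Names of variables whose missing fraction reaches the threshold."""
    fractions = missing_fraction_by_column(binary)
    return [name for name, f in zip(names, fractions) if f >= threshold]

# Hypothetical binary table (1 = missing) for 3 countries and 4 variables.
names = ["gdp", "literacy", "happiness", "turnout"]
binary = [
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
print(missing_fraction_by_column(binary))
print(missing_fraction_by_row(binary))
print(columns_to_drop(binary, names))
```

Variables below the threshold but with some gaps are candidates for imputation instead, and records with a high missing fraction can be reviewed in the same way using the per-row figures.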

Analytics tools such as Google Analytics or Site Catalyst are great at providing metrics that show which pages users visit, but they can lack detail when it comes to understanding how users interact with those pages. Heatmaps can give a more comprehensive overview of how users are behaving.

Heatmaps are more visual than standard analytics reports, which can make them easier to analyze at a glance. This makes them more accessible, particularly to people who are not accustomed to analyzing large amounts of data.

Good heat mapping tools, such as CrazyEgg, enable analysts to segment and filter the data. This makes it simple to see how different types of users are engaging with a specific page.