How to Color a Scatter Plot by a Variable in Matplotlib

Matplotlib allows you to color scatterplot points by a variable using several parameters in the scatter() function. The main parameters are c (for color mapping), cmap (colormap), and alpha (transparency).

Matplotlib is a plotting library that works directly with NumPy arrays. Its pyplot module provides a simple interface for creating customized scatter plots with variable-based coloring.

Using a Colormap

A colormap maps continuous numerical values to a range of colors. Use the cmap parameter to specify the colormap and c to provide the values that determine each point's color.

Syntax

plt.scatter(x, y, c=values, cmap='colormap_name')

Where:

  • x, y: Arrays of coordinates to plot

  • c: Array of values to map to colors

  • cmap: Name of the colormap (e.g., 'viridis', 'plasma', 'cool')

Example

Here's how to create a scatter plot where colors vary based on a third variable:

import matplotlib.pyplot as plt
import numpy as np

x = np.array([20, 30, 40, 70, 50])
y = np.array([30, 20, 34, 56, 88])
colors = np.array([1, 2, 3, 4, 5])

plt.scatter(x, y, c=colors, cmap='viridis', s=100)
plt.colorbar(label='Color Scale')
plt.xlabel('X values')
plt.ylabel('Y values')
plt.title('Scatter Plot with Colormap')
plt.show()
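To see what happens to the values passed to c, the mapping can be reproduced by hand. The sketch below assumes the default behavior, where values are scaled linearly between their minimum and maximum before being looked up in the colormap. It uses Normalize from matplotlib.colors and plt.get_cmap, and it computes colors only, without drawing anything.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

values = np.array([1, 2, 3, 4, 5])

# Scale the values linearly to the range [0, 1]
norm = Normalize(vmin=values.min(), vmax=values.max())
scaled = norm(values)

# Look up one RGBA color per scaled value in the colormap
cmap = plt.get_cmap('viridis')
rgba = cmap(scaled)

print(scaled)      # position of each value along the colormap
print(rgba.shape)  # one row of (red, green, blue, alpha) per point
```

The smallest value lands at the start of the colormap and the largest at its end, which is why the colorbar in the example above runs from 1 to 5.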

Using Discrete Colors

You can assign specific colors to different categories by passing a list with one color name (or color code) per point to the c parameter.

Example

This example gives each point the color that stands for its category:

import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6])
y = np.array([2, 4, 1, 5, 3, 6])
# One color per point; each color stands for one category
categories = ['red', 'blue', 'green', 'red', 'blue', 'green']

plt.scatter(x, y, c=categories, s=100)
plt.xlabel('X values')
plt.ylabel('Y values')
plt.title('Scatter Plot with Discrete Colors')
plt.show()
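When the data holds category labels rather than color names, one possible approach is to map each label to a color and plot one group at a time, which also makes a legend easy to add. In this sketch the labels 'A', 'B', 'C' and the palette dictionary are hypothetical and only serve as an illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([1, 2, 3, 4, 5, 6])
y = np.array([2, 4, 1, 5, 3, 6])

# Hypothetical category labels, one per point
labels = np.array(['A', 'B', 'C', 'A', 'B', 'C'])

# Hypothetical mapping from category label to color
palette = {'A': 'red', 'B': 'blue', 'C': 'green'}

fig, ax = plt.subplots()
for label, color in palette.items():
    mask = labels == label  # select the points in this category
    ax.scatter(x[mask], y[mask], color=color, s=100, label=label)

ax.legend(title='Category')
ax.set_xlabel('X values')
ax.set_ylabel('Y values')
ax.set_title('Scatter Plot with Category Legend')
```

Call plt.show() afterwards to display the figure. Plotting each group separately produces one legend entry per category, which a single scatter() call with a list of colors does not.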

Using Alpha (Transparency)

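The alpha parameter controls point transparency, with values ranging from 0 (fully transparent) to 1 (fully opaque). Passing an array with one alpha value per point, as in the example below, requires Matplotlib 3.4 or later; it lets you vary transparency based on a variable to add another dimension to your visualization.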

Example

This example demonstrates variable transparency based on data values:

import matplotlib.pyplot as plt
import numpy as np

x = np.random.rand(50)
y = np.random.rand(50)
transparency = np.random.rand(50)

plt.scatter(x, y, c='blue', alpha=transparency, s=100)
plt.xlabel('X values')
plt.ylabel('Y values')
plt.title('Scatter Plot with Variable Transparency')
plt.show()
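If passing an array to alpha is not an option (array-valued alpha is assumed here to need Matplotlib 3.4 or later), a similar effect can be sketched by building an RGBA color array and writing the per-point transparency into its fourth column. The name weights is hypothetical, and a seeded generator is used so the result is reproducible.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

rng = np.random.default_rng(0)
x = rng.random(50)
y = rng.random(50)
weights = rng.random(50)  # hypothetical variable driving transparency

# Start from the same base color for every point: shape (50, 4)
rgba = np.tile(to_rgba('blue'), (50, 1))

# Overwrite the alpha channel with the per-point values
rgba[:, 3] = weights

fig, ax = plt.subplots()
sc = ax.scatter(x, y, c=rgba, s=100)
ax.set_xlabel('X values')
ax.set_ylabel('Y values')
ax.set_title('Per-Point Transparency via RGBA Colors')
```

Call plt.show() afterwards to display the figure. Because c receives explicit RGBA rows here, no colormap is applied and the alpha parameter is left unset.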

Combining Multiple Parameters

You can combine color mapping, size variation, and transparency for rich visualizations:

import matplotlib.pyplot as plt
import numpy as np

# Generate sample data
n = 100
x = np.random.randn(n)
y = np.random.randn(n)
colors = np.random.rand(n)
sizes = 1000 * np.random.rand(n)
alpha_values = np.random.rand(n)

plt.scatter(x, y, c=colors, s=sizes, alpha=alpha_values, cmap='plasma')
plt.colorbar(label='Color Values')
plt.xlabel('X values')
plt.ylabel('Y values')
plt.title('Complex Scatter Plot')
plt.show()
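The same kind of plot can also be written with Matplotlib's object-oriented interface, which makes explicit that scatter() returns an object the colorbar is built from. This is a minimal sketch rather than a replacement for the example above: it uses a seeded generator and a single scalar alpha (0.6, an arbitrary choice) to keep the code short.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
n = 100
x = rng.standard_normal(n)
y = rng.standard_normal(n)
colors = rng.random(n)        # values mapped through the colormap
sizes = 1000 * rng.random(n)  # marker areas in points squared

fig, ax = plt.subplots()

# scatter() returns a PathCollection that stores the color values
sc = ax.scatter(x, y, c=colors, s=sizes, alpha=0.6, cmap='plasma')

# Passing the collection tells the colorbar which mapping to display
cbar = fig.colorbar(sc, ax=ax, label='Color Values')

ax.set_xlabel('X values')
ax.set_ylabel('Y values')
ax.set_title('Complex Scatter Plot')
```

Call plt.show() afterwards to display the figure. Keeping a reference to sc is useful when a figure has several subplots and the colorbar must be tied to one particular scatter plot.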

Comparison of Methods

Parameter | Use Case | Data Type
cmap + c | Continuous variables | Numerical arrays
c (discrete) | Categorical data | Color names or codes
alpha | Transparency variation | Values from 0 to 1

Conclusion

Use cmap with numerical data for continuous color mapping, discrete colors for categorical variables, and alpha for transparency effects. Combining these parameters creates rich, informative scatter plots that effectively communicate multiple data dimensions.

Updated on: 2026-03-27T11:38:53+05:30
