How to Color a Scatterplot by a Variable in Matplotlib?
Matplotlib allows you to color scatterplot points by a variable using several parameters in the scatter() function. The main parameters are c (for color mapping), cmap (colormap), and alpha (transparency).
Matplotlib is a Python plotting library built around NumPy arrays. Its pyplot module provides a simple interface for creating scatter plots with variable-based coloring.
Using a Colormap
A colormap maps continuous numerical values to a range of colors. Use the cmap parameter to specify the colormap and c to provide the values that determine each point's color.
Syntax
plt.scatter(x, y, c=values, cmap='colormap_name')
Where:
x, y: Arrays of coordinates to plot
c: Array of values to map to colors
cmap: Name of the colormap (e.g., 'viridis', 'plasma', 'cool')
Example
Here's how to create a scatter plot where colors vary based on a third variable:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([20, 30, 40, 70, 50])
y = np.array([30, 20, 34, 56, 88])
colors = np.array([1, 2, 3, 4, 5])
plt.scatter(x, y, c=colors, cmap='viridis', s=100)
plt.colorbar(label='Color Scale')
plt.xlabel('X values')
plt.ylabel('Y values')
plt.title('Scatter Plot with Colormap')
plt.show()
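By default the colormap spans the minimum and maximum of the values passed to c. As a minimal sketch, assuming you want a fixed scale (for example, to compare several plots), the vmin and vmax parameters of scatter() pin the range that is mapped onto the colormap. The range 0 to 10 below is an arbitrary choice for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array([20, 30, 40, 70, 50])
y = np.array([30, 20, 34, 56, 88])
values = np.array([1, 2, 3, 4, 5])

fig, ax = plt.subplots()
# vmin/vmax fix the data range mapped onto the colormap
# (0 and 10 are assumed limits, not derived from the data)
points = ax.scatter(x, y, c=values, cmap='viridis', vmin=0, vmax=10, s=100)
fig.colorbar(points, ax=ax, label='Color Scale')
ax.set_xlabel('X values')
ax.set_ylabel('Y values')
ax.set_title('Scatter Plot with Fixed Color Range')
# Call plt.show() to display the figure in an interactive session.
```

With these limits the five points use only the lower half of the colormap, because the data stops at 5 while the scale runs to 10.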
Using Discrete Colors
You can assign specific colors to different categories by passing a list of color names or color codes to the c parameter, with one entry per point.
Example
This example shows how to color points based on categorical data:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([1, 2, 3, 4, 5, 6])
y = np.array([2, 4, 1, 5, 3, 6])
categories = ['red', 'blue', 'green', 'red', 'blue', 'green']
plt.scatter(x, y, c=categories, s=100)
plt.xlabel('X values')
plt.ylabel('Y values')
plt.title('Scatter Plot with Discrete Colors')
plt.show()
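In practice the data usually holds category labels rather than color names. One possible approach, sketched here with hypothetical labels ('cat', 'dog', 'bird') and an assumed label-to-color dictionary, is to draw one scatter() call per category so that each group also gets a legend entry.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6])
y = np.array([2, 4, 1, 5, 3, 6])
# Hypothetical category labels, one per point
labels = np.array(['cat', 'dog', 'bird', 'cat', 'dog', 'bird'])

# Assumed mapping from each label to a color name
color_for = {'cat': 'red', 'dog': 'blue', 'bird': 'green'}

fig, ax = plt.subplots()
for label, color in color_for.items():
    mask = labels == label  # boolean mask selecting this category's points
    ax.scatter(x[mask], y[mask], c=color, s=100, label=label)

ax.legend(title='Category')
ax.set_xlabel('X values')
ax.set_ylabel('Y values')
ax.set_title('Scatter Plot Colored by Category')
# Call plt.show() to display the figure in an interactive session.
```

Plotting each category separately costs one call per group, but it keeps the legend in sync with the colors without any extra bookkeeping.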
Using Alpha (Transparency)
The alpha parameter controls point transparency, with values ranging from 0 (fully transparent) to 1 (fully opaque). Passing an array of per-point alpha values (supported in Matplotlib 3.4 and later) lets transparency encode another variable in your visualization.
Example
This example demonstrates variable transparency based on data values:
import matplotlib.pyplot as plt
import numpy as np
x = np.random.rand(50)
y = np.random.rand(50)
transparency = np.random.rand(50)
plt.scatter(x, y, c='blue', alpha=transparency, s=100)
plt.xlabel('X values')
plt.ylabel('Y values')
plt.title('Scatter Plot with Variable Transparency')
plt.show()
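Passing an array to alpha relies on Matplotlib 3.4 or later. As a hedged alternative that should also work on older versions, per-point transparency can be written into the fourth (alpha) channel of an RGBA color array and passed through c. The seed and the variable name rgba are illustrative choices.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import to_rgba

rng = np.random.default_rng(0)  # fixed seed so the figure is reproducible
x = rng.random(50)
y = rng.random(50)
transparency = rng.random(50)

# Repeat the RGBA tuple for 'blue' once per point, giving shape (50, 4)
rgba = np.tile(to_rgba('blue'), (50, 1))
# Overwrite the alpha channel with the per-point transparency values
rgba[:, 3] = transparency

fig, ax = plt.subplots()
points = ax.scatter(x, y, c=rgba, s=100)
ax.set_xlabel('X values')
ax.set_ylabel('Y values')
ax.set_title('Scatter Plot with Per-Point RGBA Transparency')
# Call plt.show() to display the figure in an interactive session.
```

Because the transparency lives inside the colors themselves, the alpha argument is left unset here.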
Combining Multiple Parameters
You can combine color mapping, size variation, and transparency in a single plot:
import matplotlib.pyplot as plt
import numpy as np
# Generate sample data
n = 100
x = np.random.randn(n)
y = np.random.randn(n)
colors = np.random.rand(n)
sizes = 1000 * np.random.rand(n)
alpha_values = np.random.rand(n)
plt.scatter(x, y, c=colors, s=sizes, alpha=alpha_values, cmap='plasma')
plt.colorbar(label='Color Values')
plt.xlabel('X values')
plt.ylabel('Y values')
plt.title('Complex Scatter Plot')
plt.show()
Comparison of Methods
| Parameter | Use Case | Data Type |
|---|---|---|
| cmap + c | Continuous variables | Numerical arrays |
| c (discrete) | Categorical data | Color names/codes |
| alpha | Transparency variation | Values from 0 to 1 |
Conclusion
Use cmap with numerical data for continuous color mapping, discrete colors for categorical variables, and alpha for transparency effects. Combining these parameters lets a single scatter plot communicate several data dimensions at once.
