Matplotlib - Scales



What are Scales in Matplotlib?

In Matplotlib, a scale is the mapping from data values to positions along an axis of a plot. It determines how data values are represented and spaced along the axes. Matplotlib supports several types of scales, and the choice of scale can significantly affect how the data is perceived in a visualization.

The following are the common types of scales available in Matplotlib.

  • Linear Scale − Suitable for most numerical data without large variations in magnitude.
  • Logarithmic Scale − Ideal for positive-valued datasets covering several orders of magnitude or exhibiting exponential growth.
  • Symmetrical Logarithmic Scale − Suitable for datasets that span several orders of magnitude and contain both positive and negative values.
  • Logit Scale − Suitable for data confined between 0 and 1, such as probabilities.

Let us go through these one by one.
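Before looking at each scale, it can help to see which scale names the installed Matplotlib version accepts. The snippet below is a minimal sketch using matplotlib.scale.get_scale_names(); the exact list depends on the Matplotlib version, so no particular output is assumed here.

```python
import matplotlib.scale as mscale

# get_scale_names() returns the names that plt.xscale() and plt.yscale() accept
scale_names = mscale.get_scale_names()
print(scale_names)
```

The four scales covered on this page ('linear', 'log', 'symlog' and 'logit') should appear in the printed list.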

Linear Scale

The linear scale is the default scale used to represent data along the axes of a plot. It is a straightforward mapping in which each data value is placed at a position directly proportional to its numerical value, so equal distances along the axis represent equal differences in the data.

Characteristics of Linear Scale

  • Equal Intervals − In a linear scale equal distances on the axis correspond to equal differences in data values.
  • Linear Mapping − The relationship between data values and their position on the axis is linear.

Using Linear Scale

By default, Matplotlib uses a linear scale for both the x-axis and the y-axis, so no function call is needed to get one. To set it explicitly, for example to switch an axis back from another scale, use plt.xscale('linear') for the x-axis or plt.yscale('linear') for the y-axis.

The following example plots data on the default linear scale.

Example

import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
plt.plot(x, y)
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Linear Scale')
plt.show()
Output

Following is the output of the above program −

[Figure: Linear Scale]
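The explicit calls described in this section can also be made on an Axes object. The following is a minimal sketch, assuming the object-oriented interface (plt.subplots, Axes.set_xscale, Axes.get_xscale) instead of the plt.xscale and plt.yscale calls used elsewhere on this page. The variable names are chosen only for illustration.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])

# Explicitly request the (default) linear scale on both axes
ax.set_xscale('linear')
ax.set_yscale('linear')

# get_xscale() and get_yscale() report the name of the active scale
x_scale = ax.get_xscale()
y_scale = ax.get_yscale()
print(x_scale, y_scale)

plt.show()
```

Querying the scale name this way is a quick check that an axis is using the scale we expect.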

When to Use Linear Scale

  • Linear scales are commonly used when the data doesn't have exponential growth or when the range of values isn't too large.
  • It's suitable for representing most numerical data that doesn't exhibit significant nonlinear behavior.

Logarithmic Scale

The logarithmic scale positions each data value according to its logarithm. This is useful when the data covers a wide range of values, because it spreads out the small values that a linear axis would squeeze together near zero.

Characteristics of Logarithmic Scale

The logarithmic scale has the following characteristics −

  • Equal Ratios − In a logarithmic scale, equal distances on the axis represent equal ratios between values rather than equal differences.
  • Compression of Data − It compresses a wide range of data into a more readable and interpretable visualization.
  • Emphasizes Smaller Values − It emphasizes changes in smaller values more than larger ones.
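The "equal ratios" property can be checked with a few numbers. The sketch below assumes a base-10 logarithm and a hand-picked list of powers of ten. It compares the gaps between neighbouring values as a linear axis would see them with the gaps a logarithmic axis would see.

```python
import numpy as np

values = np.array([1, 10, 100, 1000, 10000])

# On a linear axis, position is proportional to the value itself,
# so the gaps between neighbours grow quickly
linear_gaps = np.diff(values)

# On a base-10 log axis, position is proportional to log10(value),
# so each tenfold step covers the same distance
log_gaps = np.diff(np.log10(values))

print(linear_gaps)
print(log_gaps)
```

Each tenfold increase produces the same gap on the logarithmic axis, while the linear gaps grow from 9 to 9000.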

Using Logarithmic Scale

To use a logarithmic scale, call plt.xscale('log') for the x-axis or plt.yscale('log') for the y-axis. Only positive values can be placed on a logarithmic axis, since the logarithm of zero or a negative number is undefined. Logarithmic scales are particularly useful for visualizing exponential growth or phenomena that cover several orders of magnitude.

When to Use Logarithmic Scale

Logarithmic scales are suitable for data with large variations in magnitude or when there's a need to highlight changes in smaller values. They are commonly used in fields like finance (stock prices), scientific research (decibel levels, earthquake magnitudes) and chemistry (pH levels).

Example

The following example plots y = log(x) with a logarithmic scale on the x-axis −

import matplotlib.pyplot as plt
import numpy as np

# Generating linearly spaced x values and their natural logarithm
x = np.linspace(1, 10, 100)
y = np.log(x)

# Creating a plot with a logarithmic scale for the x-axis
plt.plot(x, y)
plt.xscale('log')  # Set logarithmic scale for the x-axis
plt.xlabel('X-axis (log scale)')
plt.ylabel('Y-axis')
plt.title('Logarithmic Scale')
plt.show()

Output

Following is the output of the above program −

[Figure: Logarithmic Scale]

Using a logarithmic scale in a plot can provide insights into data with a wide range of values making it easier to visualize patterns and trends across different scales within the same plot.
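A logarithmic y-axis is often used for exponential growth, where it turns the curve into a straight line. The following is a minimal sketch with made-up data that doubles at every step. It assumes the object-oriented interface and the base keyword of set_yscale, which is available in Matplotlib 3.3 and later.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 20)
y = 2.0 ** x  # illustrative data: doubles at every step

fig, (ax_lin, ax_log) = plt.subplots(1, 2, figsize=(8, 3))

# Left panel: default linear y-axis, early values are hard to read
ax_lin.plot(x, y)
ax_lin.set_title('linear y-axis')

# Right panel: base-2 logarithmic y-axis, the same data forms a straight line
ax_log.plot(x, y)
ax_log.set_yscale('log', base=2)
ax_log.set_title('log y-axis (base 2)')

fig.tight_layout()
plt.show()
```

Base 2 is used here only because the data doubles; the default base of 10 shows the same straight line with different tick labels.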

Logarithmic plot of a cumulative distribution function

The example below plots a cumulative distribution function with logarithmic scales on both axes.

Example

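import numpy as np
import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = [7.50, 3.50]
plt.rcParams["figure.autolayout"] = True

N = 100
# Log axes can only show positive values, so use the absolute values
# of the normally distributed samples
data = np.abs(np.random.randn(N))
X2 = np.sort(data)
# Empirical CDF running from 1/N to 1 (avoids a zero on the log y-axis)
F2 = np.arange(1, N + 1) / N

plt.plot(X2, F2)
plt.xscale('log')
plt.yscale('log')
plt.show()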

Output

Following is the output of the above program −

[Figure: logarithmic plot of a cumulative distribution function]
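Because a logarithmic axis cannot place zero or negative values, it can be useful to check how much of a dataset would be hidden before switching scales. The following is a minimal sketch, assuming standard normal samples and a seeded random generator (the seed is arbitrary and only makes the run repeatable).

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
data = np.sort(rng.standard_normal(100))

# Only strictly positive values can be placed on a log axis
positive = data[data > 0]

# Empirical CDF of the positive values, starting at 1/n so that
# no point sits at 0 on a log y-axis
cdf = np.arange(1, len(positive) + 1) / len(positive)

hidden = len(data) - len(positive)
print(hidden, "of", len(data), "values cannot be shown on a log x-axis")
```

If a large share of the values is non-positive, a symmetrical logarithmic scale is usually a better fit than a plain logarithmic one.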

Symmetrical Logarithmic Scale

The symmetrical logarithmic scale, often abbreviated as symlog, is similar to the logarithmic scale but is symmetric about zero: it applies logarithmic spacing to both positive and negative values and remains well defined at zero.

To apply the symmetrical logarithmic scale to the x-axis or the y-axis, use plt.xscale('symlog') or plt.yscale('symlog') respectively.

Characteristics of Symmetrical Logarithmic Scale

The symmetrical logarithmic scale has the following characteristics.

  • Symmetrical Behaviour − Represents both positive and negative values logarithmically while handling zero.
  • Linear Near Zero − The scale is linear around zero within a specified range (linthresh) before transitioning to logarithmic behaviour.

Parameters for Symmetrical Logarithmic Scale

linthresh − Linear threshold that determines the range around zero where the scale behaves linearly before transitioning to a logarithmic scale.
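The role of linthresh can be illustrated with a simplified piecewise mapping: linear inside the threshold, logarithmic outside it, and mirrored for negative values. The function symlog_sketch below is hypothetical and written only for illustration. It is not Matplotlib's exact transform, which also takes a linscale parameter into account.

```python
import numpy as np

def symlog_sketch(values, linthresh=1.0):
    """Simplified symlog-style mapping (illustration only)."""
    values = np.asarray(values, dtype=float)
    abs_v = np.abs(values)

    # Inside [-linthresh, linthresh]: plain linear mapping
    out = values / linthresh

    # Outside the threshold: logarithmic growth, mirrored by sign,
    # shifted by 1 so the two pieces meet at the threshold
    outside = abs_v > linthresh
    out[outside] = np.sign(values[outside]) * (
        1.0 + np.log10(abs_v[outside] / linthresh)
    )
    return out

mapped = symlog_sketch([-100, -0.5, 0, 0.5, 100], linthresh=1.0)
print(mapped)
```

With linthresh=1.0, the values -0.5, 0 and 0.5 are left unchanged, while -100 and 100 are compressed to -3 and 3. Zero poses no problem because it falls in the linear region.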

When to Use Symmetrical Logarithmic Scale

  • Data around Zero − Suitable for datasets containing values centered around zero with a wide range of positive and negative values.
  • Avoiding Bias − Useful when positive and negative values need a symmetric representation without bias towards either side.

Importance of Symmetrical Logarithmic Scale

The Symmetrical Logarithmic Scale provides a logarithmic-like scale that accommodates both positive and negative values, making it useful for visualizing datasets with a balanced distribution around zero.

It also helps in highlighting smaller variations around zero while accommodating larger values without skewing the representation.

Example

This example applies the symmetrical logarithmic scale to the y-axis using plt.yscale('symlog', linthresh=0.01).

import matplotlib.pyplot as plt
import numpy as np

# Generating data for a sine wave with values around zero
x = np.linspace(-10, 10, 500)
y = np.sin(x)

# Creating a plot with a symmetrical logarithmic scale for the y-axis
plt.plot(x, y)

# Set symmetrical logarithmic scale for the y-axis
plt.yscale('symlog', linthresh=0.01)  
plt.xlabel('X-axis')
plt.ylabel('Y-axis (symlog scale)')
plt.title('Symmetrical Logarithmic Scale')
plt.show()

Output

Following is the output of the above program −

[Figure: Symmetrical Logarithmic Scale]

A symmetrical logarithmic scale makes it possible to plot data that crosses zero while still using logarithmic spacing for large magnitudes. The linear threshold (linthresh) sets the range around zero in which the scale behaves linearly before transitioning to logarithmic behaviour, so it should be chosen to match the magnitude of the smallest values of interest.
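The effect of linthresh is easiest to see side by side. The following is a minimal sketch that draws the same sine wave with two threshold values. The values 0.01 and 0.5 are arbitrary choices for illustration, and the object-oriented interface is assumed.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-10, 10, 500)
y = np.sin(x)

fig, axes = plt.subplots(1, 2, figsize=(8, 3), sharex=True)

# Same data, two different linear thresholds
for ax, linthresh in zip(axes, (0.01, 0.5)):
    ax.plot(x, y)
    ax.set_yscale('symlog', linthresh=linthresh)
    ax.set_title(f'linthresh={linthresh}')

fig.tight_layout()
plt.show()
```

A small threshold stretches the region close to zero, while a larger one keeps more of the curve in the linear part of the axis.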

Logit Scale

The logit scale is a specialized scale for data whose values lie strictly between 0 and 1, as is common for probabilities.

Setting the Scale

To use a logit scale, call plt.xscale('logit') for the x-axis or plt.yscale('logit') for the y-axis.

Characteristics of Logit Scale

The logit scale has the following characteristics.

  • Constrains Data − Specifically used for data bounded between 0 and 1.
  • Transformation − Uses the logit function, logit(p) = log(p / (1 - p)), which maps the open interval (0, 1) onto the whole real line and stretches the regions near 0 and 1.
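The logit function and its inverse, the logistic function, can be written in a few lines of NumPy. The sketch below uses the helper names logit and logistic and a hand-picked set of probabilities, all chosen only for illustration.

```python
import numpy as np

def logit(p):
    """Map probabilities in (0, 1) to the real line."""
    return np.log(p / (1.0 - p))

def logistic(z):
    """Inverse of logit: map real numbers back to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

p = np.array([0.001, 0.01, 0.1, 0.5, 0.9, 0.99, 0.999])
z = logit(p)

print(z)            # symmetric about 0, which corresponds to p = 0.5
print(logistic(z))  # recovers the original probabilities
```

Probabilities close to 0 or 1 are pushed far from the centre, which is why a logit axis spreads out values near the two ends of the interval.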

When to Use Logit Scale

  • Probability Data − Suitable for visualizing probabilities or values representing probabilities, especially when dealing with logistic regression or logistic models.
  • Data within 0 to 1 Range − Specifically designed for data bounded within the 0 to 1 interval.

Importance of Logit Scale

  • The Logit Scale facilitates the visualization and analysis of data that represents probabilities or has a probabilistic interpretation.
  • It also helps in understanding and visualizing transformations of probability-related data.

Example 1

This example applies the logit scale to the x-axis.

import matplotlib.pyplot as plt
import numpy as np

# Generating data within the 0 to 1 range
x = np.linspace(0.001, 0.999, 100)
y = np.log(x / (1 - x))

# Creating a plot with a logit scale for the x-axis
plt.plot(x, y)
plt.xscale('logit')  # Set logit scale for the x-axis
plt.xlabel('X-axis (logit scale)')
plt.ylabel('Y-axis')
plt.title('Logit Scale')
plt.show()

Output

Following is the output of the above program −

[Figure: Logit Scale]

Understanding and choosing the appropriate scale for a plot is important for accurately representing the underlying data and ensuring that patterns and trends are effectively communicated in visualizations.

Example 2

This example plots sorted sample data in four subplots, setting the y-axis scale by name to linear, log, symlog and logit.

import numpy as np
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = [7.50, 3.50]
plt.rcParams["figure.autolayout"] = True
y = np.random.normal(loc=0.5, scale=0.4, size=1000)
y = y[(y > 0) & (y < 1)]
y.sort()
x = np.arange(len(y))

# linear
plt.subplot(221)
plt.plot(x, y)
plt.yscale('linear')
plt.title('linear')

# log
plt.subplot(222)
plt.plot(x, y)
plt.yscale('log')
plt.title('log')

# symmetric log
plt.subplot(223)
plt.plot(x, y - y.mean())
plt.yscale('symlog', linthresh=0.01)
plt.title('symlog')

# logit
plt.subplot(224)
plt.plot(x, y)
plt.yscale('logit')
plt.title('logit')
plt.show()

Output

Following is the output of the above program −

[Figure: linear, log, symlog and logit scales]