Python Pandas - Lag Plot



A lag plot is a plotting technique used to check whether a dataset or time series is random. Random data shows no structure in the lag plot, while non-random data may reveal patterns such as trends or cycles. It is a type of scatter plot in which each point pairs a value in the series with the value that occurs a fixed number of steps (the lag) later.

The axes of a lag plot are defined as follows −

  • The x-axis represents y(t), the value of the data at time t.

  • The y-axis represents y(t+lag), the value at time t+lag.
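The pairing of y(t) with y(t+lag) can be sketched by hand. The snippet below is only an illustration on a made-up toy series (the values and the names ts, x, y and pairs are assumptions, not part of the pandas API); it builds the coordinate pairs that a lag plot with lag=1 would scatter.

```python
import pandas as pd

# Hypothetical toy series, used only for illustration
ts = pd.Series([1.0, 2.0, 4.0, 7.0, 11.0])
lag = 1

values = ts.tolist()

# x-axis values y(t): every value except the last `lag` ones
x = values[:-lag]
# y-axis values y(t + lag): every value except the first `lag` ones
y = values[lag:]

# Each (x, y) pair becomes one point in the lag plot
pairs = list(zip(x, y))
print(pairs)
# [(1.0, 2.0), (2.0, 4.0), (4.0, 7.0), (7.0, 11.0)]
```

A series of n values therefore yields n - lag points, so a larger lag leaves fewer points in the plot.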

In this tutorial, we will learn how to use Python's Pandas library to create lag plots, with different examples for visualizing time series data.

Lag Plot in Pandas

Pandas provides the lag_plot() function in its plotting module for generating a lag plot of time series data. The function takes a Series containing the time series, along with optional parameters for customization, and returns a matplotlib.axes.Axes object representing the plot.

Syntax

Following is the syntax of the lag_plot() function −

pandas.plotting.lag_plot(series, lag=1, ax=None, **kwds)

Where,

  • series: The time series to visualize. It must be a Pandas Series object.

  • lag: An optional parameter that sets the lag length for the scatter plot. The default is 1.

  • ax: An optional parameter that specifies the Matplotlib axis object to draw the plot on.

  • **kwds: Additional keyword arguments passed to the Matplotlib scatter method for customization.
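As a minimal sketch of how the lag and ax parameters and the return value fit together, the snippet below draws two lag plots of the same series side by side. The seed, the series and the subplot layout are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import lag_plot

# Assumed example data: 200 independent random values
np.random.seed(0)
ts = pd.Series(np.random.normal(size=200))

# Create two axes and pass each one to lag_plot through the ax parameter
fig, axes = plt.subplots(1, 2, figsize=(8, 4))
ax1 = lag_plot(ts, lag=1, ax=axes[0])
ax2 = lag_plot(ts, lag=5, ax=axes[1])

# The returned Axes objects can be adjusted like any other Matplotlib axes
ax1.set_title("lag=1")
ax2.set_title("lag=5")
fig.tight_layout()
```

Call plt.show() afterwards to display the figure. Passing ax is useful when the lag plot has to sit next to other plots in the same figure.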

Creating Lag Plot in Pandas

You can create the lag plot in pandas by using the pandas.plotting.lag_plot() function for the time series data.

Example

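The following example demonstrates creating a lag plot for a time series built as the cumulative sum of random numbers (a random walk). Because each value depends on the previous one, the points cluster along a diagonal rather than forming a structureless cloud.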

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pandas.plotting import lag_plot

plt.rcParams["figure.figsize"] = [7, 4]

# Generate a random walk: the cumulative sum of random normal steps
np.random.seed(5)
x = np.cumsum(np.random.normal(loc=1, scale=5, size=100))
ts = pd.Series(x)

# Generate a lag plot
lag_plot(ts)
plt.title("Basic Lag Plot")
plt.show()

Following is the output of the above code −

Basic Lag Plot
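The series in the example above is a cumulative sum, so each value depends on the one before it. A rough numerical companion to the lag plot is the lag-1 autocorrelation from Series.autocorr(), which is not used elsewhere in this tutorial and is added here only as a sketch. The snippet compares the raw random steps with their cumulative sum; the names noise and walk are assumptions.

```python
import numpy as np
import pandas as pd

np.random.seed(5)

# Independent random steps: a lag plot of these should show no structure
noise = pd.Series(np.random.normal(loc=1, scale=5, size=100))

# Cumulative sum of the same steps: consecutive values are strongly related
walk = noise.cumsum()

# Correlation between y(t) and y(t + 1) for each series
noise_corr = noise.autocorr(lag=1)
walk_corr = walk.autocorr(lag=1)

print("noise:", round(noise_corr, 2))
print("walk: ", round(walk_corr, 2))
```

A value near 0 matches a lag plot with no visible pattern, while a value near 1 matches points lying close to a diagonal line.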

Lag Plot with Cyclic Data

Lag plots are particularly effective for identifying patterns in periodic or cyclic datasets. For such data, the points trace out a recognizable shape, such as an ellipse for a sine wave, instead of a structureless cloud.

Example

The following example creates a lag plot for a noisy sine wave using the plotting.lag_plot() function.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pandas.plotting import lag_plot

plt.rcParams["figure.figsize"] = [7, 4]

# Create a sine wave with random noise
angles = np.linspace(-10 * np.pi, 10 * np.pi, 500)
s = pd.Series(0.1 * np.random.rand(500) + np.sin(angles))

# Lag plot to visualize the pattern
lag_plot(s, lag=2)
plt.title("Lag Plot for Periodic Data")
plt.show()

Following is the output of the above code −

Lag Plot for Periodic Data
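The shape that periodic data traces in a lag plot depends on how the lag compares with the length of one cycle. The sketch below uses Series.autocorr(), an addition not covered in this tutorial, to put a number on that for the same kind of noisy sine wave. The seed and the chosen lags are assumptions; the series has 10 cycles over 500 points, so one cycle spans roughly 50 samples.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumed seed, only for reproducibility

# Noisy sine wave: 10 full cycles over 500 points (about 50 samples per cycle)
angles = np.linspace(-10 * np.pi, 10 * np.pi, 500)
s = pd.Series(0.1 * np.random.rand(500) + np.sin(angles))

# Correlation between y(t) and y(t + lag) for a small lag,
# roughly a quarter cycle, and roughly half a cycle
corr = {lag: s.autocorr(lag=lag) for lag in (2, 12, 25)}

for lag, value in corr.items():
    print(f"lag={lag}: autocorrelation = {value:.2f}")
```

A small lag gives a narrow ellipse along the rising diagonal, a lag near a quarter cycle opens the ellipse into something close to a circle, and a lag near half a cycle flips it onto the falling diagonal.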

Customizing Lag Plots

Lag plots can be customized by passing additional Matplotlib scatter keyword arguments through **kwds. You can adjust properties such as color, marker type, size, and transparency to enhance the visualization.

Example

The following example demonstrates customizing the lag plot using additional keyword arguments of the plotting.lag_plot() function.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pandas.plotting import lag_plot

plt.rcParams["figure.figsize"] = [7, 4]

# Create a sine wave with random noise
angles = np.linspace(-10 * np.pi, 10 * np.pi, 500)
s = pd.Series(0.1 * np.random.rand(500) + np.sin(angles))

# Lag plot to visualize the pattern
lag_plot(s, lag=2, c='red', marker='o', alpha=0.7)
plt.title("Customized Lag Plot")
plt.show()

Following is the output of the above code −

Customized Lag Plot