How to check if Time Series Data is Stationary with Python?

A time series is a collection of data points recorded at regular intervals of time. It is used to study trends, patterns, and relationships between variables over a defined period. Common examples of time series are stock prices, weather patterns and economic indicators.

Time series analysis applies statistical and mathematical techniques to such data. Its main aim is to identify patterns and trends in past data in order to forecast future values.

A time series is said to be stationary if its statistical properties, such as the mean, variance and autocovariance, do not change over time. Many forecasting models assume stationarity, so it is necessary to check whether the data is stationary before modelling it. There are different ways to check if time series data is stationary; let's see them one by one.

Augmented Dickey-Fuller (ADF)

The Augmented Dickey-Fuller (ADF) test is a statistical test that checks for the presence of a unit root in the time series data. A series with a unit root is non-stationary, and the null hypothesis of the test is that a unit root is present. Among other values, the test returns the test statistic and the p-value.

If the p-value is below 0.05, the null hypothesis is rejected, which indicates that the series is stationary. A p-value above 0.05 means the test cannot rule out a unit root, so the series is treated as non-stationary. Python provides the adfuller() function in the statsmodels package to run this test.

Example

In this example we compute the ADF statistic and p-value using the adfuller() function of the statsmodels package.

from statsmodels.tsa.stattools import adfuller
import pandas as pd
# Load the dataset and use the 'date' column as a datetime index
data = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/a10.csv',parse_dates=['date'], index_col='date')
# Extract the observations as a NumPy array
t_data = data.loc[:, 'value'].values
# Run the ADF test (null hypothesis: the series has a unit root)
result = adfuller(t_data)
print("The result of adfuller function:",result)
print('ADF Statistic:', result[0])
print('p-value:', result[1])

Output

Following is the output produced after executing the program above:

The result of adfuller function: (3.145185689306744, 1.0, 15, 188, {'1%': -3.465620397124192, '5%': -2.8770397560752436, '10%': -2.5750324547306476}, 549.6705685364172)
ADF Statistic: 3.145185689306744
p-value: 1.0
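
The tuple printed above can be read mechanically. The p-value of 1.0 is far above 0.05 and the ADF statistic (3.145) is greater than every critical value, so the test fails to reject the unit-root null hypothesis and the series is treated as non-stationary. The sketch below encodes that reading. The helper name interpret_adf and the 0.05 threshold are illustrative assumptions, not part of statsmodels, and the tuple is copied from the output above so that the snippet runs without downloading the data.

```python
def interpret_adf(result, alpha=0.05):
    """Summarise an adfuller() result tuple (hypothetical helper).

    The null hypothesis of the ADF test is that the series has a unit root.
    """
    statistic, p_value, critical_values = result[0], result[1], result[4]
    return {
        # Reject the null (treat the series as stationary) only when the
        # p-value falls below the chosen significance level.
        "stationary": p_value < alpha,
        # Related check: the statistic should be more negative than the
        # 5% critical value for the null to be rejected at that level.
        "statistic_below_5pct_critical": statistic < critical_values["5%"],
    }


# Result tuple copied from the output shown above.
result = (3.145185689306744, 1.0, 15, 188,
          {'1%': -3.465620397124192,
           '5%': -2.8770397560752436,
           '10%': -2.5750324547306476},
          549.6705685364172)

summary = interpret_adf(result)
print(summary)
```

Both entries come out False for this dataset, which matches the reading of the p-value given above.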

KPSS Test

Another test for stationarity is the KPSS test, which stands for Kwiatkowski-Phillips-Schmidt-Shin. Unlike the ADF test, its null hypothesis is that the series is stationary (around a constant level by default), so a p-value below 0.05 indicates non-stationary data. The statsmodels package provides the kpss() function to run this test.

Example

The following example applies the KPSS test to the time series data.

from statsmodels.tsa.stattools import kpss
import pandas as pd
# Load the dataset and use the 'date' column as a datetime index
data = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/a10.csv',parse_dates=['date'], index_col='date')
# Extract the observations as a NumPy array
t_data = data.loc[:, 'value'].values
# Run the KPSS test on the values (null hypothesis: the series is stationary)
result = kpss(t_data)
print("The result of kpss function:",result)
print('KPSS Statistic:', result[0])
print('p-value:', result[1])

Output

The following is the output of the kpss() function of the statsmodels package.

The result of kpss function: (2.0131256386303322, 0.01, 9, {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739})
KPSS Statistic: 2.0131256386303322
p-value: 0.01

Rolling statistics

Another way to check the stationarity of time series data is to plot its moving (rolling) average and moving standard deviation and check whether they remain roughly constant. If they vary noticeably over time, the time series is non-stationary.

Example

The following example plots the moving average and moving standard deviation alongside the original data using the plot() function of the matplotlib library.

import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/a10.csv',parse_dates=['date'], index_col='date')
series = data['value']
# Window of 12 observations is an illustrative choice (one year, assuming monthly data)
window = 12
# Rolling statistics: one value per window rather than a single overall number
moving_avg = series.rolling(window=window).mean()
moving_std = series.rolling(window=window).std()
plt.plot(series, color='green', label='Original')
plt.plot(moving_avg, color='red', label='moving average')
plt.plot(moving_std, color='black', label='moving Standard deviation')
plt.legend(loc='best')
plt.title('Moving Average & Moving Standard Deviation')
plt.show()

Output

The program displays a plot of the original time series together with its moving average and moving standard deviation.
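
A plot has to be judged by eye, so the same idea can also be expressed as numbers. The sketch below measures how far the rolling mean and rolling standard deviation drift over the whole series. It uses synthetic data instead of the a10 dataset, and the helper name rolling_spread, the seed and the window of 12 are illustrative assumptions.

```python
import numpy as np
import pandas as pd


def rolling_spread(series, window=12):
    """Return (max - min) of the rolling mean and of the rolling std.

    Hypothetical helper: larger values mean the statistic drifts more over time.
    """
    roll = series.rolling(window=window)
    mean = roll.mean().dropna()  # the first window-1 entries are NaN
    std = roll.std().dropna()
    return mean.max() - mean.min(), std.max() - std.min()


rng = np.random.default_rng(1)
noise = pd.Series(rng.normal(size=500))                          # constant mean
trend = pd.Series(0.05 * np.arange(500) + rng.normal(size=500))  # rising mean

noise_mean_spread, noise_std_spread = rolling_spread(noise)
trend_mean_spread, trend_std_spread = rolling_spread(trend)
print("White noise: mean spread %.2f, std spread %.2f"
      % (noise_mean_spread, noise_std_spread))
print("Trending   : mean spread %.2f, std spread %.2f"
      % (trend_mean_spread, trend_std_spread))
```

The rolling mean of the trending series drifts much further than that of the noise series, which is the numeric counterpart of a moving-average line that climbs across the plot.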

Niharika Aitam

Updated on: 09-Aug-2023
