Write a program in Python to compute the autocorrelation of a series for a given number of lags
Autocorrelation measures the correlation between a time series and its lagged version. In pandas, the autocorr() method computes the Pearson correlation coefficient between a series and its lagged values.
Understanding Autocorrelation
Autocorrelation helps identify patterns and dependencies in time series data. A lag of 1 compares each value with the previous value, a lag of 2 compares each value with the value two positions back, and so on.
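The lag comparison described above can be sketched by hand: shift the series by the lag and correlate the shifted copy with the original. This is a minimal illustration, not a replacement for the built-in method. The helper name manual_autocorr is hypothetical and is not part of pandas; it should agree with autocorr() on the same data.

```python
import numpy as np
import pandas as pd

def manual_autocorr(series: pd.Series, lag: int = 1) -> float:
    """Hypothetical helper: Pearson correlation between a series and its lagged copy."""
    # shift(lag) moves every value `lag` positions later,
    # so position t now holds the value from position t - lag
    lagged = series.shift(lag)
    # Series.corr defaults to Pearson correlation and skips pairs containing NaN
    return series.corr(lagged)

series = pd.Series([2, 10, 3, 4, 9, 10, 2, np.nan, 3])
print(manual_autocorr(series, lag=1))
print(series.autocorr(lag=1))
```

Both print statements should show the same number, since the sketch performs the same shift-and-correlate steps.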
Creating a Series
Let's create a pandas Series with some sample data, including a NaN value:
import pandas as pd
import numpy as np
series = pd.Series([2, 10, 3, 4, 9, 10, 2, np.nan, 3])
print("Series is:")
print(series)
Series is:
0     2.0
1    10.0
2     3.0
3     4.0
4     9.0
5    10.0
6     2.0
7     NaN
8     3.0
dtype: float64
Computing Autocorrelation with Default Lag
The default lag is 1, which compares each value with the previous value:
import pandas as pd
import numpy as np
series = pd.Series([2, 10, 3, 4, 9, 10, 2, np.nan, 3])
autocorr_lag1 = series.autocorr()
print("Autocorrelation with lag=1:")
print(autocorr_lag1)
Autocorrelation with lag=1:
-0.4711538461538461
Computing Autocorrelation with Specific Lag
You can specify any lag value. Here we use lag=2 to compare values with those two positions earlier:
import pandas as pd
import numpy as np
series = pd.Series([2, 10, 3, 4, 9, 10, 2, np.nan, 3])
autocorr_lag2 = series.autocorr(lag=2)
print("Autocorrelation with lag=2:")
print(autocorr_lag2)
Autocorrelation with lag=2:
-0.2933396642805515
Complete Example
Here's a complete program that demonstrates autocorrelation with different lag values:
import pandas as pd
import numpy as np
# Create a sample series
series = pd.Series([2, 10, 3, 4, 9, 10, 2, np.nan, 3])
print("Series is:")
print(series)
print("\nAutocorrelation with lag=1:")
print(series.autocorr())
print("\nAutocorrelation with lag=2:")
print(series.autocorr(lag=2))
print("\nAutocorrelation with lag=3:")
print(series.autocorr(lag=3))
Series is:
0     2.0
1    10.0
2     3.0
3     4.0
4     9.0
5    10.0
6     2.0
7     NaN
8     3.0
dtype: float64

Autocorrelation with lag=1:
-0.4711538461538461

Autocorrelation with lag=2:
-0.2933396642805515

Autocorrelation with lag=3:
0.0633 (rounded)
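When several lags are of interest, calling autocorr() once per lag in a loop avoids repeating the print statements. The sketch below is one possible way to do that. The upper bound max_lag = 4 and the variable name autocorr_by_lag are arbitrary choices made for illustration.

```python
import numpy as np
import pandas as pd

series = pd.Series([2, 10, 3, 4, 9, 10, 2, np.nan, 3])

# Assumed for illustration: inspect lags 1 through 4
max_lag = 4

# Build a Series indexed by lag, holding one autocorrelation value per lag
autocorr_by_lag = pd.Series(
    {lag: series.autocorr(lag=lag) for lag in range(1, max_lag + 1)},
    name="autocorrelation",
)
autocorr_by_lag.index.name = "lag"
print(autocorr_by_lag)
```

Collecting the results in a Series keeps each value next to its lag, which makes it easier to compare lags or plot them later.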
Key Points
Autocorrelation values range from −1 to 1
Positive values indicate positive correlation with lagged values
Negative values indicate negative correlation with lagged values
NaN values are handled automatically: any pair in which the value or its lagged counterpart is NaN is excluded from the calculation
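The effect of a NaN on the calculation can be made visible by pairing each value with its lagged counterpart and counting the pairs that remain once missing entries are dropped. This is a sketch for illustration only. The names pairs, complete and r are chosen here, and np.corrcoef is used as an independent check rather than as a description of how pandas works internally.

```python
import numpy as np
import pandas as pd

series = pd.Series([2, 10, 3, 4, 9, 10, 2, np.nan, 3])
lag = 1

# Put each value next to its lagged counterpart
pairs = pd.DataFrame({"value": series, "lagged": series.shift(lag)})

# Keep only the rows where both the value and its lagged counterpart are present
complete = pairs.dropna()
print(len(series), "values,", len(complete), "usable pairs")

# Pearson correlation on the complete pairs, computed with NumPy for comparison
r = np.corrcoef(complete["value"], complete["lagged"])[0, 1]
print(r)
print(series.autocorr(lag=lag))
```

With lag=1, the single NaN removes two pairs: the one where it is the current value and the one where it is the lagged value. The first row is also dropped because it has no predecessor, which leaves 6 usable pairs out of 9 values.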
Conclusion
The autocorr() method in pandas efficiently computes autocorrelation for time series analysis. Use different lag values to identify patterns and dependencies at various time intervals in your data.
