How to get the correlation between two columns in Pandas?

We can use the .corr() method to get the correlation between two columns in Pandas. By default it computes the Pearson correlation coefficient, which measures the strength and direction of the linear relationship between two variables and ranges from -1 to 1.

Basic Syntax

# Method 1: Using .corr() on a Series
correlation = df['column1'].corr(df['column2'])

# Method 2: Using .corr() on DataFrame to get correlation matrix
correlation_matrix = df[['column1', 'column2']].corr()

Example

Let's create a DataFrame and calculate correlations between different columns:

import pandas as pd

# Create sample DataFrame
df = pd.DataFrame({
    "x": [5, 2, 7, 0],
    "y": [4, 7, 5, 1],
    "z": [9, 3, 5, 1]
})

print("Input DataFrame:")
print(df)

Output

Input DataFrame:
   x  y  z
0  5  4  9
1  2  7  3
2  7  5  5
3  0  1  1

Finding Correlation Between Two Columns

import pandas as pd

df = pd.DataFrame({
    "x": [5, 2, 7, 0],
    "y": [4, 7, 5, 1],
    "z": [9, 3, 5, 1]
})

# Correlation between x and y
corr_xy = df['x'].corr(df['y'])
print(f"Correlation between x and y: {corr_xy:.2f}")

# Correlation between x and z
corr_xz = df['x'].corr(df['z'])
print(f"Correlation between x and z: {corr_xz:.2f}")

# Self-correlation (1.0 for any non-constant column)
corr_xx = df['x'].corr(df['x'])
print(f"Correlation between x and x: {corr_xx:.2f}")

Output

Correlation between x and y: 0.41
Correlation between x and z: 0.72
Correlation between x and x: 1.00
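
To make it clearer what .corr() is doing, here is a minimal sketch that computes the Pearson coefficient by hand for the same x and y columns and prints it next to the pandas result. The variable names (dx, dy, manual_r) are illustrative only, and the formula shown is the standard Pearson definition rather than a description of pandas internals.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "x": [5, 2, 7, 0],
    "y": [4, 7, 5, 1]
})

# Deviation of every value from its column mean
dx = df["x"] - df["x"].mean()
dy = df["y"] - df["y"].mean()

# Pearson r = sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
manual_r = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

print(f"Manual Pearson r: {manual_r:.2f}")
print(f"pandas .corr():   {df['x'].corr(df['y']):.2f}")
```

Both lines should print the same value, which suggests that the default .corr() call matches the textbook Pearson formula for this data.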

Getting Correlation Matrix

You can also get the correlation matrix for multiple columns at once:

import pandas as pd

df = pd.DataFrame({
    "x": [5, 2, 7, 0],
    "y": [4, 7, 5, 1],
    "z": [9, 3, 5, 1]
})

# Get correlation matrix for all columns
correlation_matrix = df.corr()
print("Correlation Matrix:")
print(correlation_matrix)

# Get correlation matrix for specific columns
specific_corr = df[['x', 'y']].corr()
print("\nCorrelation between x and y columns:")
print(specific_corr)

Output

Correlation Matrix:
          x         y         z
x  1.000000  0.407403  0.721930
y  0.407403  1.000000  0.253734
z  0.721930  0.253734  1.000000

Correlation between x and y columns:
          x         y
x  1.000000  0.407403
y  0.407403  1.000000
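
Since the matrix is an ordinary DataFrame indexed by column name on both axes, a single pairwise value can be read out of it with .loc. The sketch below assumes the same sample data and compares the matrix entry with the value from the Series method; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "x": [5, 2, 7, 0],
    "y": [4, 7, 5, 1],
    "z": [9, 3, 5, 1]
})

correlation_matrix = df.corr()

# Look up one pair by row label and column label
corr_xy_from_matrix = correlation_matrix.loc["x", "y"]

# The same pair computed directly from the two Series
corr_xy_direct = df["x"].corr(df["y"])

print(f"From matrix: {corr_xy_from_matrix:.2f}")
print(f"Direct:      {corr_xy_direct:.2f}")
```

The matrix is symmetric, so correlation_matrix.loc["y", "x"] returns the same number. When only one pair is needed, the Series method avoids computing the whole matrix.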

Understanding Correlation Values

Correlation Value   Relationship             Meaning
1.0                 Perfect positive         Variables increase together
0.0                 No linear relationship   No linear association between the variables
-1.0                Perfect negative         One increases as the other decreases
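
The three cases in the table can be reproduced with a small, made-up Series. This is only an illustrative sketch: rising, falling, and u_shape are hypothetical names, and the data is chosen so that each case comes out cleanly.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

rising = 2 * s + 1        # increases in lockstep with s
falling = -3 * s          # decreases as s increases
u_shape = (s - 3) ** 2    # fully determined by s, but not linearly

print(f"s vs rising:  {s.corr(rising):.2f}")
print(f"s vs falling: {s.corr(falling):.2f}")
print(f"s vs u_shape: {s.corr(u_shape):.2f}")
```

The last case is worth noting: u_shape depends entirely on s, yet its correlation with s is essentially zero, because the coefficient only captures linear relationships.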

Conclusion

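Use df['col1'].corr(df['col2']) to get the correlation between two specific columns. Use df.corr() to get the complete correlation matrix for the numeric columns in your DataFrame; if the DataFrame also contains non-numeric columns, pass numeric_only=True, since pandas 2.0 and later no longer drop them automatically.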
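
As a hedged illustration of the point about numeric columns, the sketch below builds a hypothetical DataFrame (the people frame and its column names are invented for this example) that mixes a text column with two numeric ones. It assumes pandas 1.5 or later, where DataFrame.corr() accepts the numeric_only keyword.

```python
import pandas as pd

# Hypothetical mixed-type data: "city" is text, the other columns are numeric
people = pd.DataFrame({
    "city": ["Pune", "Delhi", "Agra", "Kochi"],
    "height_cm": [160, 172, 181, 168],
    "weight_kg": [55, 70, 82, 64]
})

# numeric_only=True leaves the text column out of the calculation
numeric_corr = people.corr(numeric_only=True)
print(numeric_corr.round(2))
```

Selecting the numeric columns explicitly, as in people[["height_cm", "weight_kg"]].corr(), is an alternative that does not depend on the numeric_only keyword.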

Updated on: 2026-03-26T01:57:49+05:30
