How to get the correlation between two columns in Pandas?


We can use the Series.corr() method to get the correlation between two columns in Pandas. By default it returns the Pearson correlation coefficient, a value between -1 and 1. Let's take an example and see how to apply this method.
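To make clear what the default coefficient measures, here is a minimal sketch that computes Pearson's r by hand and compares it with the value from Series.corr(). The series names a and b and the variables manual and builtin are illustrative only; the numbers are the x and y columns from the example in this article.

```python
import numpy as np
import pandas as pd

# Values borrowed from the "x" and "y" columns of the example DataFrame.
a = pd.Series([5, 2, 7, 0])
b = pd.Series([4, 7, 5, 1])

# Pearson's r: sum of products of deviations from the mean, divided by
# the square root of the product of the summed squared deviations.
da = a - a.mean()
db = b - b.mean()
manual = (da * db).sum() / np.sqrt((da ** 2).sum() * (db ** 2).sum())

# Series.corr() uses method="pearson" unless told otherwise.
builtin = a.corr(b)

print(round(manual, 2), round(builtin, 2))
```

Both values should agree up to floating-point rounding, which is a quick way to confirm that the method is doing what you expect.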

Steps

  • Create a DataFrame, df, which is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure.
  • Print the input DataFrame, df.
  • Initialize two variables, col1 and col2, and assign them the names of the columns whose correlation you want to find.
  • Find the correlation between col1 and col2 by using df[col1].corr(df[col2]) and save the correlation value in a variable, corr.
  • Print the correlation value, corr.

Example

import pandas as pd

# Build a small DataFrame with three numeric columns.
df = pd.DataFrame(
    {
        "x": [5, 2, 7, 0],
        "y": [4, 7, 5, 1],
        "z": [9, 3, 5, 1]
    }
)
print("Input DataFrame is:")
print(df)

# Correlation between two different columns.
col1, col2 = "x", "y"
corr = df[col1].corr(df[col2])
print(f"Correlation between {col1} and {col2} is: {round(corr, 2)}")

# A column is perfectly correlated with itself.
col1, col2 = "x", "x"
corr = df[col1].corr(df[col2])
print(f"Correlation between {col1} and {col2} is: {round(corr, 2)}")

col1, col2 = "x", "z"
corr = df[col1].corr(df[col2])
print(f"Correlation between {col1} and {col2} is: {round(corr, 2)}")

# Correlation is symmetric: (y, x) gives the same value as (x, y).
col1, col2 = "y", "x"
corr = df[col1].corr(df[col2])
print(f"Correlation between {col1} and {col2} is: {round(corr, 2)}")

Output

Input DataFrame is:
   x  y  z
0  5  4  9
1  2  7  3
2  7  5  5
3  0  1  1
Correlation between x and y is: 0.41
Correlation between x and x is: 1.0
Correlation between x and z is: 0.72
Correlation between y and x is: 0.41
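If you need the correlation for every pair of columns rather than one pair at a time, DataFrame.corr() returns the whole matrix in a single call. The following is a minimal sketch using the same data as the example; the variable name corr_matrix is illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "x": [5, 2, 7, 0],
        "y": [4, 7, 5, 1],
        "z": [9, 3, 5, 1]
    }
)

# Pairwise Pearson correlation between all numeric columns.
corr_matrix = df.corr()
print(corr_matrix.round(2))

# Look up a single pair by row and column label.
print(round(corr_matrix.loc["x", "z"], 2))
```

The matrix is symmetric and has 1.0 on its diagonal, which matches the x/x and y/x results shown in the output above.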

Updated on: 12-Sep-2023
