How to get the correlation between two columns in Pandas?

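To get the correlation between two columns in Pandas, call the .corr() method on one column (a Series) and pass the other column as its argument. By default, it returns the Pearson correlation coefficient. Let's take an example and see how to apply this method.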

Steps

  • Create a DataFrame, df (a two-dimensional, size-mutable, potentially heterogeneous tabular data structure).
  • Print the input DataFrame, df.
  • Initialize two variables, col1 and col2, and assign them the names of the columns whose correlation you want to find.
  • Find the correlation between col1 and col2 by using df[col1].corr(df[col2]) and save the correlation value in a variable, corr.
  • Print the correlation value, corr.

Example

import pandas as pd

df = pd.DataFrame(
   {
      "x": [5, 2, 7, 0],
      "y": [4, 7, 5, 1],
      "z": [9, 3, 5, 1]
   }
)
print(f"Input DataFrame is:\n{df}")

# Correlation between two different columns
col1, col2 = "x", "y"
corr = df[col1].corr(df[col2])
print(f"Correlation between {col1} and {col2} is: {round(corr, 2)}")

# A column is perfectly correlated with itself
col1, col2 = "x", "x"
corr = df[col1].corr(df[col2])
print(f"Correlation between {col1} and {col2} is: {round(corr, 2)}")

col1, col2 = "x", "z"
corr = df[col1].corr(df[col2])
print(f"Correlation between {col1} and {col2} is: {round(corr, 2)}")

# Correlation is symmetric: (y, x) gives the same value as (x, y)
col1, col2 = "y", "x"
corr = df[col1].corr(df[col2])
print(f"Correlation between {col1} and {col2} is: {round(corr, 2)}")

Output

Input DataFrame is:
   x  y  z
0  5  4  9
1  2  7  3
2  7  5  5
3  0  1  1
Correlation between x and y is: 0.41
Correlation between x and x is: 1.0
Correlation between x and z is: 0.72
Correlation between y and x is: 0.41
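
As a sanity check on the values above, the following sketch recomputes the Pearson coefficient for x and y directly from its definition and compares it with the result of .corr(). The helper name pearson_by_hand is made up for illustration and is not part of Pandas. The sketch also assumes the columns contain no missing values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
   {
      "x": [5, 2, 7, 0],
      "y": [4, 7, 5, 1],
      "z": [9, 3, 5, 1]
   }
)

def pearson_by_hand(a, b):
    # Hypothetical helper (not a Pandas function); assumes no missing values.
    # Deviations of each value from its column mean
    da = a - a.mean()
    db = b - b.mean()
    # Pearson r = sum(da * db) / sqrt(sum(da^2) * sum(db^2))
    return (da * db).sum() / np.sqrt((da ** 2).sum() * (db ** 2).sum())

manual = pearson_by_hand(df["x"], df["y"])
builtin = df["x"].corr(df["y"])
print(f"Manual: {manual:.2f}, .corr(): {builtin:.2f}")
```

Both values should round to 0.41, which matches the first correlation printed in the output.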
raja
Published on 15-Sep-2021 06:38:06