How to find the correlation coefficient between rows of two data frames in R?


It is common to find the correlation coefficient between columns of an R data frame, but we might also want to find the correlation coefficient between the rows of two data frames. This is useful when we expect a relationship between a row of one data frame and the corresponding row of another. For example, a row of one data frame might show the buying trend of a customer in one year, and the same row of the other data frame the buying trend of the same customer in another year.

Consider the data frame below −

Example

x1<-sample(0:100,20)
x2<-sample(0:100,20)
x3<-sample(0:100,20)
df1<-data.frame(x1,x2,x3)
df1

Output

  x1 x2 x3
1 56 61 23
2 87 89 60
3 26 38 5
4 92 23 81
5 43 34 51
6 54 39 55
7 20 1 40
8 38 35 93
9 7 68 35
10 15 71 36
11 39 13 43
12 10 72 61
13 29 95 14
14 70 42 76
15 61 50 63
16 45 88 52
17 25 4 25
18 16 19 17
19 35 57 64
20 46 44 67
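
Because sample() draws random numbers, re-running the code above will produce different values from the output shown. A minimal sketch of how the data frame could be made reproducible is given here; the seed value 123 is an arbitrary assumption and is not the seed used to generate the output in this article.

```r
# Fix the random number generator state so that sample() returns
# the same values on every run (123 is an arbitrary, assumed seed).
set.seed(123)
x1 <- sample(0:100, 20)
x2 <- sample(0:100, 20)
x3 <- sample(0:100, 20)
df1 <- data.frame(x1, x2, x3)
df1
```

Calling set.seed() with the same value before the sample() calls gives the same data frame each time, which makes the correlations computed later repeatable as well.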

Example

y1<-sample(0:100,20)
y2<-sample(0:100,20)
y3<-sample(0:100,20)
df2<-data.frame(y1,y2,y3)
df2

Output

  y1 y2 y3
1 80 8 10
2 23 46 89
3 43 81 64
4 22 57 68
5 16 18 50
6 78 22 11
7 34 28 5
8 81 38 37
9 99 26 94
10 21 74 44
11 10 40 52
12 26 32 98
13 11 49 2
14 52 23 1
15 1 61 62
16 33 96 82
17 32 71 70
18 57 73 87
19 48 25 60
20 89 41 90

Finding the correlation coefficient between rows of data frame df1 and data frame df2 −

Example

# Correlate row i of df1 with row i of df2
m1 <- as.matrix(df1)
m2 <- as.matrix(df2)
sapply(seq_len(nrow(m1)), function(i) cor(m1[i, ], m2[i, ]))

Output

[1] 0.36890690 -0.91625073 0.30193606 -0.42728802 0.81965616 0.30896698
[7] -0.76093117 -0.47727710 -0.91512959 0.99774017 -0.11918743 0.41844066
[13] 0.99997923 -0.08688539 -0.35782508 0.77650429 -0.51934914 0.36342738
[19] 0.05387382 0.58105192
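
The same row-wise Pearson correlations can also be computed without an explicit loop, by centring each row and applying the correlation formula directly. The sketch below is one possible way to do this; the helper name row_cor is hypothetical (it is not a base R function), and the small data frames reuse only the first three rows of df1 and df2 shown above.

```r
# Hypothetical helper (not part of base R): Pearson correlation between
# corresponding rows of two equally sized data frames or matrices.
row_cor <- function(a, b) {
  a <- as.matrix(a)
  b <- as.matrix(b)
  stopifnot(identical(dim(a), dim(b)))  # rows and columns must match
  a_c <- a - rowMeans(a)                # centre each row of a
  b_c <- b - rowMeans(b)                # centre each row of b
  rowSums(a_c * b_c) / sqrt(rowSums(a_c^2) * rowSums(b_c^2))
}

# First three rows of df1 and df2 from the output above
df1 <- data.frame(x1 = c(56, 87, 26), x2 = c(61, 89, 38), x3 = c(23, 60, 5))
df2 <- data.frame(y1 = c(80, 23, 43), y2 = c(8, 46, 81), y3 = c(10, 89, 64))

# Agrees with the first three values printed above
# (about 0.369, -0.916 and 0.302)
row_cor(df1, df2)
```

Note that each correlation here is based on only three pairs of values (one per column), so coefficients close to 1 or -1 can easily occur by chance.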

Updated on: 18-Oct-2020
