How to find the correlation coefficient between two data frames in R?



If two data frames in R have the same number of rows and contain only numeric columns, we can find the correlation coefficients between the columns of one data frame and the columns of the other. The result is a correlation matrix with one row for each column of the first data frame and one column for each column of the second. For example, if a data frame df1 contains columns x and y and another data frame df2 contains columns a and b, then cor(df1,df2) returns a 2x2 matrix holding the Pearson correlations of x with a, x with b, y with a, and y with b.
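
As a rough illustration of how the shape of the result follows from the inputs, here is a minimal sketch. The data frames d1, d2 and d3, their column names, and the seed are made up for this illustration and are not part of the examples that follow.

```r
# Hypothetical data: 10 rows each, but different numbers of columns
set.seed(1)  # arbitrary seed, only so the sketch is repeatable
d1 <- data.frame(x = rnorm(10), y = rnorm(10))
d2 <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))

m <- cor(d1, d2)
dim(m)        # 2 rows (columns of d1) by 3 columns (columns of d2)
rownames(m)   # "x" "y"
colnames(m)   # "a" "b" "c"

# A data frame with a different number of rows cannot be paired with d1
d3 <- data.frame(a = rnorm(8), b = rnorm(8))
res <- try(cor(d1, d3), silent = TRUE)
inherits(res, "try-error")   # TRUE: cor() stops with an error
```

The number of columns may differ between the two data frames; it is the number of rows that has to match, because each row is treated as one paired observation.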

Example1

Consider the data frame below (the values are randomly generated, so your output will differ):

> x1<-rnorm(20,40,1)
> x2<-rnorm(20,40,2.5)
> df1<-data.frame(x1,x2)
> df1

Output

         x1       x2
1  39.56630 38.25632
2  39.43689 44.14647
3  40.80479 37.43309
4  40.34051 39.99801
5  40.35843 32.90392
6  39.00226 37.35173
7  39.50567 41.58829
8  40.62072 40.15825
9  40.87509 40.95915
10 40.00141 41.61430
11 40.66278 42.94636
12 41.73270 39.31584
13 40.85441 40.49112
14 39.48948 45.01913
15 38.99657 39.62922
16 37.94110 37.74148
17 40.27031 38.78546
18 38.99950 40.38444
19 40.72692 38.71749
20 39.40853 41.04819

Example

> y1<-rnorm(20,1,0.47)
> y2<-rnorm(20,1,0.59)
> df2<-data.frame(y1,y2)
> df2

Output

           y1          y2
1   0.9838238  0.68734717
2   1.3925584  1.36682711
3   0.7476216  0.79403604
4  -0.1170126  0.45490447
5   1.3735461  1.28769736
6   0.4054685  1.24869506
7   0.2779903  0.97357550
8   1.6027345  1.46525577
9   1.3120895  1.70480214
10  1.3728221  0.83932208
11  1.2434638  1.42851893
12  1.4489997  0.77707573
13  1.1582931 -0.06776824
14  0.1890778  0.11686600
15  1.8483871 -0.23030292
16  1.5209849  0.26422644
17  1.2637409  1.24343600
18  1.1026349  1.12995474
19  0.4537390  0.62729603
20  0.4520326  0.77140826

Finding the correlation between df1 and df2:

Example

> cor(df1,df2)

Output

            y1          y2
x1  0.04218867  0.24817633
x2 -0.14992022 -0.04890168
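
To read such a matrix, it may help to check that each entry is simply the correlation between one column of the first data frame and one column of the second. The sketch below rebuilds data frames of the same shape as df1 and df2; the seed is an arbitrary assumption, so the numbers will not match the output above.

```r
# Same construction as df1 and df2 above, with an arbitrary seed
set.seed(42)
df1 <- data.frame(x1 = rnorm(20, 40, 1), x2 = rnorm(20, 40, 2.5))
df2 <- data.frame(y1 = rnorm(20, 1, 0.47), y2 = rnorm(20, 1, 0.59))

m <- cor(df1, df2)

# The entry in row "x1", column "y2" is the correlation of those two columns
from_matrix <- m["x1", "y2"]
from_vectors <- cor(df1$x1, df2$y2)
all.equal(from_matrix, from_vectors)   # TRUE
```

Since the columns here are drawn independently of one another, coefficients close to zero, like those in the output above, are what one would expect.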

Example2

> a1<-rpois(20,5)
> a2<-rpois(20,5)
> dfa<-data.frame(a1,a2)
> dfa

Output

   a1 a2
1   7  2
2   6  4
3   8  1
4   4  9
5   7  7
6   1  5
7  10  9
8   9  5
9   6  4
10  2  4
11  4  6
12  3  7
13  8  9
14  5  8
15  3  7
16  8  6
17  5 10
18  6  6
19  4  6
20  0  5

Example

> b1<-rpois(20,2)
> b2<-rpois(20,2)
> dfb<-data.frame(b1,b2)
> dfb

Output

   b1 b2
1   2  1
2   2  0
3   1  5
4   1  6
5   1  1
6   0  3
7   0  3
8   4  2
9   1  1
10  2  2
11  3  2
12  3  2
13  5  5
14  1  1
15  0  2
16  1  3
17  4  2
18  0  4
19  1  2
20  4  2

Finding the correlation between dfa and dfb:

Example

> cor(dfa,dfb)

Output

            b1        b2
a1 -0.02277452 0.1306828
a2  0.13002305 0.2173069
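
The calls above use the default Pearson coefficient. cor() also accepts a method argument, and for count data such as these a rank-based coefficient is sometimes preferred. The following is only a sketch of that option, with an arbitrary seed and freshly generated data of the same shape as dfa and dfb, not a recommendation for any particular data set.

```r
# Same construction as dfa and dfb above, with an arbitrary seed
set.seed(7)
dfa <- data.frame(a1 = rpois(20, 5), a2 = rpois(20, 5))
dfb <- data.frame(b1 = rpois(20, 2), b2 = rpois(20, 2))

pearson  <- cor(dfa, dfb)                       # default: method = "pearson"
spearman <- cor(dfa, dfb, method = "spearman")  # rank-based alternative

# Spearman's coefficient is Pearson's coefficient computed on the ranks
on_ranks <- cor(sapply(dfa, rank), sapply(dfb, rank))
all.equal(spearman, on_ranks)   # TRUE
```

Both results have the same layout: rows a1 and a2, columns b1 and b2.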
