How to find the correlation matrix with p-values for an R data frame?


The correlation matrix with p-values for an R data frame can be found with the rcorr function of the Hmisc package. Because rcorr expects a numeric matrix, the data frame is first converted with as.matrix. For example, if we have a data frame called df, the correlation matrix with p-values can be found by using rcorr(as.matrix(df)).

Example

Consider the below data frame −

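# Three numeric columns of 20 random normal values each.
# No seed is set, so the values differ on every run.
df1 <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20, 5, 1.01))
df1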

Output

             x1           x2         x3
1   -0.73652050  -0.63217859   5.185969
2   -2.01419490   0.34607185   4.481368
3    0.34235643   0.57178015   3.026729
4   -0.08378474   0.92191817   4.250791
5    0.83898327   0.07942875   3.211523
6   -0.02279024   0.63411399   5.194953
7    1.41945942   1.11677804   4.615502
8   -0.02649611  -0.12183326   5.823864
9    1.51668723  -0.05485461   4.569037
10  -1.26449629  -1.02647482   4.592757
11  -0.12044314  -0.44292038   6.876043
12  -0.45473954  -1.05514259   7.532017
13  -1.11612369  -1.15571563   5.815840
14  -0.60718723   0.67048948   5.429381
15  -0.36208570   1.16795697   5.338567
16  -0.07388237   0.66417818   3.412541
17  -0.76607395   0.38185805   5.127318
18   1.21366135  -1.58142860   6.173194
19   1.01896222   1.97880129   5.418979
20  -1.23818051  -0.99555593   6.024320

Loading the Hmisc package and creating the correlation matrix with p-values for the data in df1 −

Example

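library(Hmisc)  # run install.packages("Hmisc") first if the package is missing
# rcorr needs a numeric matrix, so convert the data frame first
rcorr(as.matrix(df1), type = "pearson")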

Output

      x1    x2    x3
x1  1.00  0.29 -0.18
x2  0.29  1.00 -0.45
x3 -0.18 -0.45  1.00

n= 20

P
   x1     x2     x3
x1        0.2222 0.4548
x2 0.2222        0.0470
x3 0.4548 0.0470
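
The printed result rounds the coefficients to two decimals. rcorr returns a list whose elements r, n and P hold the full matrices, so individual values can be pulled out by name. The following is a minimal sketch that assumes the Hmisc package is installed. The seed value and the object name res are illustrative choices that do not appear in the original example, so the numbers it produces will differ from the output shown above.

```r
library(Hmisc)

set.seed(42)  # arbitrary seed (an assumption) so the sketch is reproducible
df1 <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20, 5, 1.01))

# Store the result instead of only printing it ("res" is an illustrative name)
res <- rcorr(as.matrix(df1), type = "pearson")

res$r  # correlation coefficients at full precision
res$P  # p-values; the diagonal is NA because a variable is not tested against itself
res$n  # number of observations used for each pair

# p-value for one specific pair of columns
res$P["x2", "x3"]
```

Storing the result this way makes it easy to filter on the p-values, for example with res$P < 0.05, rather than reading them off the printed table.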

Example

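# Four columns of 20 random Poisson counts each (means 5, 8, 2 and 5).
# No seed is set, so the values differ on every run.
df2 <- data.frame(y1 = rpois(20, 5), y2 = rpois(20, 8), y3 = rpois(20, 2), y4 = rpois(20, 5))
df2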

Output

  y1  y2 y3 y4
1  4  14  2  8
2  5  15  1  6
3  7   9  2  3
4  5  13  1  5
5  5   4  1  2
6  5   8  1  4
7  3   7  0  3
8  7  13  3  2
9  9  12  7  7
10 5  11  3  3
11 4   8  2  1
12 1   6  2  1
13 9   5  3  5
14 4   6  3  2
15 5   5  0  1
16 6   9  1  7
17 3   2  3  9
18 4   9  4  4
19 5  11  5  9
20 6   9  2  8

Creating the correlation matrix with p-values for the data in df2 −

Example

rcorr(as.matrix(df2),type="pearson")

Output

     y1   y2   y3   y4
y1 1.00 0.26 0.38 0.21
y2 0.26 1.00 0.19 0.27
y3 0.38 0.19 1.00 0.36
y4 0.21 0.27 0.36 1.00

n= 20

P
   y1     y2     y3     y4
y1        0.2712 0.0980 0.3737
y2 0.2712        0.4101 0.2490
y3 0.0980 0.4101        0.1152
y4 0.3737 0.2490 0.1152
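
If Hmisc is not available, a similar pair of matrices can be built with base R alone by combining cor with cor.test. This is a different route from the rcorr approach used in this article, offered only as a sketch. The helper name cor_pvalues and the seed are hypothetical choices made for illustration, and the data is regenerated here, so the numbers will not match the output above.

```r
set.seed(1)  # arbitrary seed (an assumption) so the sketch is reproducible
df2 <- data.frame(y1 = rpois(20, 5), y2 = rpois(20, 8),
                  y3 = rpois(20, 2), y4 = rpois(20, 5))

# cor_pvalues is a hypothetical helper, not part of base R or Hmisc.
# It runs cor.test on every pair of columns and collects the p-values.
cor_pvalues <- function(df, method = "pearson") {
  vars <- names(df)
  p <- matrix(NA_real_, length(vars), length(vars),
              dimnames = list(vars, vars))
  for (i in seq_along(vars)) {
    for (j in seq_along(vars)) {
      if (i < j) {
        p_ij <- cor.test(df[[i]], df[[j]], method = method)$p.value
        p[i, j] <- p_ij  # fill both triangles so the matrix is symmetric
        p[j, i] <- p_ij
      }
    }
  }
  p  # the diagonal stays NA
}

r_mat <- cor(df2)          # correlation coefficients
p_mat <- cor_pvalues(df2)  # matching p-values

round(r_mat, 2)
round(p_mat, 4)
```

For Pearson correlations, cor.test derives the p-value from a t statistic with n - 2 degrees of freedom, so the result should agree with rcorr up to rounding when there are no missing values.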

Updated on: 16-Mar-2021
