How to find the correlation matrix by considering only numerical columns in an R data frame?


To calculate a correlation matrix for a data frame with cor(), every column must be numeric. If any column is not, R stops with the error Error in cor(data_frame_name) : 'x' must be numeric. To solve this problem, we can either find the correlations between the numeric variables one pair at a time, or use sapply with is.numeric to select only the numeric columns before calling cor.

Example

Consider the below data frame −

set.seed(99)
x1<-rnorm(20)
x2<-rpois(20,5)
x3<-rpois(20,2)
x4<-LETTERS[1:20]
x5<-runif(20,2,10)
x6<-sample(letters[1:3],20,replace=TRUE)
df<-data.frame(x1,x2,x3,x4,x5,x6)
df

Output

              x1 x2 x3 x4       x5 x6
1   0.2139625022  7  4  A 6.423159  a
2   0.4796581346  5  1  B 7.176488  a
3   0.0878287050  7  2  C 2.372402  c
4   0.4438585075  8  3  D 6.599771  a
5  -0.3628379205  5  2  E 5.122577  c
6   0.1226740295  8  3  F 3.133224  c
7  -0.8638451881  4  2  G 2.482256  a
8   0.4896242667  4  4  H 4.532982  c
9  -0.3641169125  5  0  I 2.670717  c
10 -1.2942420067  2  3  J 8.597253  a
11 -0.7457690454  6  0  K 2.699053  a
12  0.9215503620  7  2  L 8.743498  b
13  0.7500543504  6  2  M 3.427915  c
14 -2.5085540159 10  2  N 5.928563  a
15 -3.0409340953  4  2  O 3.544168  a
16  0.0002658005  7  0  P 3.710395  c
17 -0.3940189942  2  2  Q 9.609634  c
18 -1.7450276608  5  0  R 5.886087  b
19  0.4986314508  8  2  S 5.507034  c
20  0.2709537888  4  3  T 2.137873  b

Finding the correlation matrix for columns in df −

cor(df)
Error in cor(df) : 'x' must be numeric

Here, the error means that not all of the columns are numeric. The structure of df shows which ones are not −

str(df)
'data.frame': 20 obs. of 6 variables:
$ x1: num 0.214 0.4797 0.0878 0.4439 -0.3628 ...
$ x2: int 7 5 7 8 5 8 4 4 5 2 ...
$ x3: int 4 1 2 3 2 3 2 4 0 3 ...
$ x4: Factor w/ 20 levels "A","B","C","D",..: 1 2 3 4 5 6 7 8 9 10 ...
$ x5: num 6.42 7.18 2.37 6.6 5.12 ...
$ x6: Factor w/ 3 levels "a","b","c": 1 1 3 1 3 3 1 3 3 1 ...
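
The same check can be done programmatically instead of reading the str output. The following is a minimal sketch, assuming the same df as above (recreated here so the snippet runs on its own): sapply applies is.numeric to each column and returns a named logical vector. Note that in R 4.0.0 and later, data.frame() keeps strings as character instead of converting them to factors by default, so x4 and x6 may appear as chr in str(df); is.numeric is FALSE in either case.

```r
# Recreate the example data frame
set.seed(99)
x1 <- rnorm(20)
x2 <- rpois(20, 5)
x3 <- rpois(20, 2)
x4 <- LETTERS[1:20]
x5 <- runif(20, 2, 10)
x6 <- sample(letters[1:3], 20, replace = TRUE)
df <- data.frame(x1, x2, x3, x4, x5, x6)

# TRUE for numeric (double or integer) columns, FALSE for the others
is_num <- sapply(df, is.numeric)
print(is_num)
#    x1    x2    x3    x4    x5    x6
#  TRUE  TRUE  TRUE FALSE  TRUE FALSE

# Names of the columns that cause cor(df) to fail
print(names(df)[!is_num])
# [1] "x4" "x6"
```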

Now, to find the correlation matrix for only the numeric columns, we can use sapply with is.numeric to pick those columns and pass the resulting data frame to cor −

Example

cor(df[sapply(df,is.numeric)])

Output

           x1          x2          x3          x5
x1 1.00000000  0.14685889  0.23107456  0.04232205
x2 0.14685889  1.00000000 -0.02664914 -0.14822679
x3 0.23107456 -0.02664914  1.00000000  0.18971761
x5 0.04232205 -0.14822679  0.18971761  1.00000000
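
The other option mentioned at the start, finding the correlations one by one, can be sketched as follows. This assumes the same df (recreated here so the snippet is self-contained); the variable names num_df, cor_mat and r12 are only illustrative. A single pair passed to cor should agree with the matching entry of the matrix, and Filter(is.numeric, df) is another base R way to keep just the numeric columns. The exact correlation values may depend on the R version, because the random number generation behind sample changed in R 3.6.0, so no numeric output is shown.

```r
# Recreate the example data frame
set.seed(99)
x1 <- rnorm(20)
x2 <- rpois(20, 5)
x3 <- rpois(20, 2)
x4 <- LETTERS[1:20]
x5 <- runif(20, 2, 10)
x6 <- sample(letters[1:3], 20, replace = TRUE)
df <- data.frame(x1, x2, x3, x4, x5, x6)

# Correlation matrix of the numeric columns only
num_df <- df[sapply(df, is.numeric)]
cor_mat <- cor(num_df)

# One pair at a time: correlation between x1 and x2
r12 <- cor(df$x1, df$x2)
print(isTRUE(all.equal(r12, cor_mat["x1", "x2"])))
# [1] TRUE

# Filter() keeps the columns for which is.numeric returns TRUE
print(names(Filter(is.numeric, df)))
# [1] "x1" "x2" "x3" "x5"
```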
raja
Published on 24-Aug-2020 15:32:33