How to find the correlation matrix of groups for a data.table object in R?

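To find the correlation between groups, we can use the cor function, but it cannot be applied directly to a grouped column. cor expects two numeric vectors of equal length, so the values of each group must be extracted first, and the groups being compared must contain the same number of observations.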

For this purpose, we first need to set the key of the data.table object to the group column. For example, if we have a data.table DT with one numerical column x and one group column Group containing four groups a, b, c, and d, then the correlation between the numerical values of groups a and b can be found as:

setkey(DT,Group)
cor(DT["a"]$x,DT["b"]$x)
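As a minimal sketch of this approach, the block below builds a small hypothetical data.table (the values are made up for illustration) and computes the correlation between groups a and b. Note that cor pairs the values by position, so the result is only meaningful when the rows of the two groups correspond to each other in order.

```r
library(data.table)

# Hypothetical data: four groups with three observations each
DT <- data.table(
  x     = c(1, 2, 3,  2, 4, 6,  3, 1, 2,  5, 5, 6),
  Group = rep(c("a", "b", "c", "d"), each = 3)
)

# Sort by Group and mark it as the key, which enables keyed subsetting
setkey(DT, Group)

# DT["a"] selects the rows whose key value equals "a"
r_ab <- cor(DT["a"]$x, DT["b"]$x)

# Equivalent form that extracts the x column directly in j
r_ab_alt <- cor(DT[.("a"), x], DT[.("b"), x])

print(r_ab)  # 1, because the values of b are exactly twice those of a
```

Both forms rely on the key set by setkey; the second one avoids building an intermediate data.table just to pull out one column.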

Load the data.table package before running the examples:

library(data.table)

Example

Consider the following data.table object:

x<-rnorm(20,1,0.04)
Class<-rep(LETTERS[1:2],10)
DT1<-data.table(x,Class)
DT1

Output

            x Class
 1: 1.0315869     A
 2: 1.0240505     B
 3: 0.9820461     A
 4: 1.0095865     B
 5: 1.0025895     A
 6: 1.0076078     B
 7: 1.0266381     A
 8: 0.9735519     B
 9: 1.0457029     A
10: 1.0407300     B
11: 1.0384560     A
12: 0.9798408     B
13: 0.9810080     A
14: 1.0602431     B
15: 0.9968140     A
16: 1.0239540     B
17: 0.9675810     A
18: 1.0723230     B
19: 0.9705898     A
20: 1.0713552     B

Finding the correlation between classes A and B:

Example

setkey(DT1,Class)
cor(DT1["A"]$x,DT1["B"]$x)

Output

[1] -0.6282066
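Setting a key is not the only way to extract the groups. As a hedged alternative sketch, the same correlation can be computed by filtering rows with a logical condition. The x values below are copied from the printed DT1 (rounded to 7 decimals), so the result should come out close to the value shown above rather than identical in every digit; the names xA and xB are introduced here for illustration.

```r
library(data.table)

# Values copied from the printed DT1 (rounded to 7 decimals)
x <- c(1.0315869, 1.0240505, 0.9820461, 1.0095865, 1.0025895,
       1.0076078, 1.0266381, 0.9735519, 1.0457029, 1.0407300,
       1.0384560, 0.9798408, 0.9810080, 1.0602431, 0.9968140,
       1.0239540, 0.9675810, 1.0723230, 0.9705898, 1.0713552)
Class <- rep(LETTERS[1:2], 10)
DT1 <- data.table(x, Class)

# Filter rows with a condition in i and return column x as a vector in j
xA <- DT1[Class == "A", x]
xB <- DT1[Class == "B", x]

r <- cor(xA, xB)
print(r)
```

This form leaves the row order of DT1 untouched, whereas setkey physically sorts the table by the key column.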

Example

y<-rpois(20,5)
Group<-rep(c("S1","S2","S3","S4"),5)
DT2<-data.table(y,Group)
DT2

Output

    y Group
 1: 3    S1
 2: 3    S2
 3: 5    S3
 4: 7    S4
 5: 9    S1
 6: 6    S2
 7: 7    S3
 8: 6    S4
 9: 4    S1
10: 5    S2
11: 6    S3
12: 4    S4
13: 9    S1
14: 6    S2
15: 4    S3
16: 6    S4
17: 8    S1
18: 5    S2
19: 2    S3
20: 1    S4

Example

setkey(DT2,Group)
cor(DT2["S1"]$y,DT2["S2"]$y)

Output

[1] 0.8502303

Example

cor(DT2["S1"]$y,DT2["S3"]$y)

Output

[1] -0.1984965

Example

cor(DT2["S1"]$y,DT2["S4"]$y)

Output

[1] -0.1962715

Example

cor(DT2["S2"]$y,DT2["S3"]$y)

Output

[1] 0.1061191

Example

cor(DT2["S2"]$y,DT2["S4"]$y)

Output

[1] -0.1709964

Example

cor(DT2["S3"]$y,DT2["S4"]$y)

Output

[1] 0.6423677
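The six pairwise calls above can also be collected into a single correlation matrix. The following is a minimal sketch, assuming all groups have the same number of rows: it rebuilds DT2 from the printed output, places each group's values in its own column, and passes the resulting matrix to cor. The names group_cols and cor_mat are introduced here for illustration.

```r
library(data.table)

# DT2 rebuilt from the printed output shown earlier
y <- c(3, 3, 5, 7, 9, 6, 7, 6, 4, 5, 6, 4, 9, 6, 4, 6, 8, 5, 2, 1)
Group <- rep(c("S1", "S2", "S3", "S4"), 5)
DT2 <- data.table(y, Group)

# split() returns one vector per group; cbind() places them side by side
# as the columns of a 5 x 4 matrix (this requires equal group sizes)
group_cols <- do.call(cbind, split(DT2$y, DT2$Group))

# cor() on a matrix returns the correlation for every pair of columns
cor_mat <- cor(group_cols)
print(round(cor_mat, 7))
```

The off-diagonal entries of cor_mat correspond to the individual results above, for example cor_mat["S1", "S2"] for the S1 and S2 pair.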