# How to find the correlation matrix of groups for a data.table object in R?


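The cor function computes the correlation between two numeric vectors, but it has no grouping argument, so it cannot be applied directly to the groups of a data.table object.

Instead, we first set the key on the group column of the data.table object and then extract each group's values by key. For example, if we have a data.table DT with one numerical column x and one group column Group that has four groups a, b, c, and d, then the correlation between the x values of groups a and b can be found as shown here. Because cor pairs the values by position, both groups must contain the same number of observations.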

library(data.table)
setkey(DT, Group)            # sort DT by Group and mark Group as the key
cor(DT["a"]$x, DT["b"]$x)    # subset each group by key value, then correlate
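The pattern above can be tried end to end with a minimal sketch such as the following. The table DT, its values, the seed, and the helper names xa and xb are made up for illustration; only setkey, key-based subsetting, and cor come from the text.

```r
library(data.table)

set.seed(42)  # assumed seed, only so the sketch is reproducible

# Hypothetical data: 5 observations in each of the groups a, b, c, d
DT <- data.table(x = rnorm(20), Group = rep(c("a", "b", "c", "d"), each = 5))

setkey(DT, Group)   # sorts DT by Group and marks Group as the key

xa <- DT["a"]$x     # key-based subset: all x values of group a
xb <- DT["b"]$x     # key-based subset: all x values of group b

# cor() pairs values by position, so the two vectors must have equal length
stopifnot(length(xa) == length(xb))
cor(xa, xb)
```

If the groups had different sizes, cor would stop with an error about incompatible dimensions, which is why the length check comes first.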

## Example

Consider the following data.table object:

x <- rnorm(20, 1, 0.04)           # 20 normal draws with mean 1 and sd 0.04
Class <- rep(LETTERS[1:2], 10)    # alternating labels A, B, A, B, ...
DT1 <- data.table(x, Class)
DT1

## Output

            x Class
 1: 1.0315869     A
 2: 1.0240505     B
 3: 0.9820461     A
 4: 1.0095865     B
 5: 1.0025895     A
 6: 1.0076078     B
 7: 1.0266381     A
 8: 0.9735519     B
 9: 1.0457029     A
10: 1.0407300     B
11: 1.0384560     A
12: 0.9798408     B
13: 0.9810080     A
14: 1.0602431     B
15: 0.9968140     A
16: 1.0239540     B
17: 0.9675810     A
18: 1.0723230     B
19: 0.9705898     A
20: 1.0713552     B

Finding the correlation between classes A and B:

## Example

setkey(DT1, Class)                  # key on the group column
cor(DT1["A"]$x, DT1["B"]$x)         # correlate the x values of class A and class B

## Output

[1] -0.6282066
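As a side note, the same correlation can also be computed without setting a key, by filtering rows in i and returning the column in j. The sketch below is only an illustration: the seed and the names r_filter and r_key are assumptions, and because the data above were generated without a seed, the value will differ from the output shown.

```r
library(data.table)

set.seed(1)  # assumed seed; the example data above were generated without one
DT1 <- data.table(x = rnorm(20, 1, 0.04), Class = rep(LETTERS[1:2], 10))

# Filter rows in i and return column x as a plain vector in j
r_filter <- cor(DT1[Class == "A", x], DT1[Class == "B", x])

# Key-based version, for comparison
setkey(DT1, Class)
r_key <- cor(DT1["A"]$x, DT1["B"]$x)

# Both approaches keep the within-group row order, so they should agree
all.equal(r_filter, r_key)
```

The filter form leaves the row order of the table untouched, whereas setkey reorders the table by the key column.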

## Example

y <- rpois(20, 5)                             # 20 Poisson draws with mean 5
Group <- rep(c("S1", "S2", "S3", "S4"), 5)    # four groups, five values each
DT2 <- data.table(y, Group)
DT2

## Output

    y Group
 1: 3    S1
 2: 3    S2
 3: 5    S3
 4: 7    S4
 5: 9    S1
 6: 6    S2
 7: 7    S3
 8: 6    S4
 9: 4    S1
10: 5    S2
11: 6    S3
12: 4    S4
13: 9    S1
14: 6    S2
15: 4    S3
16: 6    S4
17: 8    S1
18: 5    S2
19: 2    S3
20: 1    S4

## Example

setkey(DT2, Group)                   # key on the group column
cor(DT2["S1"]$y, DT2["S2"]$y)        # correlate the y values of S1 and S2

## Output

[1] 0.8502303

## Example

cor(DT2["S1"]$y,DT2["S3"]$y)

## Output

[1] -0.1984965

## Example

cor(DT2["S1"]$y,DT2["S4"]$y)

## Output

[1] -0.1962715

## Example

cor(DT2["S2"]$y,DT2["S3"]$y)

## Output

[1] 0.1061191

## Example

cor(DT2["S2"]$y,DT2["S4"]$y)

## Output

[1] -0.1709964

## Example

cor(DT2["S3"]$y,DT2["S4"]$y)

## Output

[1] 0.6423677
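The six pairwise calls above can be collected into a single correlation matrix. One possible sketch, assuming every group has the same number of observations, is to split the values by group, bind them as columns, and call cor once on the resulting matrix. The seed and the names wide and cor_mat are illustrative assumptions, so the numbers will not match the outputs above.

```r
library(data.table)

set.seed(7)  # assumed seed for reproducibility
DT2 <- data.table(y = rpois(20, 5), Group = rep(c("S1", "S2", "S3", "S4"), 5))

# split() gives one vector of y per group; cbind them into a 5 x 4 matrix
# (this relies on every group having the same number of observations)
wide <- do.call(cbind, split(DT2$y, DT2$Group))

# cor() on a matrix correlates every pair of columns
cor_mat <- cor(wide)
round(cor_mat, 3)
```

Each off-diagonal entry, for example cor_mat["S1", "S2"], corresponds to one of the pairwise cor calls, and the diagonal holds each group's correlation with itself.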