How to extract the row for groupwise maximum in another column of an R data.table object?

R ProgrammingServer Side ProgrammingProgramming

To extract the row for groupwise maximum in another column of an R data.table object, can make use of which.max function by defining the grouping column. It means that if we have a categorical/grouping column and a numerical column then we groupwise maximum will be the maximum for each grouping level in the numerical column and we can extract the row based on these two columns. Check out the examples to understand how it works.

Example1

Loading data.table package and creating a data.table object −

> library(data.table)
> x1<-sample(c("A","B","C"),20,replace=TRUE)
> x2<-rpois(20,5)
> x3<-rpois(20,2)
> DT1<-data.table(x1,x2,x3)
> DT1

Output

   x1 x2 x3

1:  B  3  2 2:  C  6  0 3:  B  4  1 4:  C  8  3 5:  A  3  6 6:  B  5  3 7:  C  4  1 8:  B  4  0 9:  B  5  2 10: C  6  1 11: A  4  1 12: C  5  0 13: B  2  3 14: A  5  0 15: C  8  6 16: A  5  2 17: B  4  2 18: A  3  3 19: C 10  2 20: C  3  2

Extracting groupwise maximum rows from DT1 −

> DT1[,.SD[which.max(x2)],by=x1]

Output

   x1 x2 x3
1:  B  5  3
2:  C 10  2
3:  A  5  0

Example2

> y1<-sample(c("Male","Female"),20,replace=TRUE)
> y2<-rnorm(20)
> y3<-rnorm(20)
> DT2<-data.table(y1,y2,y3)
> DT2

Output

        y1          y2         y3
 1: Female  0.09094138 -0.4011408
 2:   Male -0.51845798  0.9946824
 3:   Male  0.73189425  0.2013690
 4:   Male  0.58616939  0.6290771
 5:   Male  2.53714401 -0.9434801
 6: Female -0.98726606 -0.9564542
 7:   Male  1.28230337  0.2018570
 8: Female -0.60125038  1.0522084
 9: Female  1.06912678 -0.3825166
10: Female  0.99567103 -0.1200035
11:   Male  0.66163046 -0.3596741
12:   Male -0.62465260  2.2215039
13:   Male  2.09315525  1.4402211
14:   Male -1.18256083  0.3528192
15:   Male -0.36751044  0.4837127
16:   Male -0.23044236 -0.8761699
17:   Male -0.84228258 -0.5922790
18: Female  0.80129337  1.5403199
19:   Male  0.76037129 -0.4590728
20: Female  0.17482961  0.3189389

Extracting groupwise maximum rows from DT2 −

> DT2[,.SD[which.max(y3)],by=y1]

Output

       y1         y2       y3
1: Female  0.8012934 1.540320
2:   Male -0.6246526 2.221504
raja
Published on 05-Mar-2021 11:07:30
Advertisements