How to find the proportion of categories based on another categorical column in R's data.table object?

R ProgrammingServer Side ProgrammingProgramming

To find the proportion of categories based on another categorical column in R's data.table object, we can follow the below steps −

  • First of all, create a data.table object.
  • Finding the proportion based on categorical column.

Create a data.table object

Loading data.table package and creating a data.table object with two categorical columns −

library(data.table)
Category1<-sample(LETTERS[1:3],30,replace=TRUE)
Category2<-sample(letters[1:4],30,replace=TRUE)
DT<-data.table(Category1,Category2)
DT

On executing, the above script generates the below output(this output will vary on your system due to randomization) −

   Category1 Category2
1:    C       d
2:    B       a
3:    C       d
4:    C       a
5:    B       c
6:    C       b
7:    B       a
8:    B       b
9:    B       c
10:    B       d
11:    B       a
12:    A       d
13:    C       d
14:    B       a
15:    A       a
16:    C       a
17:    B       d
18:    C       c
19:    C       c
20:    A       b
21:    A       d
22:    C       c
23:    B       d
24:    C       c
25:    A       d
26:    A       b
27:    B       a
28:    A       c
29:    B       d
30:    A       d
   Category1 Category2

Finding the proportion based on categorical column

Use tabulate function with factor function to find the proportion of Category2 values in Category1 values −

library(data.table)
Category1<-sample(LETTERS[1:3],30,replace=TRUE)
Category2<-sample(letters[1:4],30,replace=TRUE)
DT<-data.table(Category1,Category2)
DT[order(Category2),.(Category2=letters[1:4],Proportion=tabulate(factor(Category2))/.N
),by=Category1]
   Category1 Category2    Proportion
1:    B          a       0.41666667
2:    B          b       0.08333333
3:    B          c       0.16666667
4:    B          d       0.33333333
5:    C          a       0.20000000
6:    C          b       0.10000000
7:    C          c       0.40000000
8:    C          d       0.30000000
9:    A          a       0.12500000
10:   A          b       0.25000000
11:   A          c       0.12500000
12:   A          d       0.50000000
raja
Published on 05-Aug-2021 13:25:10
Advertisements