How to create a table of sums of a discrete variable for two categorical variables in an R data frame?


If we want to create a table of sums of a discrete variable for two categorical variables, the xtabs function can be used. The output is a contingency table (cross-tabulation), which looks like a matrix. For example, if we have a data frame df with two categorical columns x and y and a count column freq, the table of sums for freq can be created with xtabs(freq~x+y,data=df).
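As a minimal sketch of that call, the snippet below uses a small hand-made data frame so the sums are easy to check by eye. The values in df are hypothetical and chosen only for illustration; they do not come from the examples that follow.

```r
# Hypothetical data frame with fixed values (illustration only)
df <- data.frame(
  x    = c("a", "a", "b", "b", "a"),
  y    = c("u", "v", "u", "v", "u"),
  freq = c(1, 2, 3, 4, 5)
)

# Left of ~ is the variable to sum; right of ~ are the grouping variables.
# Rows of the result are the levels of x, columns are the levels of y.
tab <- xtabs(freq ~ x + y, data = df)
tab

# A single cell can be read by level names: x = "a", y = "u" sums 1 + 5
tab["a", "u"]
```

The first variable after the tilde becomes the rows and the second becomes the columns, so swapping them transposes the table.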

Example

Consider the below data frame −

# sample() and rpois() draw random values, so your data will differ from the output shown
x1<-sample(c("A","B"),20,replace=TRUE)
x2<-sample(c("I","II"),20,replace=TRUE)
y1<-rpois(20,5)
df1<-data.frame(x1,x2,y1)
df1

Output

   x1 x2 y1
1   A II 10
2   A  I  5
3   B II  7
4   B  I  5
5   B  I  7
6   A II  1
7   B II  2
8   B II  3
9   B  I  8
10  A  I  5
11  A II  8
12  A II  4
13  B  I  7
14  B II  4
15  B II  3
16  A  I  6
17  A  I  4
18  A  I  5
19  A II  8
20  A II  7

Creating the cross-tabulation −

Example

xtabs(y1~x1+x2,data=df1)

Output

   x2
x1   I II
  A 25 38
  B 27 19
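To make the result reproducible, the sketch below rebuilds df1 by hand from the printed rows above instead of drawing new random values. It then checks one cell with a plain sum and compares the whole table with tapply, which is offered here only as a cross-check and is not part of the original example.

```r
# df1 typed in from the printed data frame above, so no random draws are involved
df1 <- data.frame(
  x1 = c("A","A","B","B","B","A","B","B","B","A",
         "A","A","B","B","B","A","A","A","A","A"),
  x2 = c("II","I","II","I","I","II","II","II","I","I",
         "II","II","I","II","II","I","I","I","II","II"),
  y1 = c(10,5,7,5,7,1,2,3,8,5,8,4,7,4,3,6,4,5,8,7)
)

tab1 <- xtabs(y1 ~ x1 + x2, data = df1)
tab1

# One cell by hand: total of y1 over the rows where x1 is "A" and x2 is "I"
cell_A_I <- sum(df1$y1[df1$x1 == "A" & df1$x2 == "I"])
cell_A_I

# The same sums computed group by group with tapply
by_group <- tapply(df1$y1, list(df1$x1, df1$x2), sum)
by_group
```

Each cell of the xtabs result is simply the sum of y1 over the rows that share that pair of levels.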

Example

# sample() draws random values, so your data will differ from the output shown
z1<-sample(c("G1","G2","G3","G4"),20,replace=TRUE)
z2<-sample(c("S1","S2","S3"),20,replace=TRUE)
y2<-sample(1:10,20,replace=TRUE)
df2<-data.frame(z1,z2,y2)
df2

Output

   z1 z2 y2
1  G2 S1  3
2  G2 S2  9
3  G2 S1  7
4  G2 S3  7
5  G4 S2  3
6  G3 S2  7
7  G2 S2 10
8  G3 S3  1
9  G1 S1  3
10 G4 S1 10
11 G3 S2  4
12 G3 S2  9
13 G2 S1  6
14 G1 S3  3
15 G2 S3  9
16 G1 S3  4
17 G1 S1  9
18 G3 S1  4
19 G4 S2  9
20 G1 S1  6

Creating the cross-tabulation −

Example

xtabs(y2~z1+z2,data=df2)

Output

    z2
z1   S1 S2 S3
  G1 18  0  7
  G2 16 19 16
  G3  4 20  1
  G4 10 12  0
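Two cells in this table are 0 because no row in df2 has that combination of levels (G1 with S2, and G4 with S3). The sketch below rebuilds df2 by hand from the printed rows above to show this, and adds row and column totals with addmargins as an optional extra step that the original example does not include.

```r
# df2 typed in from the printed data frame above, so no random draws are involved
df2 <- data.frame(
  z1 = c("G2","G2","G2","G2","G4","G3","G2","G3","G1","G4",
         "G3","G3","G2","G1","G2","G1","G1","G3","G4","G1"),
  z2 = c("S1","S2","S1","S3","S2","S2","S2","S3","S1","S1",
         "S2","S2","S1","S3","S3","S3","S1","S1","S2","S1"),
  y2 = c(3,9,7,7,3,7,10,1,3,10,4,9,6,3,9,4,9,4,9,6)
)

tab2 <- xtabs(y2 ~ z1 + z2, data = df2)

# No rows combine G1 with S2, so xtabs reports a sum of 0 for that cell
n_rows_G1_S2 <- sum(df2$z1 == "G1" & df2$z2 == "S2")
n_rows_G1_S2
tab2["G1", "S2"]

# Append a "Sum" row and column holding the marginal totals
with_totals <- addmargins(tab2)
with_totals
```

A 0 in the table can therefore mean either that the combination never occurs or that its values add up to zero; counting the matching rows, as above, tells the two cases apart.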

Updated on: 07-Dec-2020
