How to find the total of an integer column based on two different character columns in R?


Finding the total of an integer column based on two different character columns means building a two-way table, laid out like a contingency table, in which each cell holds the sum of the integer column for one combination of the two categories (rather than a count). For this purpose, we can use the with and tapply functions. For example, if we have a data frame df that contains two categorical columns named gender and ethnicity and an integer column named Package, then the table of totals can be created as:

with(df,tapply(Package,list(gender,ethnicity),sum))
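As a minimal sketch of that call, the snippet below builds a tiny data frame and sums Package for each gender and ethnicity pair. The column names follow the description above, but the values are invented purely for illustration.

```r
# Hypothetical data: the values are made up for illustration only
df <- data.frame(
  gender    = c("F", "F", "M", "M", "F"),
  ethnicity = c("A", "B", "A", "A", "A"),
  Package   = c(10L, 20L, 30L, 40L, 50L)
)

# Rows of the result are the gender levels, columns are the ethnicity levels;
# each cell is the sum of Package for that combination
totals <- with(df, tapply(Package, list(gender, ethnicity), sum))
totals
```

Here with() simply lets the column names be used without the df$ prefix, while tapply() splits Package by the two grouping columns and applies sum to each piece.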

Example

Consider the following data frame −

set.seed(777)
Class<-sample(c("First","Second","Third"),20,replace=TRUE)
Group<-sample(c("GP1","GP2","GP3","GP4"),20,replace=TRUE)
Rate<-sample(0:10,20,replace=TRUE)
df1<-data.frame(Class,Group,Rate)
df1

Output

    Class Group Rate
1   First   GP1    7
2  Second   GP2    1
3  Second   GP4    1
4  Second   GP4    0
5   Third   GP2   10
6  Second   GP2    8
7   First   GP1    7
8   First   GP4    4
9  Second   GP1    4
10  Third   GP3    8
11 Second   GP2    8
12  First   GP2    4
13  Third   GP2    6
14  Third   GP4    4
15  Third   GP4    5
16 Second   GP1    2
17 Second   GP1    9
18 Second   GP3    2
19 Second   GP3    1
20  Third   GP4   10

Example

str(df1)
'data.frame': 20 obs. of 3 variables:
$ Class: chr "First" "Second" "Second" "Second" ...
$ Group: chr "GP1" "GP2" "GP4" "GP4" ...
$ Rate : int 7 1 1 0 10 8 7 4 4 8 ...

Finding the total of Rate based on Class and Group −

with(df1,tapply(Rate,list(Class,Group),sum))
       GP1 GP2 GP3 GP4
First   14   4  NA   4
Second  15  17   3   1
Third   NA  16   8  19
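The NA entries mark Class and Group combinations that have no rows in the data (for example, no First row belongs to GP3), so there is nothing to sum. If zeros are preferred, two options are sketched below on a small made-up data frame rather than on df1: xtabs(), which sums the left-hand side of its formula and reports 0 for empty combinations, and replacing the NA cells after the fact. The name small and its values are assumptions for illustration.

```r
# Hypothetical data with two empty combinations (Second/GP2 and Third/GP1)
small <- data.frame(
  Class = c("First", "First", "Second", "Third"),
  Group = c("GP1", "GP2", "GP1", "GP2"),
  Rate  = c(7L, 4L, 2L, 6L)
)

# tapply() leaves NA where a combination has no rows
by_tapply <- with(small, tapply(Rate, list(Class, Group), sum))

# xtabs() sums Rate per combination and fills empty cells with 0
by_xtabs <- xtabs(Rate ~ Class + Group, data = small)

# Alternatively, overwrite the NA cells of the tapply() result with 0
filled <- by_tapply
filled[is.na(filled)] <- 0

by_tapply
by_xtabs
filled
```

Whether 0 or NA is the better choice depends on the analysis: NA keeps "no data" distinct from a genuine total of zero.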

Let’s have a look at another example −

Example

Gender<-sample(c("Male","Female"),20,replace=TRUE)
Centering<-sample(c("Yes","No"),20,replace=TRUE)
Percentage<-sample(1:100,20)
df2<-data.frame(Gender,Centering,Percentage)
df2

Output

   Gender Centering Percentage
1    Male        No         28
2    Male        No         89
3  Female       Yes         38
4    Male        No         78
5    Male       Yes         19
6  Female        No         46
7  Female       Yes         94
8    Male        No          4
9    Male       Yes         92
10   Male        No         90
11   Male       Yes         66
12 Female        No         57
13 Female        No         74
14 Female        No         48
15 Female       Yes         20
16   Male       Yes         51
17   Male        No         82
18   Male        No          7
19   Male        No         53
20   Male        No         55

Finding the total of Percentage based on Gender and Centering −

with(df2,tapply(Percentage,list(Gender,Centering),sum))
        No Yes
Female 225 152
Male   486 228
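When one row per combination is more convenient than a matrix, the same totals can also be produced in long format with aggregate(). The sketch below uses a small made-up data frame (the name demo and its values are assumptions, not the df2 shown above) and compares both layouts.

```r
# Hypothetical data: a handful of invented rows for illustration
demo <- data.frame(
  Gender     = c("Male", "Male", "Female", "Female", "Male"),
  Centering  = c("No", "Yes", "No", "No", "No"),
  Percentage = c(28L, 19L, 46L, 57L, 89L)
)

# Wide layout: a Gender-by-Centering matrix of sums
wide <- with(demo, tapply(Percentage, list(Gender, Centering), sum))

# Long layout: one row per Gender/Centering combination present in the data
long <- aggregate(Percentage ~ Gender + Centering, data = demo, FUN = sum)

wide
long
```

Note that aggregate() only returns combinations that actually occur, whereas tapply() keeps a cell (filled with NA) for every pairing of levels.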

Updated on: 17-Oct-2020
