How to find the sum of values based on a key in another column of an R data frame?


If a data frame has a key column, we can treat that column as the grouping (independent) variable and compute statistics such as the sum, mean, standard deviation, or range of the dependent variable within each group. This can be done by combining the with and tapply functions, as shown in the examples below.

Consider the data frame below:

Example

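# Key column: 20 group labels drawn at random (no seed is set, so values differ between runs)
x1 <- sample(c("A", "B", "C"), 20, replace = TRUE)
# Value column: 20 Poisson counts with mean 5
y1 <- rpois(20, 5)
df1 <- data.frame(x1, y1)
df1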

Output

   x1  y1
1  C   0
2  A   4
3  C   5
4  C   5
5  A   5
6  C   3
7  B   7
8  B   6
9  C   6
10 C   13
11 C   6
12 C   5
13 C   6
14 A   7
15 B   4
16 C   1
17 C   7
18 B   6
19 B   3
20 B   5

Finding the sum of y1 for each value of x1:

with(df1,tapply(y1,x1,FUN=sum))

 A  B  C
16 31 57
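
The same pattern extends to the other statistics mentioned above by changing the FUN argument. The following is a minimal sketch on a small fixed data frame; the names df_demo, key, and value are made up for illustration and are not part of the example above.

```r
# Hypothetical fixed data so the results are reproducible
df_demo <- data.frame(
  key   = c("A", "B", "A", "C", "B", "A"),
  value = c(4, 7, 5, 3, 6, 7)
)

# Sum of value within each key
sums <- with(df_demo, tapply(value, key, FUN = sum))

# Changing FUN gives other per-group statistics
means <- with(df_demo, tapply(value, key, FUN = mean))

# sd() of a group with a single value (here "C") is NA
sds <- with(df_demo, tapply(value, key, FUN = sd))

# range() returns two numbers per group, so the result holds one vector per key
ranges <- with(df_demo, tapply(value, key, FUN = range))

sums
means
```

Because range returns more than one number per group, the result is indexed per key, for example ranges[["A"]].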

Example

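# Key column: 20 country labels drawn at random (no seed is set, so values differ between runs)
x2 <- sample(c("India", "Indonesia", "UK"), 20, replace = TRUE)
# Value column: 20 Poisson counts with mean 10
y2 <- rpois(20, 10)
df2 <- data.frame(x2, y2)
df2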

Output

    x2        y2
1  India      11
2  India       8
3  Indonesia  16
4  India       8
5  Indonesia  10
6  UK         16
7  India      16
8  Indonesia   9
9  Indonesia  11
10 India       9
11 UK          7 
12 India      14
13 Indonesia   9
14 India      12
15 UK          8
16 Indonesia  10
17 UK         14
18 India       9
19 India      13
20 Indonesia  10

Finding the sum of y2 for each value of x2:

with(df2,tapply(y2,x2,FUN=sum))

    India Indonesia        UK
      100        75        45
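
tapply returns a named array rather than a data frame. If a data frame result is more convenient, base R's aggregate function is one possible alternative. The sketch below uses a small fixed data frame; df_demo, country, and amount are hypothetical names chosen for illustration, and aggregate is a swapped-in technique, not the method used in the examples above.

```r
# Hypothetical fixed data (not the random samples from the examples)
df_demo <- data.frame(
  country = c("India", "UK", "India", "Indonesia", "UK"),
  amount  = c(11, 16, 8, 10, 7)
)

# tapply returns a named array with one element per key
by_tapply <- with(df_demo, tapply(amount, country, FUN = sum))

# aggregate returns a data frame with one row per key
by_aggregate <- aggregate(amount ~ country, data = df_demo, FUN = sum)

by_tapply
by_aggregate
```

Both calls compute the same per-key sums; the choice mainly depends on whether a named array or a data frame is easier to use in later steps.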

Updated on: 06-Feb-2021
