How to find the sum by distinct column for factor levels in an R data frame?

If a data frame contains a factor column and one or more numerical columns, we often want the sum of each numerical column within each factor level. The aggregate function does this. For example, if a data frame df has a factor column named Group and some numerical columns, the per-level sums can be computed with aggregate(.~Group,data=df,sum). In the formula, the dot on the left-hand side stands for every column of df other than Group.
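As a minimal sketch of the call described above, the following uses a small hand-made data frame so the sums can be checked by eye. The data frame and its column names x and y are hypothetical and are not part of the examples that follow.

```r
# Hypothetical data: one factor column plus two numeric columns.
df <- data.frame(
  Group = factor(c("A", "B", "A", "C", "B")),
  x = c(1, 2, 3, 4, 5),
  y = c(10, 20, 30, 40, 50)
)

# "." on the left-hand side means "every column except Group".
sums <- aggregate(. ~ Group, data = df, FUN = sum)
print(sums)
# Group A: x = 1 + 3 = 4, y = 10 + 30 = 40
# Group B: x = 2 + 5 = 7, y = 20 + 50 = 70
# Group C: x = 4,         y = 40
```

The result is itself a data frame with one row per level of Group.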

Example 1

Consider the following data frame (the values are randomly generated, so your output will differ):

Group<-factor(sample(c("A","B","C"),20,replace=TRUE))
frequency<-sample(1:10,20,replace=TRUE)
cost<-round(rnorm(20,25,6),2)
df1<-data.frame(Group,frequency,cost)
df1

Output

  Group frequency cost
1  A    6        21.69
2  C    5        34.94
3  C    3        17.32
4  B    3        16.84
5  A    10       23.10
6  C    3        30.30
7  B    8        19.84
8  A    1        25.41
9  C    2        27.55
10 A    10       26.31
11 B    7        33.05
12 A    10       32.09
13 B    1        27.36
14 A    9        19.70
15 A    5        26.44
16 A    10       28.28
17 C    6        25.67
18 A    9        24.06
19 C    3        22.25
20 A    5        24.93

Finding the sum of frequency and cost for each level of Group:

Example

aggregate(.~Group,data=df1,sum)

Output

Group frequency cost
1 A 75 252.01
2 B 19 97.09
3 C 22 158.03
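The dot in the formula sums every non-grouping column. If only some columns are wanted, they can be named explicitly. The sketch below retypes the first six rows of df1 as printed above (the name d and the result names are illustrative only) and shows three variants: a single column, a chosen set of columns via cbind, and the non-formula interface of aggregate.

```r
# First six rows of df1 as printed above, typed in by hand.
d <- data.frame(
  Group = factor(c("A", "C", "C", "B", "A", "C")),
  frequency = c(6, 5, 3, 3, 10, 3),
  cost = c(21.69, 34.94, 17.32, 16.84, 23.10, 30.30)
)

# Sum a single column by naming it on the left-hand side.
cost_only <- aggregate(cost ~ Group, data = d, FUN = sum)

# Sum a chosen set of columns by binding them with cbind().
both <- aggregate(cbind(frequency, cost) ~ Group, data = d, FUN = sum)

# The same sums through the non-formula interface.
by_list <- aggregate(d[, c("frequency", "cost")],
                     by = list(Group = d$Group), FUN = sum)

print(both)
```

All three calls return a data frame with one row per level of Group; they differ only in how the columns to be summed are specified.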

Example 2

Class<-sample(c("First","Second","Third"),20,replace=TRUE)
Price<-sample(2000:5000,20)
Seats<-sample(0:9,20,replace=TRUE)
df2<-data.frame(Class,Price,Seats)
df2

Output

Class Price Seats
1 Third 2218 4
2 Second 3064 4
3 Third 4074 2
4 First 4394 4
5 First 2321 3
6 Third 4998 1
7 First 3520 2
8 First 4133 1
9 Third 4832 9
10 Second 2856 0
11 Third 3145 7
12 Third 4604 6
13 Second 4691 9
14 First 4994 4
15 Third 2252 2
16 First 3491 0
17 Second 4125 7
18 Second 2597 2
19 Third 3720 3
20 Second 2995 0

Finding the sum of Price and Seats for each level of Class:

Example

aggregate(.~Class,data=df2,sum)

Output

Class Price Seats
1 First 22853 14
2 Second 20328 22
3 Third 29843 34
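Note that Class in this example is a character vector rather than a factor, and aggregate still groups by its distinct values. One way to cross-check the sums is with other base R functions. The sketch below retypes the first six rows of df2 as printed above (the names d2, agg, price_by_class and rs are illustrative only) and compares aggregate with tapply and rowsum.

```r
# First six rows of df2 as printed above, typed in by hand.
d2 <- data.frame(
  Class = c("Third", "Second", "Third", "First", "First", "Third"),
  Price = c(2218, 3064, 4074, 4394, 2321, 4998),
  Seats = c(4, 4, 2, 4, 3, 1)
)

# Sums of all other columns for each distinct value of Class.
agg <- aggregate(. ~ Class, data = d2, FUN = sum)

# tapply() sums one column per group and returns a named result.
price_by_class <- tapply(d2$Price, d2$Class, sum)

# rowsum() sums several columns at once; the groups become row names.
rs <- rowsum(d2[, c("Price", "Seats")], group = d2$Class)

print(agg)
print(price_by_class)
print(rs)
```

aggregate keeps the grouping values in an ordinary column, which is usually the most convenient form for further processing, whereas tapply and rowsum carry them as names.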
raja
Published on 05-Feb-2021 10:43:48