How to merge rows having same values in an R data frame?

R ProgrammingServer Side ProgrammingProgramming

To merge rows having same values in an R data frame, we can use the aggregate function.

For example, if we have a data frame called df that contains two categorical columns say C1 and C2 and one numerical column Num then we can merge the rows of df by summing the values in Num for the combination of values in C1 and C2 by using the below given command −

aggregate(Num~.,df,FUN=sum)

Example 1

Following snippet creates a sample data frame −

grp1<-sample(LETTERS[1:3],20,replace=TRUE)
grp2<-sample(letters[1:3],20,replace=TRUE)
Response<-rpois(20,5)
df1<-data.frame(grp1,grp2,Response)
df1

Output

The following dataframe is created −

 grp1 grp2 Response
1  A   b   2
2  A   c   2
3  B   a   8
4  B   c   7
5  C   a   7
6  C   b   6
7  C   b   3
8  B   c   6
9  A   a   7
10 A   c   7
11 C   a  11
12 A   b   7
13 B   c   4
14 A   c   2
15 B   a   2
16 B   c   2
17 A   b   2
18 B   c   5
19 C   a   4
20 C   b   2

In order to merge the rows of df1 by finding the sum of Response, add the following code to the above snippet −

grp1<-sample(LETTERS[1:3],20,replace=TRUE)
grp2<-sample(letters[1:3],20,replace=TRUE)
Response<-rpois(20,5)
df1<-data.frame(grp1,grp2,Response)
aggregate(Response~.,df1,FUN=sum)

Output

If you execute all the above given snippets as a single program, it generates the following output −

 grp1 grp2 Response
1  A   a    7
2  B   a   10
3  C   a   22
4  A   b   11
5  C   b   11
6  A   c   11
7  B   c   24

Example 2

Following snippet creates a sample data frame −

factor1<-sample(c("Male","Female"),20,replace=TRUE)
factor2<-sample(c("I","II","III"),20,replace=TRUE)
Score<-sample(1:50,20)
df2<-data.frame(factor1,factor2,Score)
df2

The following dataframe is created −

factor1  factor2 Score
1  Female III    32
2  Male   III    18
3  Male   III    21
4  Male   III    49
5  Male     I    38
6  Female III     6
7  Male     I     7
8  Female   I    44
9  Female III    30
10 Female   I    26
11 Male    II    43
12 Male   III    41
13 Female   I     4
14 Female   I    33
15 Male     I    15
16 Female  II    13
17 Female  III   19
18 Female  III   46
19 Male     II   17
20 Female    I   31

In order to merge the rows of df2 by finding the sum of Score, add the following code to the above snippet −

factor1<-sample(c("Male","Female"),20,replace=TRUE)
factor2<-sample(c("I","II","III"),20,replace=TRUE)
Score<-sample(1:50,20)
df2<-data.frame(factor1,factor2,Score)
aggregate(Score~.,df2,FUN=sum)

Output

If you execute all the above given snippets as a single program, it generates the following output −

   factor1 factor2 Score
1  Female   I      138
2  Male     I       60
3  Female  II       13
4  Male    II       60
5  Female III      133
6  Male   III      129
raja
Published on 05-Nov-2021 08:00:06

Advertisements