How to find the total by year column in an R data frame?


To find the total of a numerical column for each year in an R data frame, we can use the aggregate() function with sum() as the summary function.

For example, if we have a data frame called df that contains a year column, say Year, and a numerical column, say Demand, then we can find the total Demand by Year with the command given below:

aggregate(df["Demand"],by=df["Year"],sum)
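As a minimal sketch of that command, assume a small hand-made data frame; the values below are illustrative only and are not taken from the examples that follow.

```r
# Illustrative data: two years with a few Demand values each (assumed values)
df <- data.frame(
  Year   = c(2020, 2020, 2021, 2021, 2021),
  Demand = c(10, 20, 5, 15, 25)
)

# Sum Demand within each distinct Year
totals <- aggregate(df["Demand"], by = df["Year"], sum)
print(totals)
#   Year Demand
# 1 2020     30
# 2 2021     45
```

The result is itself a data frame with one row per distinct Year, sorted by the grouping column.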

Example 1

The following snippet creates a sample data frame:

Year<-sample(2001:2005,20,replace=TRUE)
Sales<-sample(500:1000,20)
df1<-data.frame(Year,Sales)
df1

The following data frame is created (the values come from sample(), so they differ from run to run):

   Year Sales
1  2001   537
2  2005   742
3  2003   551
4  2003   590
5  2001   792
6  2003   985
7  2003   765
8  2003   993
9  2003   764
10 2003   855
11 2001   959
12 2004   607
13 2002   555
14 2002   566
15 2005   596
16 2003   714
17 2005   846
18 2004   910
19 2005   849
20 2002   740

To find the total Sales by Year for the data frame created above, add the aggregate() call to the snippet, so that the complete program reads:

Year<-sample(2001:2005,20,replace=TRUE)
Sales<-sample(500:1000,20)
df1<-data.frame(Year,Sales)
aggregate(df1["Sales"],by=df1["Year"],sum)

Output

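If you run the snippets above as a single program, it generates output of the following form. Because sample() draws random values, the exact numbers change on every run; the totals below correspond to the data frame shown above: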

  Year Sales
1 2001  2288
2 2002  1861
3 2003  6217
4 2004  1517
5 2005  3033
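To check these totals without depending on random sampling, one option is to enter the rows of the data frame printed in Example 1 by hand. The sketch below does that and also uses the formula interface of aggregate() as an alternative spelling of the same computation; the names df1_fixed, totals_by and totals_formula are illustrative and not part of the original example.

```r
# Rows copied from the data frame printed in Example 1, so the result is reproducible
Year <- c(2001, 2005, 2003, 2003, 2001, 2003, 2003, 2003, 2003, 2003,
          2001, 2004, 2002, 2002, 2005, 2003, 2005, 2004, 2005, 2002)
Sales <- c(537, 742, 551, 590, 792, 985, 765, 993, 764, 855,
           959, 607, 555, 566, 596, 714, 846, 910, 849, 740)
df1_fixed <- data.frame(Year, Sales)  # illustrative name

# Same call as in the example
totals_by <- aggregate(df1_fixed["Sales"], by = df1_fixed["Year"], sum)

# Formula interface: "Sales grouped by Year"
totals_formula <- aggregate(Sales ~ Year, data = df1_fixed, FUN = sum)

print(totals_formula)
#   Year Sales
# 1 2001  2288
# 2 2002  1861
# 3 2003  6217
# 4 2004  1517
# 5 2005  3033
```

Both calls give the same totals; the formula form can be easier to read when several grouping columns are involved.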

Example 2

The following snippet creates a sample data frame:

Years<-sample(2011:2016,20,replace=TRUE)
GDP_Variation<-sample(1:10,20,replace=TRUE)
df2<-data.frame(Years,GDP_Variation)
df2

The following data frame is created (the values come from sample(), so they differ from run to run):

   Years GDP_Variation
1   2011            10
2   2011             7
3   2014             3
4   2016             8
5   2012            10
6   2016             9
7   2011             9
8   2013             7
9   2016             3
10  2016             6
11  2016             6
12  2012             3
13  2013             6
14  2015             5
15  2013             1
16  2011             8
17  2013             4
18  2015             5
19  2016             7
20  2013             8

To find the total GDP_Variation by Years for the data frame created above, add the aggregate() call to the snippet, so that the complete program reads:

Years<-sample(2011:2016,20,replace=TRUE)
GDP_Variation<-sample(1:10,20,replace=TRUE)
df2<-data.frame(Years,GDP_Variation)
aggregate(df2["GDP_Variation"],by=df2["Years"],sum)

Output

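If you run the snippets above as a single program, it generates output of the following form. Because sample() draws random values, the exact numbers change on every run; the totals below correspond to the data frame shown above: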

  Years GDP_Variation
1  2011            34
2  2012            13
3  2013            26
4  2014             3
5  2015            10
6  2016            39
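Since sample() produces different data on every run, a reproducible variant of this example can fix the random seed first. The sketch below assumes an arbitrary seed of 123 and cross-checks the aggregate() result against tapply(), another base R way to compute grouped sums; the seed and the names totals2 and by_year are illustrative choices, and the resulting numbers will not match the output above.

```r
set.seed(123)  # arbitrary seed (an assumption), only so repeated runs give the same data

Years <- sample(2011:2016, 20, replace = TRUE)
GDP_Variation <- sample(1:10, 20, replace = TRUE)
df2 <- data.frame(Years, GDP_Variation)

# Grouped totals as a data frame
totals2 <- aggregate(df2["GDP_Variation"], by = df2["Years"], sum)

# The same totals as a named vector, one element per year
by_year <- tapply(df2$GDP_Variation, df2$Years, sum)

print(totals2)
print(by_year)
```

aggregate() returns a data frame, which is convenient for further processing, while tapply() returns a named vector that is handy for a quick lookup such as by_year["2013"] when that year is present in the data.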

Updated on: 08-Nov-2021
