- Trending Categories
Data Structure
Networking
RDBMS
Operating System
Java
iOS
HTML
CSS
Android
Python
C Programming
C++
C#
MongoDB
MySQL
Javascript
PHP
Physics
Chemistry
Biology
Mathematics
English
Economics
Psychology
Social Studies
Fashion Studies
Legal Studies
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
How to create a data frame of the maximum value for each group in an R data frame using dplyr?
Sometimes subsetting of group wise maximum values is required while doing the data analysis and this subset of the data frame is used for comparative analysis. The main objective is to compare these maximums with each other or with a threshold value. In R, we can find the group wise maximum value by using group_by and slice functions in dplyr package.
Example
Consider the below data frame −
> x<-rep(c("S1","S2","S3","S4"),times=c(6,12,12,10)) > y<-1:40 > df<-data.frame(x,y) > head(df,20) x y 1 S1 1 2 S1 2 3 S1 3 4 S1 4 5 S1 5 6 S1 6 7 S2 7 8 S2 8 9 S2 9 10 S2 10 11 S2 11 12 S2 12 13 S2 13 14 S2 14 15 S2 15 16 S2 16 17 S2 17 18 S2 18 19 S3 19 20 S3 20
Loading dplyr package −
> library(dplyr)
Finding the data frame of maximum values using group_by and slice −
> Group_Maximum_df<-df%>%group_by(x)%>%slice(which.max(y)) > Group_Maximum_df # A tibble: 4 x 2 # Groups: x [4] x y <fct> <int> 1 S1 6 2 S2 18 3 S3 30 4 S4 40
In the above example,, we have a factor variable. Now let’s consider that instead of a factor variable, our group variable is represented with integer values. In this case as well, the procedure to find the data frame of maximum values will be same −
> a<-rep(c(1,2,3,4,5),times=c(5,10,10,5,10)) > b<-rep(c(25,23,21,24),times=c(10,10,10,10)) > df2<-data.frame(a,b) > head(df2,20) a b 1 1 25 2 1 25 3 1 25 4 1 25 5 1 25 6 2 25 7 2 25 8 2 25 9 2 25 10 2 25 11 2 23 12 2 23 13 2 23 14 2 23 15 2 23 16 3 23 17 3 23 18 3 23 19 3 23 20 3 23 > Group_Maximum_df2<-df2%>%group_by(a)%>%slice(which.max(b)) > Group_Maximum_df2 # A tibble: 5 x 2 # Groups: a [5] a b <dbl> <dbl> 1 1 25 2 2 25 3 3 23 4 4 21 5 5 24
- Related Articles
- How to create a column with column name for maximum value in each row of an R data frame?
- How to find the absolute maximum of each group in R data frame?
- How to select rows with group wise minimum or maximum values of a variable in an R data frame using dplyr?
- How to create a group column in an R data frame?
- Create a quartile column for each value in an R data frame column.
- How to find the maximum value in an R data frame?
- How to find the mean of row values in an R data frame using dplyr?
- How to filter column values for some strings from an R data frame using dplyr?
- How to extract columns of a data frame in R using dplyr package?
- How to find the maximum of each row in an R data frame?
- How to remove multiple rows from an R data frame using dplyr package?
- How to create a new data frame for the mean of rows of some columns from an R data frame?
- How to collapse data frame rows in R by summing using dplyr?
- How to subset rows of data frame without NA using dplyr in R?
- How to create group names for consecutively duplicate values in an R data frame column?

Advertisements