- Trending Categories
Data Structure
Networking
RDBMS
Operating System
Java
MS Excel
iOS
HTML
CSS
Android
Python
C Programming
C++
C#
MongoDB
MySQL
Javascript
PHP
Physics
Chemistry
Biology
Mathematics
English
Economics
Psychology
Social Studies
Fashion Studies
Legal Studies
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
How to select rows with group wise minimum or maximum values of a variable in an R data frame using dplyr?
If an R data frame contains a group variable that has many group levels then finding the minimum and maximum values of a discrete or continuous variable based on the group levels becomes difficult. But this can be done with slice function in dplyr package.
Consider the below data frame that has one group variable and continuous as well as discrete variables −
> set.seed(2) > x1<-sample(1:100,20,replace=TRUE) > x2<-sample(1:10,20,replace=TRUE) > x3<-rpois(20,10) > x4<-rpois(20,5) > x5<-rpois(20,5) > x6<-runif(20,2,5) > x7<-sample(200:1000,20,replace=TRUE) > Group<-rep(c(1,2,3,4),times=5) > df<-data.frame(x1,x2,x3,x4,x5,x6,x7,Group) > df x1 x2 x3 x4 x5 x6 x7 Group 1 85 8 14 7 8 2.900301 749 1 2 79 7 12 4 3 3.331022 200 2 3 70 1 17 5 6 4.190603 883 3 4 6 6 11 8 5 4.004491 649 4 5 32 9 13 5 4 2.934971 641 1 6 8 4 7 3 6 3.435734 699 2 7 17 6 9 3 4 2.874230 679 3 8 93 9 7 1 3 2.546523 642 4 9 81 8 8 6 8 3.082288 496 1 10 76 6 7 4 1 4.711400 570 2 11 41 3 8 5 9 3.182143 847 3 12 50 9 9 7 4 4.339642 707 4 13 75 7 8 7 6 2.852477 805 1 14 65 8 16 8 5 4.561162 233 2 15 3 6 8 2 7 2.516727 783 3 16 80 2 9 9 3 2.237793 788 4 17 96 7 10 5 4 2.876195 792 1 18 50 2 7 1 3 4.521114 375 2 19 55 3 6 3 4 4.835804 942 3 20 63 4 9 10 6 2.134896 228 4
Loading the dplyr package −
> library(dplyr)
Finding the group wise rows for minimum and maximum values of a particular variable −
> df %>% group_by(Group) %>% slice(which.min(x5)) # A tibble: 4 x 8 # Groups: Group [4] x1 x2 x3 x4 x5 x6 x7 Group <int> <int> <int> <int> <int> <dbl> <int> <dbl> 1 32 9 13 5 4 2.93 641 1 2 76 6 7 4 1 4.71 570 2 3 17 6 9 3 4 2.87 679 3 4 93 9 7 1 3 2.55 642 4 > df %>% group_by(Group) %>% slice(which.max(x5)) # A tibble: 4 x 8 # Groups: Group [4] x1 x2 x3 x4 x5 x6 x7 Group <int> <int> <int> <int> <int> <dbl> <int> <dbl> 1 85 8 14 7 8 2.90 749 1 2 8 4 7 3 6 3.44 699 2 3 41 3 8 5 9 3.18 847 3 4 63 4 9 10 6 2.13 228 4 > df %>% group_by(Group) %>% slice(which.max(x7)) # A tibble: 4 x 8 # Groups: Group [4] x1 x2 x3 x4 x5 x6 x7 Group <int> <int> <int> <int> <int> <dbl> <int> <dbl> 1 75 7 8 7 6 2.85 805 1 2 8 4 7 3 6 3.44 699 2 3 55 3 6 3 4 4.84 942 3 4 80 2 9 9 3 2.24 788 4 > df %>% group_by(Group) %>% slice(which.min(x1)) # A tibble: 4 x 8 # Groups: Group [4] x1 x2 x3 x4 x5 x6 x7 Group <int> <int> <int> <int> <int> <dbl> <int> <dbl> 1 32 9 13 5 4 2.93 641 1 2 8 4 7 3 6 3.44 699 2 3 3 6 8 2 7 2.52 783 3 4 6 6 11 8 5 4.00 649 4 > df %>% group_by(Group) %>% slice(which.max(x1)) # A tibble: 4 x 8 # Groups: Group [4] x1 x2 x3 x4 x5 x6 x7 Group <int> <int> <int> <int> <int> <dbl> <int> <dbl> 1 96 7 10 5 4 2.88 792 1 2 79 7 12 4 3 3.33 200 2 3 70 1 17 5 6 4.19 883 3 4 93 9 7 1 3 2.55 642 4 > df %>% group_by(Group) %>% slice(which.max(x6)) # A tibble: 4 x 8 # Groups: Group [4] x1 x2 x3 x4 x5 x6 x7 Group <int> <int> <int> <int> <int> <dbl> <int> <dbl> 1 81 8 8 6 8 3.08 496 1 2 76 6 7 4 1 4.71 570 2 3 55 3 6 3 4 4.84 942 3 4 50 9 9 7 4 4.34 707 4 > df %>% group_by(Group) %>% slice(which.min(x6)) # A tibble: 4 x 8 # Groups: Group [4] x1 x2 x3 x4 x5 x6 x7 Group <int> <int> <int> <int> <int> <dbl> <int> <dbl> 1 75 7 8 7 6 2.85 805 1 2 79 7 12 4 3 3.33 200 2 3 3 6 8 2 7 2.52 783 3 4 63 4 9 10 6 2.13 228 4
- Related Articles
- How to create a data frame of the maximum value for each group in an R data frame using dplyr?
- How to remove multiple rows from an R data frame using dplyr package?
- How to subset rows of data frame without NA using dplyr in R?
- How to collapse data frame rows in R by summing using dplyr?
- How to find the mean of row values in an R data frame using dplyr?
- Find the group wise large and small values in an R data frame.
- How to select rows based on range of values of a column in an R data frame?
- How to find the minimum and maximum of columns values in an R data frame?
- How to filter column values for some strings from an R data frame using dplyr?
- How to find group-wise summary statistics for an R data frame?
- How to select positive values in an R data frame column?
- How to extract columns of a data frame in R using dplyr package?
- How to select rows of an R data frame that are non-NA?
- How to select rows of a data frame that are not in other data frame in R?
- How to perform group-wise linear regression for a data frame in R?

Advertisements