How to subset rows that contain the maximum depending on another column in an R data frame?
To find the maximum of one column for each group defined by another column in an R data frame, we can follow the steps below −
- First, create a data frame with one numerical and one categorical column.
- Then, use the tapply function with the max function to find the maximum of the numerical column within each level of the categorical column.
Example 1
Create the data frame
Let's create a data frame as shown below −
x<-rnorm(20)
factor1<-sample(LETTERS[1:4],20,replace=TRUE)
df1<-data.frame(x,factor1)
df1
On executing, the above script generates the output below (this output will vary on your system because the data are randomly generated) −
             x factor1
1  -1.21231516       A
2  -0.01576519       B
3   0.59032593       D
4  -0.41583339       C
5  -0.38508102       A
6  -0.61177209       C
7  -0.52961795       C
8   0.30561837       A
9  -0.58067776       A
10  0.62246173       C
11 -0.58479709       C
12  0.09817433       B
13  1.11240042       C
14  0.29007306       B
15 -0.66345792       B
16 -1.80789902       A
17  0.33419804       C
18 -0.15665767       A
19  1.56775923       C
20  1.49345799       B
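The values above come from rnorm and sample, which is why the output differs between runs. As a minimal sketch, assuming repeatable results are wanted, set.seed can be called with a fixed number before the random calls. The seed value 123 and the names a and b are arbitrary choices for illustration and are not part of the original example.

```r
# Fix the random number generator state, then draw five normal values
set.seed(123)
a <- rnorm(5)

# Resetting the same seed reproduces exactly the same draws
set.seed(123)
b <- rnorm(5)

# TRUE when both draws are identical
identical(a, b)
```

Placing a set.seed call at the top of a script makes every later call to rnorm or sample repeatable from run to run.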
Find the maximum based on another column
Use the tapply function to find the maximum of column x for each level of the factor1 column in df1 −
tapply(df1$x,df1$factor1,max)
Output
        A         B         C         D 
0.3056184 1.4934580 1.5677592 0.5903259 
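Note that tapply returns only the maximum value of each group, not the rows that hold those values. One possible way to get the full rows, sketched here under the assumption that ties should all be kept, is to compare each value with its group maximum computed by ave. The data frame demo1 and its values are made up for illustration so that the result is reproducible.

```r
# Small fixed data frame (illustrative values, not the random data above)
demo1 <- data.frame(
  x = c(-1.2, 0.3, 0.6, 1.5, -0.4, 1.1),
  factor1 = c("A", "A", "D", "B", "C", "C")
)

# ave() returns, for every row, the maximum of x within that row's group
group_max <- ave(demo1$x, demo1$factor1, FUN = max)

# Keep only the rows whose x equals the maximum of their group
max_rows <- demo1[demo1$x == group_max, ]
max_rows
```

Because the comparison is done row by row, a group whose maximum occurs more than once contributes all of the tied rows.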
Example 2
Create the data frame
Let's create a data frame as shown below −
y<-sample(1:50,20)
factor2<-sample(c("Low","Medium","High"),20,replace=TRUE)
df2<-data.frame(y,factor2)
df2
On executing, the above script generates the output below (this output will vary on your system because the data are randomly generated) −
    y factor2
1  45     Low
2   2  Medium
3   5    High
4  33     Low
5  28    High
6  37  Medium
7   7    High
8  21    High
9  48     Low
10 18    High
11 15    High
12 38    High
13 20  Medium
14  4     Low
15 22  Medium
16 34     Low
17 32     Low
18 29     Low
19 24    High
20 17  Medium
Find the maximum based on another column
Use the tapply function to find the maximum of column y for each level of the factor2 column in df2 −
tapply(df2$y,df2$factor2,max)
Output
  High    Low Medium 
    38     48     37 
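If exactly one row per group is wanted rather than the maximum values alone, one alternative (a sketch, not the method used in this article) is to split the data frame by group and pick the row at which.max in each piece. The data frame demo2 and its values are assumptions made for illustration.

```r
# Small fixed data frame (illustrative values, not the random data above)
demo2 <- data.frame(
  y = c(45, 2, 5, 33, 38, 37),
  factor2 = c("Low", "Medium", "High", "Low", "High", "Medium")
)

# Split into one data frame per level of factor2
by_group <- split(demo2, demo2$factor2)

# From each piece keep the row with the largest y (first one if tied),
# then stack the selected rows back into a single data frame
top_rows <- do.call(rbind, lapply(by_group, function(g) g[which.max(g$y), ]))
top_rows
```

Unlike a comparison against the group maximum, which.max returns only the first position of the maximum, so tied rows are not repeated in the result.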