How to subset an R data frame if a numerical column is greater than a certain value for a particular category in a grouping column?


Subsetting is a commonly used technique that serves many purposes depending on the objective of the analysis. To subset a data frame to the rows where a numerical column is greater than a certain value for a particular category in a grouping column, follow the steps below −

  • Create a data frame.
  • Subset the data frame using the filter() function from the dplyr package.

Create the data frame

Let's create a data frame as shown below −

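# Draw 20 values from a normal distribution with mean 10 and standard deviation 0.25
x <- rnorm(20, 10, 0.25)
# Randomly assign each row to one of the two categories
Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
df <- data.frame(x, Gender)
df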

On executing, the above script generates the output below (this output will vary on your system due to randomization) −

           x Gender
1   9.401786   Male
2  10.219677   Male
3  10.126467   Male
4  10.260641   Male
5  10.685478   Male
6  10.006628   Male
7   9.912915   Male
8  10.206531   Male
9  10.366212 Female
10  9.746924   Male
11 10.092994   Male
12 10.291531   Male
13 10.398257   Male
14  9.441365   Male
15  9.479788   Male
16  9.670627 Female
17 10.249913 Female
18  9.718280   Male
19 10.007886   Male
20  9.976768   Male
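
Because the values are drawn at random, every run produces a different data frame. If repeatable output is wanted, one option is to fix the random seed before generating the data. The sketch below is a minimal illustration; the seed value 123 is an arbitrary assumption and is not part of the original example, so the values it produces will differ from the output shown above.

```r
# Fix the random number generator state so the same values are drawn on every run.
# The seed value 123 is arbitrary (an assumption made for illustration).
set.seed(123)

x <- rnorm(20, 10, 0.25)                                   # 20 values, mean 10, sd 0.25
Gender <- sample(c("Male", "Female"), 20, replace = TRUE)  # random category per row
df <- data.frame(x, Gender)

# Inspect the structure: 20 rows, one numeric column and one grouping column
str(df)
```

Running the script again with the same seed should recreate the same df.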

Subsetting the data frame

Load the dplyr package and subset df to the rows where Gender is Male and x is greater than 10 −

library(dplyr)
# Reuse the df created above; regenerating it would draw new random values
# Conditions separated by commas are combined with AND
df %>% filter(x > 10, Gender == "Male")

Output

          x Gender
1  10.21968   Male
2  10.12647   Male
3  10.26064   Male
4  10.68548   Male
5  10.00663   Male
6  10.20653   Male
7  10.09299   Male
8  10.29153   Male
9  10.39826   Male
10 10.00789   Male
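
The same kind of subset can also be taken without dplyr. The following is a minimal sketch using base R, shown only as an alternative to the filter() call. It builds its own df with an arbitrary seed (123, an assumption for illustration), so its rows will not match the output above. The names by_index and by_subset are hypothetical.

```r
# Recreate a data frame of the same shape; the seed value is an arbitrary assumption
set.seed(123)
x <- rnorm(20, 10, 0.25)
Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
df <- data.frame(x, Gender)

# Logical indexing: keep rows where both conditions are TRUE
by_index <- df[df$x > 10 & df$Gender == "Male", ]

# subset() evaluates the condition using the column names of df directly
by_subset <- subset(df, x > 10 & Gender == "Male")

by_index
```

Unlike filter(), both base R forms keep the original row names, so the row labels in the result refer to positions in df.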

Updated on: 13-Aug-2021
