How to subset an R data frame if one of the supplied grouping values is found and a numerical column value is greater than a certain value?


Subsetting is one of the most commonly used techniques in data analysis, and it serves different purposes depending on the objective of the analysis. Here, we want to keep the rows of a data frame where the categorical column matches any one of a set of supplied values and, at the same time, a numerical column is greater than a given value. To do this, follow the steps below −

  • Creating a data frame.
  • Subsetting the data frame if any of the supplied values of the categorical variable exists and a numerical column value is greater than a certain value.

Create the data frame

Let's create a data frame as shown below −

x<-rnorm(20)
Factor<-sample(c("Male","Female","Unknown"),20,replace=TRUE)
df<-data.frame(x,Factor)
df

On executing, the above script generates the below output (this output will vary on your system due to randomization) −

             x  Factor
1  -0.83268524  Female
2   1.66904204    Male
3   0.26228885 Unknown
4   0.42511920    Male
5   0.67910328  Female
6  -0.82505888  Female
7  -0.06084790    Male
8   0.56949099 Unknown
9   0.79874121  Female
10 -0.09112936 Unknown
11 -1.04839717    Male
12 -1.24128634 Unknown
13  1.51186118 Unknown
14 -0.79498005 Unknown
15  0.18607842    Male
16 -0.60505867  Female
17  1.24925658    Male
18  1.14835757    Male
19 -0.24867122  Female
20  0.59079712 Unknown
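
Because rnorm() and sample() draw random values, every run produces a different data frame. If you want output you can reproduce, one option is to fix the random seed first. The sketch below is only an illustration: the seed value 123 and the names df_a and df_b are arbitrary choices, not part of the original example.

```r
# Fix the random seed so the same data frame is produced on every run.
# The seed value 123 is an arbitrary choice for this illustration.
set.seed(123)
x <- rnorm(20)
Factor <- sample(c("Male", "Female", "Unknown"), 20, replace = TRUE)
df_a <- data.frame(x, Factor)

# Reset the same seed and rebuild the data frame to confirm reproducibility.
set.seed(123)
x <- rnorm(20)
Factor <- sample(c("Male", "Female", "Unknown"), 20, replace = TRUE)
df_b <- data.frame(x, Factor)

# TRUE when both data frames are exactly the same
same <- identical(df_a, df_b)
same
```

With the seed set, the values will still differ from the listing above (which was generated without a known seed), but they will be the same each time you run the script.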

Subsetting the data frame

Load the dplyr package and subset df to the rows where the Factor column is Male or Female and x is greater than 0.5 −

library(dplyr)
# df is the data frame created above; it is not regenerated here,
# because new random values would not match the output shown below
df %>% filter(x>0.5,Factor=="Male"|Factor=="Female")

Output

          x Factor
1 1.6690420   Male
2 0.6791033 Female
3 0.7987412 Female
4 1.2492566   Male
5 1.1483576   Male
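
When the list of grouping values grows, chaining == comparisons with | becomes hard to read. A common alternative is the %in% operator, which tests membership in a vector of values. The sketch below uses a small hand-made data frame (the name demo, the vector keep, and all values are hypothetical, chosen so the result is easy to check by eye) and shows both a dplyr version and a base R version of the same subset.

```r
library(dplyr)

# Small hand-made data frame (hypothetical values) so the result is predictable
demo <- data.frame(
  x = c(0.9, 0.2, 1.4, 0.7, -0.3),
  Factor = c("Male", "Female", "Unknown", "Female", "Male")
)

# Grouping values to keep; extend this vector instead of adding more | conditions
keep <- c("Male", "Female")

# dplyr: conditions separated by commas inside filter() are combined with AND
res_dplyr <- demo %>% filter(x > 0.5, Factor %in% keep)

# Base R: the same subset using logical indexing, no extra package needed
res_base <- demo[demo$x > 0.5 & demo$Factor %in% keep, ]

res_dplyr
res_base
```

Both results contain the same two rows (0.9 Male and 0.7 Female). The only visible difference is that the base R version keeps the original row names, while filter() renumbers the rows.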

Updated on: 13-Aug-2021
