How to subset an R data frame based on numerical and categorical columns?


Subsetting is a commonly used technique that serves many different purposes depending on the objective of the analysis. To subset a data frame based on a numerical and a categorical column at the same time with the help of the dplyr package, we can follow the steps below −

  • Creating a data frame.
  • Subsetting the data frame based on a numerical and a categorical column at the same time with the filter function of the dplyr package.

Create the data frame

Let's create a data frame as shown below −

Level<-sample(c("Low","Medium","High"),20,replace=TRUE)
Score<-sample(1:10,20,replace=TRUE)
Dat<-data.frame(Level,Score)
Dat

On executing, the above script generates the output below (this output will vary on your system because sample() draws values at random) −

    Level Score
1    High     4
2     Low     7
3    High     1
4  Medium     6
5  Medium    10
6    High     9
7    High     9
8     Low     3
9     Low     3
10   High     4
11    Low     5
12 Medium     3
13   High     8
14   High    10
15   High     5
16    Low     8
17   High    10
18   High     7
19    Low    10
20    Low     6
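Because sample() is random, the rows above will not be reproduced exactly on another machine. A minimal sketch of one way to make the data frame repeatable is to fix the random seed before sampling; the seed value 123 is an arbitrary assumption and is not part of the original script.

```r
# Fix the random seed so that sample() returns the same values on every run.
# The seed value 123 is arbitrary (an assumption, not taken from the article).
set.seed(123)
Level <- sample(c("Low", "Medium", "High"), 20, replace = TRUE)
Score <- sample(1:10, 20, replace = TRUE)
Dat <- data.frame(Level, Score)

# Inspect the structure: 20 rows, one categorical and one numerical column.
str(Dat)
```

Re-running the script with the same seed should give the same Dat each time, which makes the later subsetting step easier to check.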

Subsetting based on numerical and categorical column

Load the dplyr package and subset Dat to the rows where the Score column is greater than 5 and the Level column is equal to Low −

library(dplyr)
# Use the Dat data frame created above; re-running sample() here would
# generate new random data that no longer matches the printed output.
# Conditions separated by a comma inside filter() must all be TRUE.
Dat %>% filter(Score > 5, Level == "Low")

Output

  Level Score
1   Low     7
2   Low     8
3   Low    10
4   Low     6
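For comparison, the same subset can be sketched in base R without dplyr. The example below rebuilds Dat from the values printed in the output above, so it does not depend on random sampling; the variable name Result is chosen here only for illustration.

```r
# Rebuild the data frame from the values shown in the printed output above,
# so that the result does not depend on random sampling.
Level <- c("High", "Low", "High", "Medium", "Medium", "High", "High", "Low",
           "Low", "High", "Low", "Medium", "High", "High", "High", "Low",
           "High", "High", "Low", "Low")
Score <- c(4, 7, 1, 6, 10, 9, 9, 3, 3, 4, 5, 3, 8, 10, 5, 8, 10, 7, 10, 6)
Dat <- data.frame(Level, Score)

# Base R alternative to filter(): keep rows where both conditions hold (&).
Result <- subset(Dat, Score > 5 & Level == "Low")
Result
```

This should select the same four rows as the filter call. One visible difference is that subset() keeps the original row names (2, 16, 19, 20), whereas filter() renumbers the rows from 1.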

Updated on: 13-Aug-2021
