How to subset an R data frame based on a numerical and a categorical column?
Subsetting is a commonly used technique that serves different purposes depending on the objective of the analysis. To subset a data frame based on a numerical and a categorical column with the help of the dplyr package, we can follow the below steps −
- Creating a data frame.
- Subsetting the data frame based on a numerical and a categorical column at the same time with the help of the filter function of the dplyr package.
Create the data frame
Let's create a data frame as shown below −
Level<-sample(c("Low","Medium","High"),20,replace=TRUE)
Score<-sample(1:10,20,replace=TRUE)
Dat<-data.frame(Level,Score)
Dat
On executing, the above script generates the below output (this output will vary on your system due to randomization) −
    Level Score
1    High     4
2     Low     7
3    High     1
4  Medium     6
5  Medium    10
6    High     9
7    High     9
8     Low     3
9     Low     3
10   High     4
11    Low     5
12 Medium     3
13   High     8
14   High    10
15   High     5
16    Low     8
17   High    10
18   High     7
19    Low    10
20    Low     6
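Because sample() draws random values, the data frame differs from run to run. One possible way to make it reproducible is to fix the random seed first. The sketch below assumes an arbitrary seed value of 123, which is not part of the original example −

```r
# Fix the random seed so that sample() returns the same values on every run.
# The seed value 123 is an arbitrary choice made for this sketch.
set.seed(123)
Level <- sample(c("Low", "Medium", "High"), 20, replace = TRUE)
Score <- sample(1:10, 20, replace = TRUE)
Dat <- data.frame(Level, Score)
Dat
```

With the seed fixed, repeated runs in the same R version should produce the same data frame, although it will not match the output shown above, which was generated without a seed.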
Subsetting based on numerical and categorical column
Load the dplyr package and subset Dat to the rows where Score is greater than 5 and Level is equal to Low −
library(dplyr)
# Dat is the data frame created in the previous step.
# Calling sample() again here would draw new random values and change the result.
Dat%>%filter(Score>5,Level=="Low")
Output
  Level Score
1   Low     7
2   Low     8
3   Low    10
4   Low     6
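For comparison, the same subset can be obtained without dplyr. The following is a minimal sketch in base R, assuming the data frame is rebuilt by hand from the values printed earlier so that the result is deterministic. The name Result is introduced here for illustration only −

```r
# Rebuild the data frame printed earlier so the result does not depend on sample().
Level <- c("High", "Low", "High", "Medium", "Medium", "High", "High", "Low", "Low", "High",
           "Low", "Medium", "High", "High", "High", "Low", "High", "High", "Low", "Low")
Score <- c(4, 7, 1, 6, 10, 9, 9, 3, 3, 4, 5, 3, 8, 10, 5, 8, 10, 7, 10, 6)
Dat <- data.frame(Level, Score)

# Base R counterpart of filter(Score > 5, Level == "Low"):
# the two conditions are combined with a logical AND.
Result <- Dat[Dat$Score > 5 & Dat$Level == "Low", ]
Result
```

In filter, conditions separated by commas are combined with AND, which is why the base R version uses the & operator. One visible difference is that base R indexing keeps the original row names (2, 16, 19, 20), whereas filter renumbers the rows from 1.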