How to subset an R data frame if one of the supplied grouping values is found and a numerical column value is greater than a certain value?
Subsetting is a commonly used technique that serves different purposes depending on the objective of the analysis. Here, we want to keep a row only if its categorical column contains any one of the supplied grouping values and its numerical column is greater than a certain value. To do this, follow the steps below −
- Creating a data frame.
- Subsetting the data frame if any of the supplied values of the categorical variable is present and the numerical column value is greater than a certain value.
Create the data frame
Let's create a data frame as shown below −
x <- rnorm(20)
Factor <- sample(c("Male", "Female", "Unknown"), 20, replace = TRUE)
df <- data.frame(x, Factor)
df
On executing, the above script generates the below output (this output will vary on your system due to randomization) −
             x  Factor
1  -0.83268524  Female
2   1.66904204    Male
3   0.26228885 Unknown
4   0.42511920    Male
5   0.67910328  Female
6  -0.82505888  Female
7  -0.06084790    Male
8   0.56949099 Unknown
9   0.79874121  Female
10 -0.09112936 Unknown
11 -1.04839717    Male
12 -1.24128634 Unknown
13  1.51186118 Unknown
14 -0.79498005 Unknown
15  0.18607842    Male
16 -0.60505867  Female
17  1.24925658    Male
18  1.14835757    Male
19 -0.24867122  Female
20  0.59079712 Unknown
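Because the values come from rnorm() and sample(), each run produces different data. One way to get repeatable output, sketched here as an assumption rather than as part of the original script, is to call set.seed() before generating the data. The seed value 123 and the names x2, Factor2 and df2 are arbitrary choices for illustration −

```r
# Fix the random number generator state (123 is an arbitrary seed)
set.seed(123)
x <- rnorm(20)
Factor <- sample(c("Male", "Female", "Unknown"), 20, replace = TRUE)
df <- data.frame(x, Factor)

# Re-seeding with the same value and repeating the same calls
# reproduces exactly the same data frame
set.seed(123)
x2 <- rnorm(20)
Factor2 <- sample(c("Male", "Female", "Unknown"), 20, replace = TRUE)
df2 <- data.frame(x = x2, Factor = Factor2)

# Compare the two data frames
identical(df, df2)
```

With a seed in place, the printed data frame and any subset taken from it stay the same from run to run.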
Subsetting the data frame
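Load the dplyr package and subset df to the rows where the Factor column is Male or Female and x is greater than 0.5. The script repeats the data-creation step, so rerunning it draws new random values; the output shown here corresponds to the data frame printed above −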
library(dplyr)
x <- rnorm(20)
Factor <- sample(c("Male", "Female", "Unknown"), 20, replace = TRUE)
df <- data.frame(x, Factor)
df %>% filter(x > 0.5, Factor == "Male" | Factor == "Female")
Output
          x Factor
1 1.6690420   Male
2 0.6791033 Female
3 0.7987412 Female
4 1.2492566   Male
5 1.1483576   Male
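The two equality tests joined by | can also be written with the %in% operator, which stays readable when more grouping values are supplied. The sketch below is a minimal illustration in base R. It types in the first ten rows of the data frame printed earlier so that the result does not depend on random draws, and the names groups and threshold are hypothetical helpers that do not appear in the original script −

```r
# First ten rows of the data frame printed earlier, typed in by hand
df <- data.frame(
  x = c(-0.83268524, 1.66904204, 0.26228885, 0.42511920, 0.67910328,
        -0.82505888, -0.06084790, 0.56949099, 0.79874121, -0.09112936),
  Factor = c("Female", "Male", "Unknown", "Male", "Female",
             "Female", "Male", "Unknown", "Female", "Unknown")
)

groups <- c("Male", "Female")  # supplied grouping values (hypothetical name)
threshold <- 0.5               # numerical cut-off (hypothetical name)

# Base R: keep rows whose Factor is in groups AND whose x exceeds threshold
result <- df[df$Factor %in% groups & df$x > threshold, ]
print(result)

# The same condition with dplyr, run only if the package is installed
if (requireNamespace("dplyr", quietly = TRUE)) {
  result_dplyr <- dplyr::filter(df, x > threshold, Factor %in% groups)
  print(result_dplyr)
}
```

Rows 2, 5 and 9 are kept, which are the same three rows that lead the dplyr output above.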