Subset groups that occur greater than or equal to n times in an R data frame
To subset groups that occur greater than or equal to n times in an R data frame, we can use the filter function of the dplyr package together with group_by and n(), which returns the number of rows in the current group.
For example, if we have a data frame called df that contains a grouping column, say Group, then we can keep only the groups that occur at least 4 times by using the command below −
df%>%group_by(Group)%>%filter(n()>=4)
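As a quick, deterministic illustration of the command above, here is a minimal sketch. The toy data frame, the Value column, and the threshold name n_min are assumptions made up for this illustration, not part of the original examples.

```r
library(dplyr)

# Hypothetical toy data: group "A" has 4 rows, "B" has 2 rows, "C" has 5 rows
df <- data.frame(
  Group = c(rep("A", 4), rep("B", 2), rep("C", 5)),
  Value = 1:11
)

n_min <- 4  # assumed threshold for this sketch

# Keep only rows whose group has at least n_min rows
kept <- df %>%
  group_by(Group) %>%
  filter(n() >= n_min) %>%
  ungroup()  # drop the grouping so later steps see a plain tibble

# Group sizes in the result: "B" is gone, "A" and "C" remain
table(kept$Group)
```

Because filter is evaluated once per group, n() is a single number for each group, so the condition keeps or drops a whole group at a time.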
Example 1
Following snippet creates a sample data frame −
Grp<-sample(LETTERS[1:3],20,replace=TRUE)
Response<-rpois(20,10)
df1<-data.frame(Grp,Response)
df1
The following dataframe is created
   Grp Response
1    B        7
2    A       12
3    A        9
4    C       11
5    B        9
6    B        7
7    A        5
8    C        5
9    A        6
10   A       12
11   A        4
12   A       11
13   C       13
14   A       17
15   A       12
16   B        9
17   C        4
18   B       11
19   A        7
20   B       10
To load the dplyr package and subset df1 so that only the values of the grouping column Grp that occur at least 6 times are kept, add the following code to the above snippet −
Grp<-sample(LETTERS[1:3],20,replace=TRUE)
Response<-rpois(20,10)
df1<-data.frame(Grp,Response)
library(dplyr)
# keep only the groups with at least 6 rows
df1%>%group_by(Grp)%>%filter(n()>=6)
# A tibble: 16 x 2
# Groups: Grp [2]
Output
If you execute all the snippets given above as a single program, it generates the following output −
   Grp   Response
   <chr>    <int>
 1 B            7
 2 A           12
 3 A            9
 4 B            9
 5 B            7
 6 A            5
 7 A            6
 8 A           12
 9 A            4
10 A           11
11 A           17
12 A           12
13 B            9
14 B           11
15 A            7
16 B           10
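To see why group C is missing from the output above, it can help to count the rows per group first. The sketch below rebuilds df1 from the values printed earlier (typed in by hand rather than sampled again, so the result is reproducible) and uses base R's table function; the names sizes and keep are introduced only for this illustration.

```r
# df1 rebuilt from the printed values, so no random sampling is involved
df1 <- data.frame(
  Grp = c("B", "A", "A", "C", "B", "B", "A", "C", "A", "A",
          "A", "A", "C", "A", "A", "B", "C", "B", "A", "B"),
  Response = c(7, 12, 9, 11, 9, 7, 5, 5, 6, 12,
               4, 11, 13, 17, 12, 9, 4, 11, 7, 10)
)

# Number of rows in each group
sizes <- table(df1$Grp)
sizes

# Names of the groups that meet the threshold of 6 rows
keep <- names(sizes)[sizes >= 6]
keep
```

With these values, A has 10 rows, B has 6 and C has 4, so only A and B pass the threshold and 10 + 6 = 16 rows remain, which matches the tibble header shown above.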
Example 2
Following snippet creates a sample data frame −
Class<-sample(c("First","Second","Third"),20,replace=TRUE)
Price<-sample(20:50,20)
df2<-data.frame(Class,Price)
df2
The following dataframe is created
    Class Price
1   First    45
2   Third    41
3   First    42
4  Second    30
5   First    31
6  Second    28
7   Third    24
8   Third    39
9   Third    44
10 Second    38
11  Third    37
12 Second    49
13  Third    23
14  Third    33
15  First    20
16 Second    36
17 Second    27
18  First    21
19  First    47
20  Third    34
To subset df2 so that only the values of the grouping column Class that occur at least 8 times are kept, add the following code to the above snippet −
Class<-sample(c("First","Second","Third"),20,replace=TRUE)
Price<-sample(20:50,20)
df2<-data.frame(Class,Price)
library(dplyr)
# keep only the groups with at least 8 rows
df2%>%group_by(Class)%>%filter(n()>=8)
# A tibble: 8 x 2
# Groups: Class [1]
Output
If you execute all the snippets given above as a single program, it generates the following output −
  Class Price
  <chr> <int>
1 Third    41
2 Third    24
3 Third    39
4 Third    44
5 Third    37
6 Third    23
7 Third    33
8 Third    34
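If dplyr is not available, the same subset can be obtained in base R. The sketch below is an alternative technique, not the method used in the examples above: it rebuilds df2 from the printed values and uses ave to attach each row's group size; the names group_size and subset_base are assumptions made for this illustration.

```r
# df2 rebuilt from the printed values, so no random sampling is involved
df2 <- data.frame(
  Class = c("First", "Third", "First", "Second", "First",
            "Second", "Third", "Third", "Third", "Second",
            "Third", "Second", "Third", "Third", "First",
            "Second", "Second", "First", "First", "Third"),
  Price = c(45, 41, 42, 30, 31, 28, 24, 39, 44, 38,
            37, 49, 23, 33, 20, 36, 27, 21, 47, 34)
)

# For every row, ave() returns the size of the group that row belongs to
group_size <- ave(seq_along(df2$Class), df2$Class, FUN = length)

# Keep the rows whose group has at least 8 rows
subset_base <- df2[group_size >= 8, ]
subset_base
```

Unlike the dplyr version, the result is an ordinary data frame that keeps the original row names (2, 7, 8 and so on), but it contains the same eight Third rows.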