Subset groups that occur greater than or equal to n times in an R data frame.
To subset groups that occur greater than or equal to n times in an R data frame, we can use the filter function of the dplyr package together with group_by and n().
For example, if we have a data frame called df that contains a grouping column, say Group, then we can subset the groups that occur at least 4 times by using the command below −
df%>%group_by(Group)%>%filter(n()>=4)
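As a minimal sketch of this pattern, the snippet below uses a small hand-written data frame so that the group sizes are fixed and the result is reproducible. The column Value and the object name result are illustrative assumptions, not part of the examples that follow, and the code assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical data: group "a" occurs 4 times, "b" 2 times, "c" once
df <- data.frame(
  Group = c("a", "a", "b", "a", "c", "b", "a"),
  Value = 1:7
)

# n() gives the number of rows in the current group, so this keeps
# only the rows whose group occurs at least 4 times
result <- df %>%
  group_by(Group) %>%
  filter(n() >= 4) %>%
  ungroup()

result
```

Only the four rows of group "a" remain. The final ungroup() is optional; it simply returns a tibble without grouping metadata.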
Example 1
The following snippet creates a sample data frame −
Grp<-sample(LETTERS[1:3],20,replace=TRUE)
Response<-rpois(20,10)
df1<-data.frame(Grp,Response)
df1
The following data frame is created −
   Grp Response
1    B        7
2    A       12
3    A        9
4    C       11
5    B        9
6    B        7
7    A        5
8    C        5
9    A        6
10   A       12
11   A        4
12   A       11
13   C       13
14   A       17
15   A       12
16   B        9
17   C        4
18   B       11
19   A        7
20   B       10
To load the dplyr package and subset df1 to the groups of the grouping column Grp that occur greater than or equal to 6 times, add the following code to the above snippet −
Grp<-sample(LETTERS[1:3],20,replace=TRUE)
Response<-rpois(20,10)
df1<-data.frame(Grp,Response)
library(dplyr)
df1%>%group_by(Grp)%>%filter(n()>=6)
Output
If you execute all the above given snippets as a single program, it generates the following output (the exact rows depend on the random values drawn, since no seed is set) −
# A tibble: 16 x 2
# Groups:   Grp [2]
   Grp   Response
   <chr>    <int>
 1 B            7
 2 A           12
 3 A            9
 4 B            9
 5 B            7
 6 A            5
 7 A            6
 8 A           12
 9 A            4
10 A           11
11 A           17
12 A           12
13 B            9
14 B           11
15 A            7
16 B           10
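To see why group C is dropped, it can help to look at the group sizes directly. The sketch below hard-codes the values printed in the data frame for Example 1 (rather than re-sampling them) and uses dplyr's count() to tally the rows per group. The object names sizes and kept are illustrative assumptions.

```r
library(dplyr)

# Values copied from the data frame printed in Example 1
df1 <- data.frame(
  Grp = c("B", "A", "A", "C", "B", "B", "A", "C", "A", "A",
          "A", "A", "C", "A", "A", "B", "C", "B", "A", "B"),
  Response = c(7, 12, 9, 11, 9, 7, 5, 5, 6, 12,
               4, 11, 13, 17, 12, 9, 4, 11, 7, 10)
)

# One row per group, with the number of rows in column n
sizes <- df1 %>% count(Grp)
sizes

# Same filter as in Example 1, applied to the fixed data
kept <- df1 %>%
  group_by(Grp) %>%
  filter(n() >= 6) %>%
  ungroup()
```

With these values A occurs 10 times, B occurs 6 times, and C occurs 4 times, so only the 16 rows of A and B satisfy n()>=6.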
Example 2
The following snippet creates a sample data frame −
Class<-sample(c("First","Second","Third"),20,replace=TRUE)
Price<-sample(20:50,20)
df2<-data.frame(Class,Price)
df2
The following data frame is created −
    Class Price
1   First    45
2   Third    41
3   First    42
4  Second    30
5   First    31
6  Second    28
7   Third    24
8   Third    39
9   Third    44
10 Second    38
11  Third    37
12 Second    49
13  Third    23
14  Third    33
15  First    20
16 Second    36
17 Second    27
18  First    21
19  First    47
20  Third    34
To subset df2 to the groups of the grouping column Class that occur greater than or equal to 8 times, add the following code to the above snippet −
Class<-sample(c("First","Second","Third"),20,replace=TRUE)
Price<-sample(20:50,20)
df2<-data.frame(Class,Price)
library(dplyr)
df2%>%group_by(Class)%>%filter(n()>=8)
Output
If you execute all the above given snippets as a single program, it generates the following output (the exact rows depend on the random values drawn, since no seed is set) −
# A tibble: 8 x 2
# Groups:   Class [1]
  Class Price
  <chr> <int>
1 Third    41
2 Third    24
3 Third    39
4 Third    44
5 Third    37
6 Third    23
7 Third    33
8 Third    34
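For comparison, the same subset can be obtained without dplyr. The following is a sketch of one base R alternative, not the method used in this article: it hard-codes the values printed for df2 in Example 2 and uses ave() to attach each row's group size. The names group_size and subset_base are illustrative assumptions.

```r
# Values copied from the data frame printed in Example 2
df2 <- data.frame(
  Class = c("First", "Third", "First", "Second", "First", "Second", "Third",
            "Third", "Third", "Second", "Third", "Second", "Third", "Third",
            "First", "Second", "Second", "First", "First", "Third"),
  Price = c(45, 41, 42, 30, 31, 28, 24, 39, 44, 38,
            37, 49, 23, 33, 20, 36, 27, 21, 47, 34)
)

# ave() with FUN = length returns, for every row, the size of that row's group
group_size <- ave(seq_along(df2$Class), df2$Class, FUN = length)

# Keep the rows whose group occurs at least 8 times
subset_base <- df2[group_size >= 8, ]
subset_base
```

This returns an ordinary data frame with the original row names, whereas the dplyr version returns a grouped tibble with rows renumbered from 1.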
