# Subset groups that occur greater than or equal to n times in an R data frame

To subset groups that occur at least n times in an R data frame, we can use the group_by and filter functions of the dplyr package together with n(), which returns the number of rows in the current group.

For example, if we have a data frame called df that contains a grouping column, say Group, then we can subset the groups that occur at least 4 times by using the following command:

    df %>% group_by(Group) %>% filter(n() >= 4)
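As a minimal sketch of this command, the snippet below builds a small deterministic data frame and applies the same filter. The toy data (the `Value` column and the group labels "a", "b", "c") and the result name `kept` are hypothetical and used only for illustration.

```r
library(dplyr)

# Hypothetical toy data: group "a" has 4 rows, "b" has 2 rows, "c" has 5 rows
df <- data.frame(
  Group = rep(c("a", "b", "c"), times = c(4, 2, 5)),
  Value = 1:11
)

# n() is evaluated per group, so filter() keeps every row
# that belongs to a group with at least 4 rows
kept <- df %>%
  group_by(Group) %>%
  filter(n() >= 4) %>%
  ungroup()

# Groups "a" and "c" remain; the two rows of "b" are dropped
unique(as.character(kept$Group))
```

Calling ungroup() at the end is optional; it simply returns a plain tibble so that later operations are not applied per group by accident.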

## Example 1

The following snippet creates a sample data frame:

    Grp<-sample(LETTERS[1:3],20,replace=TRUE)
    Response<-rpois(20,10)
    df1<-data.frame(Grp,Response)
    df1

The following data frame is created (the values are drawn at random, so they will differ from run to run):

       Grp Response
    1    B        7
    2    A       12
    3    A        9
    4    C       11
    5    B        9
    6    B        7
    7    A        5
    8    C        5
    9    A        6
    10   A       12
    11   A        4
    12   A       11
    13   C       13
    14   A       17
    15   A       12
    16   B        9
    17   C        4
    18   B       11
    19   A        7
    20   B       10

To load the dplyr package and keep only the groups of the grouping column Grp that occur at least 6 times in df1, add the following code to the above snippet:

    Grp<-sample(LETTERS[1:3],20,replace=TRUE)
    Response<-rpois(20,10)
    df1<-data.frame(Grp,Response)
    library(dplyr)
    df1%>%group_by(Grp)%>%filter(n()>=6)

## Output

If you execute all the above given snippets as a single program, it generates output like the following (the exact values depend on the random sample):

    # A tibble: 16 x 2
    # Groups:   Grp [2]
       Grp   Response
       <chr>    <int>
     1 B            7
     2 A           12
     3 A            9
     4 B            9
     5 B            7
     6 A            5
     7 A            6
     8 A           12
     9 A            4
    10 A           11
    11 A           17
    12 A           12
    13 B            9
    14 B           11
    15 A            7
    16 B           10
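Because sample() and rpois() draw new values on every run, one way to reproduce the result for Example 1 is to enter the printed values directly. The sketch below does that and also uses dplyr's count() to show the size of each group before filtering; the names `counts` and `result` are introduced here for illustration only.

```r
library(dplyr)

# Values copied from the data frame printed in Example 1
Grp <- c("B", "A", "A", "C", "B", "B", "A", "C", "A", "A",
         "A", "A", "C", "A", "A", "B", "C", "B", "A", "B")
Response <- c(7, 12, 9, 11, 9, 7, 5, 5, 6, 12,
              4, 11, 13, 17, 12, 9, 4, 11, 7, 10)
df1 <- data.frame(Grp, Response)

# Group sizes: A occurs 10 times, B occurs 6 times, C occurs 4 times
counts <- count(df1, Grp)
counts

# Only A and B reach the threshold of 6, so the 4 rows of C are dropped
result <- df1 %>%
  group_by(Grp) %>%
  filter(n() >= 6)
nrow(result)
```

Checking the counts first makes it easy to see why a given threshold keeps or drops a group.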

## Example 2

The following snippet creates a sample data frame:

    Class<-sample(c("First","Second","Third"),20,replace=TRUE)
    Price<-sample(20:50,20)
    df2<-data.frame(Class,Price)
    df2

The following data frame is created (the values are drawn at random, so they will differ from run to run):

        Class Price
    1   First    45
    2   Third    41
    3   First    42
    4  Second    30
    5   First    31
    6  Second    28
    7   Third    24
    8   Third    39
    9   Third    44
    10 Second    38
    11  Third    37
    12 Second    49
    13  Third    23
    14  Third    33
    15  First    20
    16 Second    36
    17 Second    27
    18  First    21
    19  First    47
    20  Third    34

To keep only the groups of the grouping column Class that occur at least 8 times in df2, add the following code to the above snippet (dplyr must already be loaded with library(dplyr)):

    Class<-sample(c("First","Second","Third"),20,replace=TRUE)
    Price<-sample(20:50,20)
    df2<-data.frame(Class,Price)
    df2%>%group_by(Class)%>%filter(n()>=8)

## Output

If you execute all the above given snippets as a single program, it generates output like the following (the exact values depend on the random sample):

    # A tibble: 8 x 2
    # Groups:   Class [1]
      Class Price
      <chr> <int>
    1 Third    41
    2 Third    24
    3 Third    39
    4 Third    44
    5 Third    37
    6 Third    23
    7 Third    33
    8 Third    34
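The result for Example 2 can be reproduced and cross-checked in the same way. The sketch below enters the printed values directly and compares the dplyr result with a base R alternative built on ave(); the base R approach is not part of the original example and is shown only as a sanity check, and the names `dplyr_result`, `group_size` and `base_result` are hypothetical.

```r
library(dplyr)

# Values copied from the data frame printed in Example 2
Class <- c("First", "Third", "First", "Second", "First", "Second", "Third",
           "Third", "Third", "Second", "Third", "Second", "Third", "Third",
           "First", "Second", "Second", "First", "First", "Third")
Price <- c(45, 41, 42, 30, 31, 28, 24, 39, 44, 38,
           37, 49, 23, 33, 20, 36, 27, 21, 47, 34)
df2 <- data.frame(Class, Price)

# dplyr: keep rows whose group has at least 8 rows
dplyr_result <- df2 %>%
  group_by(Class) %>%
  filter(n() >= 8) %>%
  ungroup()

# Base R cross-check: ave() with FUN = length returns, for every row,
# the number of rows in that row's group
group_size <- ave(seq_along(df2$Class), df2$Class, FUN = length)
base_result <- df2[group_size >= 8, ]

# Both approaches keep the same 8 rows of class "Third"
nrow(dplyr_result)
nrow(base_result)
```

First and Second each occur 6 times here, so only Third meets the threshold of 8.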

