How to remove the first few rows from each group in R?

To remove the first few rows from each group in R, group the data frame with the group_by function of the dplyr package and then use the slice function to keep only the rows that come after the ones to be dropped.

For example, if a data frame called df contains a grouping column called Grp, then the first 2 rows of each group can be removed by using the command given below −

df %>% group_by(Grp) %>% slice(3:n())
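
The command above works as intended when every group has at least three rows. For a group with fewer rows, 3:n() counts downward (for example, 3:1 is 3, 2, 1), so slice may keep rows that were meant to be dropped. The following is a minimal sketch of one alternative, assuming a hypothetical data frame called small, that filters on row_number() instead −

```r
library(dplyr)

# Hypothetical data: group "A" has three rows, group "B" has only one
small <- data.frame(Grp = c("A", "A", "A", "B"), x = 1:4)

# In base R, 3:1 counts downward; this is what 3:n() becomes when n() is 1
3:1

# Keep only rows whose position within the group is greater than 2;
# groups with two or fewer rows contribute no rows at all
safe <- small %>%
  group_by(Grp) %>%
  filter(row_number() > 2) %>%
  ungroup()
safe
```

Here only the third row of group A remains, and group B is dropped entirely because it has fewer than three rows.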

Example 1

The following snippet creates a sample data frame −

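# Randomly assign each of 20 rows to one of three groups
Group <- sample(c("India", "China", "UK"), 20, replace = TRUE)
# Draw 20 distinct integer scores between 20 and 50
Int_Score <- sample(20:50, 20)
df1 <- data.frame(Group, Int_Score)
df1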

Output

The following data frame is created (the values are sampled at random, so they will differ from run to run) −

  Group Int_Score
1  UK     25
2  UK     28
3  India  38
4  China  49
5  China  33
6  India  42
7  India  21
8  UK     46
9  India  20
10 India  43
11 China  37
12 UK     40
13 India  32
14 China  26
15 India  41
16 UK     24
17 UK     48
18 UK     39
19 India  35
20 India  22

To load the dplyr package and remove the first two rows from each group in df1, add the following code to the above snippet −

library(dplyr)
df1 %>% group_by(Group) %>% slice(3:n())

Output

Running all of the code above as a single program produces the following output −

# A tibble: 14 x 2
# Groups: Group [3]
   Group  Int_Score
   <chr>  <int>
1  China   37
2  China   26
3  India   21
4  India   20
5  India   43
6  India   32
7  India   41
8  India   35
9  India   22
10 UK      46
11 UK      40
12 UK      24
13 UK      48
14 UK      39
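
As a quick sanity check, the number of rows per group before and after the removal can be compared. The sketch below rebuilds df1 from the values printed above so that it is reproducible; the names trimmed, before, after and check are introduced here only for illustration −

```r
library(dplyr)

# df1 rebuilt from the values printed in Example 1
df1 <- data.frame(
  Group = c("UK", "UK", "India", "China", "China", "India", "India", "UK",
            "India", "India", "China", "UK", "India", "China", "India",
            "UK", "UK", "UK", "India", "India"),
  Int_Score = c(25L, 28L, 38L, 49L, 33L, 42L, 21L, 46L, 20L, 43L,
                37L, 40L, 32L, 26L, 41L, 24L, 48L, 39L, 35L, 22L)
)

# Remove the first two rows of each group, as in the example
trimmed <- df1 %>% group_by(Group) %>% slice(3:n()) %>% ungroup()

# Count rows per group before and after, then put the counts side by side
before <- count(df1, Group, name = "rows_before")
after <- count(trimmed, Group, name = "rows_after")
check <- inner_join(before, after, by = "Group")
check
```

Each group should show exactly two fewer rows after the removal (China 4 to 2, India 9 to 7, UK 7 to 5), which accounts for the drop from 20 rows to 14.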

Example 2

The following snippet creates a sample data frame −

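# Randomly assign each of 20 rows to one of three classes
Class <- sample(c("I", "II", "III"), 20, replace = TRUE)
# Draw 20 Poisson-distributed counts with mean 5
Response <- rpois(20, 5)
df2 <- data.frame(Class, Response)
df2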

Output

The following data frame is created (the values are generated at random, so they will differ from run to run) −

 Class Response
1   II   1
2    I   7
3  III  10
4    I   3
5  III   3
6   II   2
7    I   6
8  III   3
9   II   5
10   I   6
11   I   4
12 III   3
13  II   4
14 III   1
15 III   4
16 III   8
17 III   8
18 III   4
19 III   4
20   I   6

To remove the first two rows from each group in df2, add the following code to the above snippet −

df2 %>% group_by(Class) %>% slice(3:n())

Output

Running all of the code above as a single program produces the following output −

# A tibble: 14 x 2
# Groups: Class [3]
  Class Response
  <chr> <int>
1    I   6
2    I   6
3    I   4
4    I   6
5   II   5
6   II   4
7  III   3
8  III   3
9  III   1
10 III   4
11 III   8
12 III   8
13 III   4
14 III   4
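
Both examples follow the same pattern, so it can be wrapped in a small function. The following is only a sketch: drop_first_rows and its parameter k are hypothetical names, not part of dplyr, and the function uses filter with row_number() rather than slice so that groups with k or fewer rows simply return nothing. The data frame df2 is rebuilt from the values printed in Example 2 −

```r
library(dplyr)

# Hypothetical helper: drop the first k rows of each group.
# {{ group_col }} lets the caller pass the grouping column unquoted.
drop_first_rows <- function(data, group_col, k = 2) {
  data %>%
    group_by({{ group_col }}) %>%
    filter(row_number() > k) %>%
    ungroup()
}

# df2 rebuilt from the values printed in Example 2
df2 <- data.frame(
  Class = c("II", "I", "III", "I", "III", "II", "I", "III", "II", "I",
            "I", "III", "II", "III", "III", "III", "III", "III", "III", "I"),
  Response = c(1L, 7L, 10L, 3L, 3L, 2L, 6L, 3L, 5L, 6L,
               4L, 3L, 4L, 1L, 4L, 8L, 8L, 4L, 4L, 6L)
)

result <- drop_first_rows(df2, Class, k = 2)
result
```

The result contains the same 14 rows as the slice output above. Note that filter keeps rows in their original order, so they may not appear sorted by Class.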
raja
Published on 06-Nov-2021 06:48:10