How to remove multiple rows from an R data frame using the dplyr package?


Data sets often contain information that does not help us reach our analytical objective. It could be a single case, several cases, or a whole variable, and we usually want to remove it. To remove multiple rows from an R data frame with the help of the dplyr package, we can use the anti_join function. It returns the rows of the first data frame that have no match in the second, so passing a subset of the data frame as the second argument drops exactly those rows.
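As a minimal sketch of this idea, the snippet below uses a small hypothetical data frame called toy (it is not part of the example that follows). Supplying the by argument is optional, but it avoids the "Joining, by = ..." message that dplyr prints when it has to guess the matching columns.

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
toy <- data.frame(id = 1:6, score = c(10, 20, 30, 40, 50, 60))

# The rows we want to drop (rows 2 to 4 of toy)
to_remove <- toy[2:4, ]

# Keep every row of toy that has no match in to_remove
kept <- anti_join(toy, to_remove, by = c("id", "score"))
kept
```

Here kept should hold the three rows with id 1, 5 and 6, in their original order.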

Example

Consider the following data frame:

> set.seed(2514)
> x1<-rnorm(20,5)
> x2<-rnorm(20,5,0.05)
> df1<-data.frame(x1,x2)
> df1

Output

     x1      x2
1 5.567262 4.998607
2 5.343063 4.931962
3 2.211267 5.034461
4 5.092191 5.075641
5 3.883282 4.997900
6 5.950218 5.038626
7 4.903268 5.010087
8 7.462286 4.974513
9 5.056762 5.097812
10 6.031768 5.002989
11 3.814416 4.990552
12 3.359167 4.891964
13 5.304671 4.950883
14 4.768564 4.953290
15 3.842797 4.950219
16 5.270018 4.995953
17 6.344269 5.008545
18 5.366249 4.905290
19 5.547608 5.098554
20 5.266844 5.003416

Loading dplyr package:

> library(dplyr)

Removing rows 1 to 5 from df1:

> anti_join(df1,df1[1:5,])
Joining, by = c("x1", "x2")
     x1       x2
1 5.950218 5.038626
2 4.903268 5.010087
3 7.462286 4.974513
4 5.056762 5.097812
5 6.031768 5.002989
6 3.814416 4.990552
7 3.359167 4.891964
8 5.304671 4.950883
9 4.768564 4.953290
10 3.842797 4.950219
11 5.270018 4.995953
12 6.344269 5.008545
13 5.366249 4.905290
14 5.547608 5.098554
15 5.266844 5.003416
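The call above picks the rows to drop by position (df1[1:5,]) and then removes them by matching values. When the rows are known by position, a positional verb may be a more direct fit. The sketch below is an alternative, not the method used in this article: it applies dplyr's slice function with negative indices to a hypothetical data frame called toy2.

```r
library(dplyr)

# Hypothetical data, used only for illustration
toy2 <- data.frame(id = 1:8, value = letters[1:8])

# Negative indices drop rows by position: remove rows 1 to 3
no_first_three <- slice(toy2, -(1:3))

# The positions do not have to be contiguous: remove rows 2, 5 and 8
no_scattered <- slice(toy2, -c(2, 5, 8))

no_first_three
no_scattered
```

No join message is printed in this case, because no columns are being matched.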

Removing rows 11 to 18 from df1:

> anti_join(df1,df1[11:18,])
Joining, by = c("x1", "x2")
     x1       x2
1 5.567262 4.998607
2 5.343063 4.931962
3 2.211267 5.034461
4 5.092191 5.075641
5 3.883282 4.997900
6 5.950218 5.038626
7 4.903268 5.010087
8 7.462286 4.974513
9 5.056762 5.097812
10 6.031768 5.002989
11 5.547608 5.098554
12 5.266844 5.003416

Removing rows 6 to 12 from df1:

> anti_join(df1,df1[6:12,])
Joining, by = c("x1", "x2")
     x1       x2
1 5.567262 4.998607
2 5.343063 4.931962
3 2.211267 5.034461
4 5.092191 5.075641
5 3.883282 4.997900
6 5.304671 4.950883
7 4.768564 4.953290
8 3.842797 4.950219
9 5.270018 4.995953
10 6.344269 5.008545
11 5.366249 4.905290
12 5.547608 5.098554
13 5.266844 5.003416

Removing rows 15 to 20 from df1:

> anti_join(df1,df1[15:20,])
Joining, by = c("x1", "x2")
     x1      x2
1 5.567262 4.998607
2 5.343063 4.931962
3 2.211267 5.034461
4 5.092191 5.075641
5 3.883282 4.997900
6 5.950218 5.038626
7 4.903268 5.010087
8 7.462286 4.974513
9 5.056762 5.097812
10 6.031768 5.002989
11 3.814416 4.990552
12 3.359167 4.891964
13 5.304671 4.950883
14 4.768564 4.953290

Removing rows 5 to 18 from df1:

> anti_join(df1,df1[5:18,])
Joining, by = c("x1", "x2")
     x1       x2
1 5.567262 4.998607
2 5.343063 4.931962
3 2.211267 5.034461
4 5.092191 5.075641
5 5.547608 5.098554
6 5.266844 5.003416

Removing rows 11 to 20 from df1:

> anti_join(df1,df1[11:20,])
Joining, by = c("x1", "x2")
    x1        x2
1 5.567262 4.998607
2 5.343063 4.931962
3 2.211267 5.034461
4 5.092191 5.075641
5 3.883282 4.997900
6 5.950218 5.038626
7 4.903268 5.010087
8 7.462286 4.974513
9 5.056762 5.097812
10 6.031768 5.002989

Removing rows 1 to 10 from df1:

> anti_join(df1,df1[1:10,])
Joining, by = c("x1", "x2")
      x1      x2
1 3.814416 4.990552
2 3.359167 4.891964
3 5.304671 4.950883
4 4.768564 4.953290
5 3.842797 4.950219
6 5.270018 4.995953
7 6.344269 5.008545
8 5.366249 4.905290
9 5.547608 5.098554
10 5.266844 5.003416

Removing rows 2 to 11 from df1:

> anti_join(df1,df1[2:11,])
Joining, by = c("x1", "x2")
     x1       x2
1 5.567262 4.998607
2 3.359167 4.891964
3 5.304671 4.950883
4 4.768564 4.953290
5 3.842797 4.950219
6 5.270018 4.995953
7 6.344269 5.008545
8 5.366249 4.905290
9 5.547608 5.098554
10 5.266844 5.003416
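One caveat is worth keeping in mind: anti_join matches on values, not on row positions, so any other row that has the same values in the matching columns is removed as well. With continuous random numbers such as x1 and x2 this is unlikely to matter, but the hypothetical data frame dup below, which is an assumption made only for this illustration, shows the effect.

```r
library(dplyr)

# Hypothetical data in which rows 1 and 3 are identical
dup <- data.frame(g = c("a", "b", "a", "c"), n = c(1, 2, 1, 3))

# Intention: remove only row 1
res <- anti_join(dup, dup[1, ], by = c("g", "n"))
res  # row 3 is dropped too, because it has the same values as row 1

# Filtering on the row position removes row 1 only
by_position <- filter(dup, row_number() != 1)
by_position
```

If duplicated rows must be kept, removing by position (or matching on a unique key column) is the safer choice.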

Updated on: 06-Nov-2020
