How to remove rows in a data.table object with NA's in R?


If a row contains a missing value, the sum of that row is NA, which is not finite. We can therefore pass the row sums to the is.finite function and use the result to keep only the rows without NA's. For example, if we have a data.table object called DT whose columns are all numeric and some of its rows contain NA's, those rows can be removed by using DT[is.finite(rowSums(DT))]. Note that rowSums works only on numeric columns, and that is.finite also drops rows containing Inf, -Inf or NaN.
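
To see why this works, the following minimal sketch walks through the intermediate values on a small table. The table DT_demo and its values are made up for illustration and are not part of the examples in this article.

```r
library(data.table)

# Hypothetical three-row table; only the second row has a missing value.
DT_demo <- data.table(a = c(1, NA, 3), b = c(10, 20, 30))

row_totals <- rowSums(DT_demo)   # the NA in row 2 propagates into that row's sum
keep <- is.finite(row_totals)    # logical vector: TRUE for rows whose sum is finite

DT_demo[keep]                    # subsetting with the logical vector keeps rows 1 and 3
```

The logical vector keep has one element per row, so using it inside DT_demo[...] selects exactly the rows whose sum is finite.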

Example 1

Loading data.table package and creating a data.table object −

> library(data.table)
> x1<-sample(c(1,NA),20,replace=TRUE)
> x2<-rpois(20,5)
> DT1<-data.table(x1,x2)
> DT1

Output

   x1  x2
1:  1   2
2:  NA  4
3:  1   2
4:  NA  5
5:  1   6
6:  1   8
7:  NA  3
8:  1   5
9:  1   6
10: 1   6
11: NA  5
12: 1   4
13: NA  8
14: NA  5
15: 1   4
16: NA  5
17: NA  4
18: 1   5
19: 1   8
20: 1   5

Removing rows from DT1 that contain NA’s −

> DT1[is.finite(rowSums(DT1))]

Output

   x1 x2
1:  1  2
2:  1  2
3:  1  6
4:  1  8
5:  1  5
6:  1  6
7:  1  6
8:  1  4
9:  1  4
10: 1  5
11: 1  8
12: 1  5
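
The same rows can also be removed with complete.cases from base R or with na.omit, which has a method for data.table objects. The sketch below compares the three approaches; the table name DT_alt and the seed are assumptions made only so that the snippet is reproducible, so its values will differ from the output shown above.

```r
library(data.table)

set.seed(1)  # arbitrary seed, used only to make this sketch reproducible
DT_alt <- data.table(x1 = sample(c(1, NA), 20, replace = TRUE),
                     x2 = rpois(20, 5))

by_sum   <- DT_alt[is.finite(rowSums(DT_alt))]  # approach used in this article
by_cases <- DT_alt[complete.cases(DT_alt)]      # base R helper, TRUE for rows without NA
by_omit  <- na.omit(DT_alt)                     # na.omit method supplied by data.table

# On an all-numeric table without infinite values, all three keep the same rows.
nrow(by_sum) == nrow(by_cases) && nrow(by_cases) == nrow(by_omit)
```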

Example 2

Creating a second data.table object −

> y1<-sample(c(5,NA),20,replace=TRUE)
> y2<-rnorm(20)
> DT2<-data.table(y1,y2)
> DT2

Output

   y1      y2
1: NA  -0.1011854033
2: NA   0.0852494741
3: 5   -3.0690178687
4: NA  -0.2443067757
5: NA  -1.7802490517
6: 5   -0.8969211846
7: 5   -0.0414789991
8: 5    1.7043093000
9: 5   -0.2734151106
10: 5   0.5297258605
11: 5  -0.3614407993
12: 5   0.4282377599
13: 5  -0.9532251956
14: 5  -2.6958281110
15: 5  -0.5990272270
16: 5   0.3634742009
17: NA  0.2436549088
18: NA -0.0004956819
19: 5   0.8576350551
20: 5  -0.5816740589

Removing rows from DT2 that contain NA’s −

> DT2[is.finite(rowSums(DT2))]

Output

   y1   y2
1: 5  -3.0690179
2: 5  -0.8969212
3: 5  -0.0414790
4: 5   1.7043093
5: 5  -0.2734151
6: 5   0.5297259
7: 5  -0.3614408
8: 5   0.4282378
9: 5  -0.9532252
10: 5 -2.6958281
11: 5 -0.5990272
12: 5  0.3634742
13: 5  0.8576351
14: 5 -0.5816741
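
The is.finite(rowSums()) approach has two limits that are worth checking before relying on it: it treats infinite values like missing ones, and it needs every column to be numeric. The sketch below illustrates both cases with made-up tables (DT_inf and DT_chr are hypothetical names, not objects from the examples above).

```r
library(data.table)

# Hypothetical table: row 2 has an NA, row 3 has Inf but no NA.
DT_inf <- data.table(v1 = c(1, NA, 3, 4), v2 = c(2, 5, Inf, 8))

finite_rows   <- DT_inf[is.finite(rowSums(DT_inf))]  # drops rows 2 and 3
complete_rows <- DT_inf[complete.cases(DT_inf)]      # drops only row 2

# Hypothetical table with a character column.
DT_chr <- data.table(id = c("a", "b", "c"), val = c(1, NA, 3))

# rowSums(DT_chr) stops with an error because 'id' is not numeric,
# so a helper that does not rely on sums is used instead.
chr_clean <- na.omit(DT_chr)                         # keeps rows "a" and "c"
```

If rows with infinite values should be kept, or if the table mixes numeric and non-numeric columns, complete.cases or na.omit may be the safer choice.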

Updated on: 04-Mar-2021
