How to subset a data.table object in R by specifying columns that contain NA?


To subset a data.table object by specifying columns that contain NA, we can follow the steps below −

  • First, create a data.table object in which some columns contain NAs.

  • Then, use is.na along with the subset function to keep the rows in which the specified columns contain NA.
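Before the full example, the two steps can be sketched on a small table with fixed values, so the result is the same on every run. The table toy and its columns a, b and v are hypothetical names chosen for illustration only.

```r
library(data.table)

# Hypothetical toy data with fixed values (an assumption for illustration,
# not the random table used in the example that follows)
toy <- data.table(a = c(1, NA, 3, 4),
                  b = c(NA, 2, 3, 4),
                  v = c(10, 20, NA, 40))

# Keep the rows in which column a or column b is NA
toy_na <- subset(toy, is.na(a) | is.na(b))
print(toy_na)
```

Only the first two rows are kept here. The third row has an NA in v, but v is not one of the columns named in the condition, so that row is dropped.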

Example

Create the data.table object

Let’s create a data.table object as shown below −

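library(data.table)

# Each column draws 25 values, with replacement, from NA plus a few rounded normal values
x1 <- sample(c(NA, round(rnorm(2), 2)), 25, replace = TRUE)
x2 <- sample(c(NA, round(rnorm(3), 2)), 25, replace = TRUE)
x3 <- sample(c(NA, round(rnorm(3), 2)), 25, replace = TRUE)
x4 <- sample(c(NA, round(rnorm(2), 2)), 25, replace = TRUE)

# Combine the four vectors into a data.table and print it
DT <- data.table(x1, x2, x3, x4)
DT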

Output

On execution, the above script generates the output below (it will vary on your system due to randomization) −

       x1    x2    x3    x4
 1: -2.34 -0.57    NA    NA
 2: -2.34 -0.57 -0.85 -0.47
 3:    NA -0.57    NA -0.47
 4: -2.34 -0.57 -0.84  0.69
 5:    NA -0.57  1.82  0.69
 6:  1.14 -2.03  1.82    NA
 7: -2.34    NA -0.84    NA
 8:  1.14  0.63 -0.85    NA
 9:    NA    NA -0.84 -0.47
10:  1.14    NA    NA -0.47
11: -2.34    NA -0.84    NA
12:    NA    NA -0.85    NA
13:  1.14  0.63 -0.84    NA
14: -2.34  0.63 -0.84    NA
15: -2.34 -2.03  1.82    NA
16:    NA -2.03  1.82    NA
17:    NA    NA    NA -0.47
18:  1.14 -2.03    NA    NA
19:    NA  0.63  1.82    NA
20: -2.34    NA  1.82 -0.47
21:  1.14  0.63    NA    NA
22:  1.14    NA -0.85 -0.47
23: -2.34 -2.03    NA -0.47
24:  1.14  0.63  1.82 -0.47
25: -2.34    NA    NA  0.69
       x1    x2    x3    x4
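
Before subsetting, it can help to check which columns actually contain NA. The following is a minimal sketch on a small table with fixed values. The names toy, na_counts and na_cols are hypothetical and used only for illustration.

```r
library(data.table)

# Hypothetical toy data with fixed values (not the random DT shown above)
toy <- data.table(a = c(1, NA, 3),
                  b = c(NA, NA, 3),
                  v = c(1, 2, 3))

# Count the NAs in each column; sapply() walks over the columns of the table
na_counts <- sapply(toy, function(col) sum(is.na(col)))

# Names of the columns that contain at least one NA
na_cols <- names(na_counts)[na_counts > 0]
print(na_counts)
print(na_cols)
```

The same two lines should work on DT as well. The counts will differ from run to run there, because DT is built from random samples.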

Subset data.table object by specifying columns having NAs

Use is.na along with the subset function to keep the rows of the data.table object DT in which column x1 or column x2 contains NA, as shown below −

# Reuse the DT created above; calling sample() again would build a different table
subset(DT, is.na(x1) | is.na(x2))

Output

       x1    x2    x3    x4
 1:    NA -0.57    NA -0.47
 2:    NA -0.57  1.82  0.69
 3: -2.34    NA -0.84    NA
 4:    NA    NA -0.84 -0.47
 5:  1.14    NA    NA -0.47
 6: -2.34    NA -0.84    NA
 7:    NA    NA -0.85    NA
 8:    NA -2.03  1.82    NA
 9:    NA    NA    NA -0.47
10:    NA  0.63  1.82    NA
11: -2.34    NA  1.82 -0.47
12:  1.14    NA -0.85 -0.47
13: -2.34    NA    NA  0.69
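
The same filter can also be written without subset. The sketch below is an alternative, not the method used in this article. It compares subset with data.table's own DT[i] form and with a variant that takes the column names as a character vector. The table DT2, the vector vals and the fixed seed are assumptions introduced so that the sketch is repeatable.

```r
library(data.table)

set.seed(42)  # assumption: a fixed seed, used here only so the sketch is repeatable
vals <- c(NA, -1.5, 0.5, 2)  # hypothetical values, not those from the example above
DT2 <- data.table(x1 = sample(vals, 25, replace = TRUE),
                  x2 = sample(vals, 25, replace = TRUE),
                  x3 = sample(vals, 25, replace = TRUE),
                  x4 = sample(vals, 25, replace = TRUE))

# 1. subset(), as used in the example above
res_subset <- subset(DT2, is.na(x1) | is.na(x2))

# 2. The same filter written in data.table's DT[i] form
res_bracket <- DT2[is.na(x1) | is.na(x2)]

# 3. Columns supplied as a character vector: build one logical vector per
#    column with is.na(), then combine them with OR
cols <- c("x1", "x2")
keep <- Reduce(`|`, lapply(DT2[, cols, with = FALSE], is.na))
res_cols <- DT2[keep]

print(res_bracket)
```

The third form is convenient when the column names are only known at run time, since cols can hold any number of names. Replacing `|` with `&` in Reduce would instead keep the rows in which all of the listed columns are NA.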
raja
Updated on 15-Nov-2021 09:57:24
