How to create a subset of an R data frame having complete cases of a particular column?


When a data frame contains missing values (NA), a row is a complete case only if none of its values is missing. Sometimes we only need the rows that are complete for one particular column, regardless of missing values in the other columns. To extract them, we can subset the data frame with the negation of is.na applied to that column.
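
As a minimal sketch of the idea, the snippet below builds a small hand-made data frame so the result is easy to check by eye. The names toy, id, score, grp, keep and toy_score are hypothetical and used only for illustration.

```r
# Hypothetical toy data: score is missing in rows 2 and 4, grp in rows 2 and 5
toy <- data.frame(id    = 1:5,
                  score = c(10, NA, 30, NA, 50),
                  grp   = c("a", NA, "b", "b", NA))

# is.na() marks the missing entries; "!" flips it to mark the present ones
keep <- !is.na(toy$score)
keep   # TRUE FALSE TRUE FALSE TRUE

# Use the logical vector as a row index; the empty column slot keeps all columns
toy_score <- toy[keep, ]
toy_score
```

Rows 1, 3 and 5 are kept. Row 5 stays even though grp is NA there, because only the score column is tested.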

Consider the below data frame −

Example


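set.seed(123)  # fix the seed so the sampling is repeatable
# Six columns of length 20, each sampled with replacement from values that include NA
x <- sample(c(0, 1, NA), 20, replace = TRUE)
y <- sample(c(0:2, NA), 20, replace = TRUE)
z <- sample(c(0:5, NA), 20, replace = TRUE)
a <- sample(c(7, 11, 13, NA), 20, replace = TRUE)
b <- sample(c(51, NA), 20, replace = TRUE)
c <- sample(c(rnorm(2, 1, 0.05), NA), 20, replace = TRUE)
df <- data.frame(x, y, z, a, b, c)
df  # print the data frame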

Output

    x  y  z  a  b         c
1   0 NA  0 13 51 0.9985727
2  NA  2  2  7 NA        NA
3   1  2  2 11 51        NA
4  NA NA  2 11 NA 0.9985727
5  NA  2  1 NA 51 0.9985727
6   0  2  0 11 51 1.0126659
7   1  2  1 NA NA        NA
8  NA  2  3 NA NA 1.0126659
9   1  1  1 NA NA 1.0126659
10  1  0 NA 11 51        NA
11 NA NA  0 NA 51        NA
12  1 NA  3 13 NA 1.0126659
13 NA  2  5 13 51 0.9985727
14  1 NA  0  7 NA        NA
15  0  0  3 11 51 0.9985727
16 NA  1  1  7 51 0.9985727
17  0 NA  0 11 NA 0.9985727
18  0  0  5 13 51 1.0126659
19  0  1 NA 11 51 1.0126659
20 NA  0  2  7 NA 1.0126659

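Subsetting with complete cases of column x (the outputs in the following examples come from a different random draw of df than the one printed above, so the values differ, but each result keeps exactly the rows in which the named column is not NA) −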

Example

df[!is.na(df$x), ]

Output

    x  y  z  a  b        c
4   1 NA  0 13 NA 1.013801
6   1  2 NA 11 NA       NA
7   1  1  2  7 NA 1.013801
8   1  0  3 13 51 1.061420
10  0  2 NA 13 NA       NA
11  1 NA  4 11 51       NA
12  1  1  2 NA 51 1.013801
13  0  0  5 13 NA 1.013801
14  1  2  0 NA 51       NA
16  0  0  4 11 51 1.061420
19  0 NA NA NA 51 1.013801
20  0  1  4 11 NA 1.013801

Subsetting with complete cases of column y −

Example

df[!is.na(df$y), ]

Output

    x  y  z  a  b        c
1  NA  0  4 11 NA 1.013801
2  NA  0 NA  7 51       NA
3  NA  2  0  7 51 1.061420
5  NA  1  1  7 51 1.013801
6   1  2 NA 11 NA       NA
7   1  1  2  7 NA 1.013801
8   1  0  3 13 51 1.061420
9  NA  1  4  7 NA       NA
10  0  2 NA 13 NA       NA
12  1  1  2 NA 51 1.013801
13  0  0  5 13 NA 1.013801
14  1  2  0 NA 51       NA
15 NA  2  1 NA 51 1.061420
16  0  0  4 11 51 1.061420
18 NA  2  3 13 NA 1.013801
20  0  1  4 11 NA 1.013801

Subsetting with complete cases of column z −

Example

df[!is.na(df$z), ]

Output

    x  y  z  a  b        c
1  NA  0  4 11 NA 1.013801
3  NA  2  0  7 51 1.061420
4   1 NA  0 13 NA 1.013801
5  NA  1  1  7 51 1.013801
7   1  1  2  7 NA 1.013801
8   1  0  3 13 51 1.061420
9  NA  1  4  7 NA       NA
11  1 NA  4 11 51       NA
12  1  1  2 NA 51 1.013801
13  0  0  5 13 NA 1.013801
14  1  2  0 NA 51       NA
15 NA  2  1 NA 51 1.061420
16  0  0  4 11 51 1.061420
17 NA NA  4 11 NA       NA
18 NA  2  3 13 NA 1.013801
20  0  1  4 11 NA 1.013801

Subsetting with complete cases of column a −

Example

df[!is.na(df$a), ]

Output

    x  y  z  a  b        c
1  NA  0  4 11 NA 1.013801
2  NA  0 NA  7 51       NA
3  NA  2  0  7 51 1.061420
4   1 NA  0 13 NA 1.013801
5  NA  1  1  7 51 1.013801
6   1  2 NA 11 NA       NA
7   1  1  2  7 NA 1.013801
8   1  0  3 13 51 1.061420
9  NA  1  4  7 NA       NA
10  0  2 NA 13 NA       NA
11  1 NA  4 11 51       NA
13  0  0  5 13 NA 1.013801
16  0  0  4 11 51 1.061420
17 NA NA  4 11 NA       NA
18 NA  2  3 13 NA 1.013801
20  0  1  4 11 NA 1.013801

Subsetting with complete cases of column b −

Example

df[!is.na(df$b), ]

Output

    x  y  z  a  b        c
2  NA  0 NA  7 51       NA
3  NA  2  0  7 51 1.061420
5  NA  1  1  7 51 1.013801
8   1  0  3 13 51 1.061420
11  1 NA  4 11 51       NA
12  1  1  2 NA 51 1.013801
14  1  2  0 NA 51       NA
15 NA  2  1 NA 51 1.061420
16  0  0  4 11 51 1.061420
19  0 NA NA NA 51 1.013801

Subsetting with complete cases of column c −

Example

df[!is.na(df$c), ]

Output

    x  y  z  a  b        c
1  NA  0  4 11 NA 1.013801
3  NA  2  0  7 51 1.061420
4   1 NA  0 13 NA 1.013801
5  NA  1  1  7 51 1.013801
7   1  1  2  7 NA 1.013801
8   1  0  3 13 51 1.061420
12  1  1  2 NA 51 1.013801
13  0  0  5 13 NA 1.013801
15 NA  2  1 NA 51 1.061420
16  0  0  4 11 51 1.061420
18 NA  2  3 13 NA 1.013801
19  0 NA NA NA 51 1.013801
20  0  1  4 11 NA 1.013801
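
The same single-column filter can be written in a few other ways in base R. The sketch below is only an illustration on a small hand-made data frame; the names toy, id, score, grp, by_isna, by_complete, by_subset and both are hypothetical, while complete.cases and subset are base R functions.

```r
# Hypothetical toy data: score is missing in rows 2 and 4, grp in rows 2 and 5
toy <- data.frame(id    = 1:5,
                  score = c(10, NA, 30, NA, 50),
                  grp   = c("a", NA, "b", "b", NA))

# Three ways to keep the rows where score is present
by_isna     <- toy[!is.na(toy$score), ]
by_complete <- toy[complete.cases(toy$score), ]
by_subset   <- subset(toy, !is.na(score))

# Requiring two columns to be present at once: pass both to complete.cases()
both <- toy[complete.cases(toy[, c("score", "grp")]), ]

by_isna$id   # 1 3 5
both$id      # 1 3
```

For one column, !is.na is the most direct choice. complete.cases becomes more convenient once several columns have to be non-missing together.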

Updated on: 14-Oct-2020
