How to remove NA’s from an R data frame that contains them at different places?


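If NA values sit at different positions in the columns of an R data frame, the row-wise na.omit(df) drops every row that contains at least one NA, which can throw away a lot of valid data. An alternative is to remove the NA’s from each column separately and pack the remaining values together. This can be done in base R, but dplyr makes it a one-liner: we can use the summarise_each function of dplyr with na.omit to remove all the NA’s. Note that summarise_each and funs are deprecated in recent dplyr releases, so the calls shown here may emit deprecation warnings. If the data frame has more than one column, the number of non-NA values must be the same in all the columns, otherwise the shortened columns cannot be combined into one data frame. Also keep in mind that values on the same row of the result do not necessarily come from the same row of the original data.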
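
Before compacting the columns, it can help to confirm that every column really holds the same number of non-NA values. A minimal base R sketch, using a small made-up data frame called toy (hypothetical, not part of the original example):

```r
# Hypothetical toy data: NA's sit at different positions in each column
toy <- data.frame(a = c(NA, 1, 2, NA, 3),
                  b = c(10, NA, NA, 20, 30))

# Count the non-NA values per column
non_na_counts <- colSums(!is.na(toy))
print(non_na_counts)

# TRUE when all columns would shrink to the same length
same_length <- length(unique(non_na_counts)) == 1
print(same_length)
```

If same_length is FALSE, the column-wise removal described above would produce columns of different lengths, which cannot be stored side by side in a data frame.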

Example

Consider the following data frame:

> x1 <- rep(c(NA, 2, 3), times = c(7, 10, 3))
> x2 <- rep(c(15, NA, 24, NA, 18), times = c(5, 2, 5, 5, 3))
> df1 <- data.frame(x1, x2)
> df1

Output

   x1 x2
1  NA 15
2  NA 15
3  NA 15
4  NA 15
5  NA 15
6  NA NA
7  NA NA
8   2 24
9   2 24
10  2 24
11  2 24
12  2 24
13  2 NA
14  2 NA
15  2 NA
16  2 NA
17  2 NA
18  3 18
19  3 18
20  3 18

Loading dplyr package and removing NA’s from df1:

Example

> library(dplyr)
> # Apply na.omit() to every column of df1
> df1 %>% summarise_each(funs(na.omit(.)))

Output

   x1 x2
1   2 15
2   2 15
3   2 15
4   2 15
5   2 15
6   2 24
7   2 24
8   2 24
9   2 24
10  2 24
11  3 18
12  3 18
13  3 18
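
The same result can be obtained without the deprecated summarise_each and funs. The sketch below assumes dplyr 1.1.0 or later, where reframe is available, and rebuilds df1 so that it runs on its own. The name df1_clean is introduced here only for illustration:

```r
library(dplyr)

# Same data as df1 above
x1 <- rep(c(NA, 2, 3), times = c(7, 10, 3))
x2 <- rep(c(15, NA, 24, NA, 18), times = c(5, 2, 5, 5, 3))
df1 <- data.frame(x1, x2)

# Keep only the non-NA values of every column. reframe() accepts results
# with any number of rows (assumption: dplyr >= 1.1.0 is installed).
df1_clean <- df1 %>%
  reframe(across(everything(), ~ .x[!is.na(.x)]))

nrow(df1_clean)  # 13
```

Here across(everything(), ...) applies the filter to each column in turn, which plays the role that funs(na.omit(.)) plays in the older call.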

Let’s have a look at another example:

Example

> y1 <- rep(c(545, NA, 524, NA, 589, NA, 537, NA, 541, NA), each = 2)
> y2 <- rep(c(NA, 2.1, NA, 1.7, NA), times = c(4, 4, 4, 6, 2))
> df2 <- data.frame(y1, y2)
> df2

Output

    y1  y2
1  545  NA
2  545  NA
3   NA  NA
4   NA  NA
5  524 2.1
6  524 2.1
7   NA 2.1
8   NA 2.1
9  589  NA
10 589  NA
11  NA  NA
12  NA  NA
13 537 1.7
14 537 1.7
15  NA 1.7
16  NA 1.7
17 541 1.7
18 541 1.7
19  NA  NA
20  NA  NA

Removing NA’s from df2:

> df2 %>% summarise_each(funs(na.omit(.)))

Output

    y1  y2
1  545 2.1
2  545 2.1
3  524 2.1
4  524 2.1
5  589 1.7
6  589 1.7
7  537 1.7
8  537 1.7
9  541 1.7
10 541 1.7
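
For comparison, the column-wise removal can also be sketched in base R, next to the row-wise na.omit. The block rebuilds df2 so that it runs on its own. The names by_column and by_row are introduced here only for illustration:

```r
# Same data as df2 above
y1 <- rep(c(545, NA, 524, NA, 589, NA, 537, NA, 541, NA), each = 2)
y2 <- rep(c(NA, 2.1, NA, 1.7, NA), times = c(4, 4, 4, 6, 2))
df2 <- data.frame(y1, y2)

# Column-wise: drop the NA's of each column, then rebuild the data frame.
# This works only because both columns keep the same number of values.
by_column <- as.data.frame(lapply(df2, function(col) col[!is.na(col)]))

# Row-wise: na.omit() keeps only the rows with no NA in any column
by_row <- na.omit(df2)

nrow(by_column)  # 10
nrow(by_row)     # 6
```

The column-wise version keeps all 10 non-NA values of each column but no longer preserves which y1 value originally sat next to which y2 value. The row-wise version keeps the original pairing but retains only the 6 rows that are complete.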

Updated on: 06-Nov-2020
