How to remove all rows having NA in R?


To remove all rows that contain NA, we can use the na.omit() function. For example, if we have a data frame called df that contains some NA values, then we can remove all rows that contain at least one NA by using the command na.omit(df).

That means if the data frame has more than one column, any row with even a single NA in any column is removed. Check out the examples below to understand how it works.
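Before the larger examples, the behaviour can be checked on a small hand-made data frame. This is only a minimal sketch: the column names a and b and the values are hypothetical, chosen so the result is the same on every run.

```r
# Small hand-made data frame (hypothetical values, for illustration only)
df <- data.frame(a = c(1, NA, 3, 4),
                 b = c("x", "y", NA, "z"))

# Row 2 has NA in column a and row 3 has NA in column b,
# so both rows are dropped even though each has only one NA
clean <- na.omit(df)
clean
```

Only rows 1 and 4 remain, since they are the only rows with no NA in any column.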

Example 1

Consider the following data frame −

x1<-sample(c(NA,5,2),20,replace=TRUE)
x2<-sample(c(NA,10,100),20,replace=TRUE)
df1<-data.frame(x1,x2)
df1

The following data frame is created (the values will differ on each run because sample() draws at random) −

   x1  x2
1   5  10
2  NA  10
3   5 100
4  NA  NA
5   5 100
6   2  NA
7   2  10
8   5 100
9  NA  10
10  2  10
11  5  NA
12 NA 100
13 NA  NA
14  2  NA
15 NA  10
16  5 100
17  2  NA
18 NA  NA
19 NA  10
20 NA  NA

To remove the rows of df1 that contain at least one NA, add the following code to the above snippet −

# df1 is the data frame created above; calling sample() again
# would generate new random data and a different result
na.omit(df1)

Output

If you execute all the above given snippets as a single program, it generates the following output −

   x1  x2
1   5  10
3   5 100
5   5 100
7   2  10
8   5 100
10  2  10
16  5 100
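Note that the output keeps the original row labels (1, 3, 5, ...) rather than renumbering them. The sketch below, using a small hypothetical data frame so that the result is reproducible, shows one way to see which rows were dropped and to renumber the remaining rows. It relies on the "na.action" attribute that na.omit() attaches to its result.

```r
# Hypothetical data frame with fixed values (not the random df1 above)
df <- data.frame(x1 = c(5, NA, 5, NA, 2),
                 x2 = c(10, 10, 100, NA, 10))

kept <- na.omit(df)

# Row labels are carried over from df
labels_before <- rownames(kept)

# na.omit() records the positions of the removed rows in an attribute
removed <- as.integer(attr(kept, "na.action"))

# Setting row names to NULL renumbers the rows consecutively
rownames(kept) <- NULL
kept
```

Whether to renumber is a matter of preference: keeping the original labels makes it easy to trace each row back to the source data.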

Example 2

The following snippet creates a sample data frame −

y1<-sample(c(NA,rnorm(2)),20,replace=TRUE)
y2<-sample(c(NA,rnorm(2)),20,replace=TRUE)
y3<-sample(c(NA,rnorm(2)),20,replace=TRUE)
df2<-data.frame(y1,y2,y3)
df2

The following data frame is created (the values will differ on each run because sample() and rnorm() are random) −

          y1        y2        y3
1         NA -1.779384        NA
2         NA -1.779384 0.7194928
3  0.5985389  0.389119 1.2007584
4         NA        NA        NA
5  1.2319630 -1.779384 1.2007584
6         NA        NA        NA
7  0.5985389  0.389119 1.2007584
8  0.5985389        NA 1.2007584
9  0.5985389  0.389119 0.7194928
10        NA        NA 0.7194928
11        NA        NA        NA
12 1.2319630        NA 0.7194928
13 0.5985389        NA 0.7194928
14 1.2319630        NA 0.7194928
15 1.2319630 -1.779384 0.7194928
16 0.5985389 -1.779384 1.2007584
17 0.5985389 -1.779384 0.7194928
18 0.5985389  0.389119 1.2007584
19        NA -1.779384        NA
20 0.5985389  0.389119 1.2007584

To remove the rows of df2 that contain at least one NA, add the following code to the above snippet −

# df2 is the data frame created above; calling sample() and rnorm()
# again would generate new random data and a different result
na.omit(df2)

Output

If you execute all the above given snippets as a single program, it generates the following output −

          y1        y2        y3
3  0.5985389  0.389119 1.2007584
5  1.2319630 -1.779384 1.2007584
7  0.5985389  0.389119 1.2007584
9  0.5985389  0.389119 0.7194928
15 1.2319630 -1.779384 0.7194928
16 0.5985389 -1.779384 1.2007584
17 0.5985389 -1.779384 0.7194928
18 0.5985389  0.389119 1.2007584
20 0.5985389  0.389119 1.2007584
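The same rows can also be selected with base R's complete.cases(), which returns TRUE for each row that has no NA. This is shown only as an alternative sketch, not as the method used above; the data frame below uses hypothetical fixed values so the result does not change between runs.

```r
# Hypothetical three-column data frame with fixed values
df <- data.frame(y1 = c(0.5, NA, 1.2, NA),
                 y2 = c(NA, -1.7, 0.3, NA),
                 y3 = c(1.2, 0.7, 0.7, NA))

# TRUE only for rows in which every column is non-missing
keep <- complete.cases(df)

# Logical indexing keeps the same rows that na.omit() keeps
by_index <- df[keep, ]
by_omit <- na.omit(df)

by_index
by_omit
```

Here only row 3 is complete, so both approaches return that single row. The logical vector from complete.cases() can be handy when the count of complete rows is needed, for example with sum(keep).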

Updated on: 02-Sep-2023
