What is the difference between na.omit and na.rm in R?


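na.omit() is a function that removes missing values from an object before any calculation is done: applied to a vector it drops the NA elements, and applied to a data frame it drops every row that contains at least one NA. na.rm, on the other hand, is an argument accepted by many summary functions such as sum(), mean() and rowMeans(); setting na.rm=TRUE tells the function to skip NA values while computing, without changing the object itself. For a plain vector the two approaches give the same result. For example, if a vector has five elements and one of them is NA, both sum(x,na.rm=TRUE) and sum(na.omit(x)) add up the four non-missing values. The difference shows up in row-wise calculations on a data frame: with na.rm=TRUE a result is returned for every row, using whatever non-missing values the row has, whereas with na.omit a result is returned only for the rows that have no NA at all.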
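The vector case can be sketched as follows. The values in x are arbitrary and chosen only for illustration; they do not come from the data frames used in this article.

```r
# Illustrative vector: five elements, one of them NA (values are arbitrary)
x <- c(4, 1, NA, 3, 2)

sum(x)                  # NA, because the missing value propagates
sum(x, na.rm = TRUE)    # 10, sum() skips the NA while adding
sum(na.omit(x))         # 10, the NA is dropped before sum() sees it

length(x)               # 5, na.rm does not change x
length(na.omit(x))      # 4, na.omit() returns a shorter vector
```

Both calls give the same sum here; what differs is where the NA is handled, inside sum() or before it is called.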

Consider the following data frame −

Example

# sample() draws at random, so the values will differ from run to run
x1<-sample(c(NA,5,2),20,replace=TRUE)
x2<-sample(c(NA,rpois(5,1)),20,replace=TRUE)
df1<-data.frame(x1,x2)
df1

Output

   x1 x2
1   5  2
2   2  0
3   5  0
4   5  0
5   5  0
6   2  0
7   2  0
8  NA  0
9   2 NA
10  2  0
11  2  0
12 NA  0
13  2  2
14 NA  0
15  2  2
16  5  0
17 NA  0
18  2  0
19 NA NA
20 NA NA

Finding row means using na.rm and na.omit −

rowMeans(df1,na.rm=TRUE)

[1] 3.5 1.0 2.5 2.5 2.5 1.0 1.0 0.0 2.0 1.0 1.0 0.0 2.0 0.0 2.0 2.5 0.0 1.0 NaN
[20] NaN

rowMeans(na.omit(df1))

  1   2   3   4   5   6   7  10  11  13  15  16  18
3.5 1.0 2.5 2.5 2.5 1.0 1.0 1.0 1.0 2.0 2.0 2.5 1.0
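The two outputs above differ in length: na.rm=TRUE gives 20 results (NaN where both values are missing), while na.omit gives 13, one per complete row. A minimal sketch of the same behaviour on a small hand-made data frame follows; d and its values are hypothetical and are not the sampled df1.

```r
# Small hand-made data frame (hypothetical values, not the sampled df1)
d <- data.frame(a = c(5, NA, 2, NA), b = c(2, 0, NA, NA))

# One result per row; a row with no non-missing values gives NaN
rowMeans(d, na.rm = TRUE)   # 3.5 0.0 2.0 NaN

# Rows 2, 3 and 4 contain an NA and are dropped, so only row 1 remains
rowMeans(na.omit(d))        # 3.5

# na.omit() on a data frame keeps the same rows as complete.cases()
identical(rownames(na.omit(d)), rownames(d[complete.cases(d), ]))  # TRUE
```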

Example

y1<-sample(c(NA,rnorm(5)),20,replace=TRUE)
y2<-sample(c(NA,rnorm(5)),20,replace=TRUE)
df2<-data.frame(y1,y2)
df2

Output

           y1          y2
1  -1.8606647          NA
2  -0.2447069          NA
3  -1.8606647 -0.03428118
4   0.4729139          NA
5   0.4729139  1.37315226
6          NA -0.03428118
7  -1.8606647  1.37315226
8  -0.2447069 -1.47198479
9  -0.7419227          NA
10 -0.2447069 -1.47198479
11 -0.2447069 -0.22281980
12 -0.7419227 -0.11284788
13         NA -0.03428118
14         NA -0.11284788
15         NA  1.37315226
16         NA -0.03428118
17 -0.4378451 -0.22281980
18 -0.2447069  1.37315226
19 -0.2447069 -0.03428118
20 -0.4378451 -1.47198479

Finding row means using na.rm and na.omit −

rowMeans(df2,na.rm=TRUE)

[1] -1.86066467 -0.24470688 -0.94747292 0.47291388 0.92303307 -0.03428118
[7] -0.24375621 -0.85834583 -0.74192272 -0.85834583 -0.23376334 -0.42738530
[13] -0.03428118 -0.11284788 1.37315226 -0.03428118 -0.33033246 0.56422269
[19] -0.13949403 -0.95491495

rowMeans(na.omit(df2))

         3          5          7          8         10         11         12
-0.9474729  0.9230331 -0.2437562 -0.8583458 -0.8583458 -0.2337633 -0.4273853
        17         18         19         20
-0.3303325  0.5642227 -0.1394940 -0.9549149
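The names printed above the values (3, 5, 7, ...) are the original row names of the rows that survived na.omit. To see which rows were dropped, one option is the "na.action" attribute that na.omit attaches to its result. The sketch below uses a hypothetical data frame d rather than the sampled df2.

```r
# Hypothetical data frame with an NA in rows 2 and 4
d <- data.frame(y1 = c(0.5, NA, -1.2, 0.3), y2 = c(1.1, 0.4, -0.7, NA))

cleaned <- na.omit(d)
nrow(d)        # 4
nrow(cleaned)  # 2

# na.omit() records the positions of the dropped rows in "na.action"
dropped <- as.integer(attr(cleaned, "na.action"))
dropped        # 2 4
```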

Updated on: 06-Feb-2021
