Why mean is NaN even if na.rm is set to TRUE using dplyr in R?
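With dplyr, setting na.rm to TRUE makes mean() return NaN for any group whose values are all NA: every value is removed, and the mean of an empty vector is 0/0, which R reports as NaN. If na.rm is left out, the same group returns NA instead. Follow the steps below to understand the difference between the two: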
- First of all, create a data frame.
- Summarise the data frame with na.rm set to TRUE if NA exists in the data frame.
- Summarise the data frame without setting na.rm to TRUE.
Create the data frame
Let's create a data frame as shown below:
Group <- rep(c("First", "Second", "Third"), times = c(3, 10, 7))
Response <- rep(c(NA, 3, 4, 5, 7, 8), times = c(3, 2, 5, 2, 4, 4))
df <- data.frame(Group, Response)
df
On executing, the above script generates the output below (rep() is deterministic, so the output is the same on every run):
    Group Response
1   First       NA
2   First       NA
3   First       NA
4  Second        3
5  Second        3
6  Second        4
7  Second        4
8  Second        4
9  Second        4
10 Second        4
11 Second        5
12 Second        5
13 Second        7
14  Third        7
15  Third        7
16  Third        7
17  Third        8
18  Third        8
19  Third        8
20  Third        8
Summarising data frame with na.rm set to TRUE
Load the dplyr package and summarise the data frame df with the mean of Response per group, setting na.rm to TRUE:
library(dplyr)
Group <- rep(c("First", "Second", "Third"), times = c(3, 10, 7))
Response <- rep(c(NA, 3, 4, 5, 7, 8), times = c(3, 2, 5, 2, 4, 4))
df <- data.frame(Group, Response)
# Mean of Response per group, dropping NA values first
df %>% group_by(Group) %>% summarise(mean = mean(Response, na.rm = TRUE))
# A tibble: 3 x 2
  Group    mean
  <chr>   <dbl>
1 First  NaN
2 Second   4.3
3 Third    7.57
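The NaN for the First group can be reproduced in base R without dplyr. The following is a minimal sketch, assuming a hand-written vector named first (a name introduced here only for illustration) that holds the same three missing values as the First group:

```r
# Values of the "First" group from the example: all missing
first <- c(NA_real_, NA_real_, NA_real_)

# na.rm = TRUE drops every NA, so nothing is left to average
kept <- first[!is.na(first)]
n_kept <- length(kept)

# The mean of an empty numeric vector is 0/0, which R reports as NaN
mean_removed <- mean(first, na.rm = TRUE)

# Without na.rm the NA propagates, so the result is NA
mean_kept <- mean(first)

# is.nan() is TRUE only for NaN, while is.na() is TRUE for both NaN and NA
print(c(is.nan(mean_removed), is.nan(mean_kept)))
print(c(is.na(mean_removed), is.na(mean_kept)))
```

Because is.na() also returns TRUE for NaN, code that only checks is.na() treats both results as missing.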
Summarising the data frame without setting na.rm to TRUE
Summarise the data frame df with the mean of Response per group without setting na.rm to TRUE:
library(dplyr)
Group <- rep(c("First", "Second", "Third"), times = c(3, 10, 7))
Response <- rep(c(NA, 3, 4, 5, 7, 8), times = c(3, 2, 5, 2, 4, 4))
df <- data.frame(Group, Response)
# Mean of Response per group; any NA in a group makes that group's mean NA
df %>% group_by(Group) %>% summarise(mean = mean(Response))
# A tibble: 3 x 2
  Group   mean
  <chr>  <dbl>
1 First  NA
2 Second  4.3
3 Third   7.57
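Leaving out na.rm also turns the mean into NA for any group that has only some missing values. One possible way to get NA for all-NA groups while still ignoring NA elsewhere is a small wrapper, sketched below. The helper safe_mean is hypothetical (it is not part of dplyr or base R) and is only one of several ways to do this:

```r
library(dplyr)

Group <- rep(c("First", "Second", "Third"), times = c(3, 10, 7))
Response <- rep(c(NA, 3, 4, 5, 7, 8), times = c(3, 2, 5, 2, 4, 4))
df <- data.frame(Group, Response)

# safe_mean is a hypothetical helper, not a dplyr or base R function:
# it returns NA when every value is missing, otherwise the mean of the
# non-missing values
safe_mean <- function(x) {
  if (all(is.na(x))) NA_real_ else mean(x, na.rm = TRUE)
}

result <- df %>%
  group_by(Group) %>%
  summarise(mean = safe_mean(Response))
result
```

With the example data, the First group should come out as NA rather than NaN, and the other two groups keep the means shown above.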