How to find the number of groupwise missing values in an R data frame?


In data science, we often encounter missing values and must decide whether to replace them with an appropriate value or remove them completely. To choose a replacement strategy, it helps to know how many values are missing. If a data frame has a grouping column, the number of missing values in each group can be found with the aggregate function, as shown in the examples below.

Example 1

Consider the following data frame:

> Group<-sample(c("A","B"),20,replace=TRUE)
> x<-sample(c(NA,2),20,replace=TRUE)
> df1<-data.frame(Group,x)
> df1

Output

   Group  x
1      A  2
2      A NA
3      A NA
4      B  2
5      B  2
6      B NA
7      A  2
8      B NA
9      A  2
10     B NA
11     A NA
12     A  2
13     B  2
14     B  2
15     B NA
16     A NA
17     A  2
18     B  2
19     B NA
20     A NA

Finding the number of missing values per group in df1:

> aggregate(x~Group,data=df1, function(x) {sum(is.na(x))},na.action=NULL)

Output

  Group x
1     A 5
2     B 5
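
The na.action=NULL argument matters here. The formula method of aggregate defaults to na.action = na.omit, which drops rows containing NA before the function is applied, so every count would come out as 0. The following is a minimal sketch of that difference, assuming a small hand-built data frame (the names demo and count_na and the values are illustrative, not part of the example above):

```r
# Hypothetical data frame with known missing values (illustrative only)
demo <- data.frame(
  Group = c("A", "A", "A", "B", "B", "B"),
  x     = c(2, NA, NA, 2, NA, 2)
)

# Helper that counts the NA entries in a vector
count_na <- function(v) sum(is.na(v))

# Default na.action (na.omit) removes the NA rows before counting
with_default <- aggregate(x ~ Group, data = demo, FUN = count_na)

# na.action = NULL keeps the NA rows so they can be counted
with_null <- aggregate(x ~ Group, data = demo, FUN = count_na, na.action = NULL)

print(with_default)
print(with_null)
```

In this sketch, with_default reports 0 for both groups, while with_null reports 2 for group A and 1 for group B.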

Example 2

> Class<-sample(c("First","Second"),20,replace=TRUE)
> Score<-sample(c(NA,10,15),20,replace=TRUE)
> df2<-data.frame(Class,Score)
> df2

Output

    Class Score
1  Second    15
2   First    15
3  Second    10
4   First    10
5   First    15
6  Second    10
7   First    15
8  Second    NA
9  Second    15
10  First    15
11 Second    NA
12 Second    NA
13 Second    NA
14 Second    10
15 Second    NA
16  First    10
17  First    NA
18  First    15
19  First    10
20 Second    NA

Finding the number of missing values per group in df2:

> aggregate(Score~Class,data=df2, function(x) {sum(is.na(x))},na.action=NULL)

Output

   Class Score
1  First     1
2 Second     6
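
As a cross-check on the aggregate result, the same counts can likely be obtained with base R's tapply, which applies a function to a vector split by a grouping factor. This is an alternative sketch, not the method used in the examples above, and the data frame scores is a hypothetical one with hand-picked values:

```r
# Hypothetical data (illustrative only)
scores <- data.frame(
  Class = c("First", "Second", "First", "Second", "Second"),
  Score = c(10, NA, NA, 15, NA)
)

# Split is.na(Score) by Class and sum the TRUE values within each group
na_counts <- tapply(is.na(scores$Score), scores$Class, sum)
print(na_counts)
```

Because is.na is evaluated before the data is grouped, no na.action argument is needed in this variant. The result is a named array rather than a data frame.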

Updated on: 05-Mar-2021
