How to find the frequency of NA values per row in an R data frame?
Because columns represent variables, missing values are usually examined column by column. Sometimes, though, we want to count the missing values (NA) in each row (case) as well, so that they can be replaced based on the characteristics of the case instead of the distribution of the variable. In R, this can be done by applying rowSums to the result of is.na; the examples here first convert that logical matrix to numeric with the apply function.
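As a minimal sketch of the idea described above, the snippet below counts NA values per row in two ways: directly with rowSums(is.na(...)), and with the apply-based form used in this article. The data frame df and its values are made up for illustration and do not come from the examples that follow.

```r
# Small hand-made data frame (illustrative values, not from the article)
df <- data.frame(
  a = c(1, NA, 3, NA),
  b = c(NA, NA, 6, 8),
  c = c("x", "y", NA, "z")
)

# is.na() gives a logical matrix; rowSums() counts TRUE as 1 in each row
na_per_row <- rowSums(is.na(df))

# The apply() form: convert each column to numeric first, then sum by row
na_per_row_apply <- rowSums(apply(is.na(df), 2, as.numeric))

print(na_per_row)        # counts per row: 1 2 1 1
print(na_per_row_apply)  # same counts
```

Both forms give the same counts here. One caveat: on a data frame with a single row, apply returns a plain vector rather than a matrix, so the rowSums call in the second form fails, while rowSums(is.na(df)) still works.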
Example
Consider the below data frame −
set.seed(8)
x1 <- sample(c(NA, 1, 2), 20, replace = TRUE)
x2 <- sample(c(NA, 5, 10), 20, replace = TRUE)
df1 <- data.frame(x1, x2)
df1
Output
   x1 x2
1   2 10
2   1 10
3   2  5
4   2 10
5   1  5
6   2 NA
7   1  5
8  NA 10
9  NA 10
10  1  5
11  2 NA
12  2  5
13  1 NA
14  1  5
15  2  5
16  1 NA
17 NA  5
18 NA 10
19  1  5
20  2  5
Finding the frequency of NA values in rows of the data frame df1 −
Example
df1$NA_frequency <- rowSums(apply(is.na(df1), 2, as.numeric))
df1
Output
   x1 x2 NA_frequency
1   2 10            0
2   1 10            0
3   2  5            0
4   2 10            0
5   1  5            0
6   2 NA            1
7   1  5            0
8  NA 10            1
9  NA 10            1
10  1  5            0
11  2 NA            1
12  2  5            0
13  1 NA            1
14  1  5            0
15  2  5            0
16  1 NA            1
17 NA  5            1
18 NA 10            1
19  1  5            0
20  2  5            0
Let’s have a look at another example −
Example
y1 <- sample(c(NA, 1), 20, replace = TRUE)
y2 <- sample(c(NA, 400), 20, replace = TRUE)
y3 <- sample(c(NA, 35), 20, replace = TRUE)
y4 <- sample(c(NA, 127), 20, replace = TRUE)
df2 <- data.frame(y1, y2, y3, y4)
df2
Output
   y1  y2 y3  y4
1  NA  NA 35 127
2   1  NA NA  NA
3   1 400 35 127
4   1  NA 35 127
5   1  NA 35 127
6  NA 400 NA  NA
7   1 400 NA  NA
8   1  NA 35 127
9   1 400 NA 127
10 NA 400 35 127
11 NA  NA 35  NA
12 NA  NA NA  NA
13 NA  NA 35  NA
14  1 400 35  NA
15  1  NA 35  NA
16 NA 400 35  NA
17  1  NA 35  NA
18 NA 400 35 127
19 NA 400 NA  NA
20  1  NA NA 127
Finding the frequency of NA values in rows of the data frame df2 −
Example
df2$NA_frequency <- rowSums(apply(is.na(df2), 2, as.numeric))
df2
Output
   y1  y2 y3  y4 NA_frequency
1  NA  NA 35 127            2
2   1  NA NA  NA            3
3   1 400 35 127            0
4   1  NA 35 127            1
5   1  NA 35 127            1
6  NA 400 NA  NA            3
7   1 400 NA  NA            2
8   1  NA 35 127            1
9   1 400 NA 127            1
10 NA 400 35 127            1
11 NA  NA 35  NA            3
12 NA  NA NA  NA            4
13 NA  NA 35  NA            3
14  1 400 35  NA            1
15  1  NA 35  NA            2
16 NA 400 35  NA            2
17  1  NA 35  NA            2
18 NA 400 35 127            1
19 NA 400 NA  NA            3
20  1  NA NA 127            2
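When a data frame also holds columns that should not take part in the count (an identifier, for instance), the same approach can be limited to chosen columns. The sketch below assumes a hypothetical data frame df with an id column and a hypothetical vector measure_cols naming the columns to check; neither comes from the examples above.

```r
# Illustrative data frame (hypothetical values and column names)
df <- data.frame(
  id = 1:4,
  y1 = c(NA, 1, 1, NA),
  y2 = c(NA, NA, 400, 400),
  y3 = c(35, NA, 35, NA)
)

# Count NAs only in the measurement columns, leaving 'id' out
measure_cols <- c("y1", "y2", "y3")
df$NA_frequency <- rowSums(is.na(df[measure_cols]))

# Row numbers of the cases with at least two missing values
rows_with_many_na <- which(df$NA_frequency >= 2)

print(df)
print(rows_with_many_na)
```

Selecting the columns by name also keeps the NA_frequency column itself out of the count if the line is run a second time.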