How to find the percentage of missing values in each column of an R data frame?
To find the percentage of missing values in each column of an R data frame, we can combine the colMeans function with the is.na function. is.na marks every missing value as TRUE, and colMeans averages those logical values column by column, which gives the proportion of missing values in each column. Multiplying that result by 100 converts it to a percentage.
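As a minimal sketch of why this works, the snippet below applies the two functions step by step to a small hand-built data frame. The data frame df and its values are hypothetical and chosen only so the result is easy to check by hand.

```r
# Small hand-built data frame (hypothetical values, for illustration only)
df <- data.frame(
  a = c(1, NA, 3, NA),
  b = c("x", "y", NA, "z"),
  c = c(TRUE, FALSE, TRUE, TRUE)
)

# is.na() returns a logical matrix: TRUE wherever a value is missing
na_flags <- is.na(df)

# colMeans() counts TRUE as 1 and FALSE as 0, so each column mean
# is the proportion of missing values in that column
na_percent <- colMeans(na_flags) * 100
print(na_percent)
# a is 50, b is 25, c is 0
```

The same expression works for numeric, character and logical columns, because is.na only records whether a value is missing.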
Check out the examples given below to understand how it can be done.
Example 1
The following snippet creates a sample data frame −
x1 <- sample(c(NA, 1, 2), 20, replace = TRUE)
x2 <- sample(c(NA, 5), 20, replace = TRUE)
x3 <- sample(c(NA, 10, 12), 20, replace = TRUE)
df1 <- data.frame(x1, x2, x3)
df1
Output
The following data frame is created −
   x1 x2 x3
1  NA NA 12
2   2  5 10
3   2  5 12
4   1  5 12
5   1  5 NA
6  NA  5 10
7   1 NA 10
8  NA  5 10
9   2 NA 12
10  2 NA NA
11 NA NA NA
12 NA  5 12
13 NA NA 10
14  1 NA NA
15  2 NA 12
16  1  5 NA
17 NA  5 10
18  2  5 10
19 NA  5 12
20 NA  5 12
To find the percentage of NA in each column of df1, add the following code to the above snippet −
# df1 is the data frame created above; it is not sampled again here,
# so the percentages describe the rows that were printed
colMeans(is.na(df1)) * 100
Output
Because sample() draws values at random, the result changes from run to run. For the data frame shown above, the output is −
x1 x2 x3
45 40 25
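One way to make an example like this reproducible is to fix the random seed with set.seed() before sampling. The sketch below is an illustration rather than part of the original example: the helper make_df1 and the seed value 42 are assumptions chosen here. It rebuilds the Example 1 data frame twice from the same seed and checks that the NA percentages agree.

```r
# Hypothetical helper: rebuilds the Example 1 data frame from a given seed
make_df1 <- function(seed) {
  set.seed(seed)  # fixing the seed makes sample() repeatable
  x1 <- sample(c(NA, 1, 2), 20, replace = TRUE)
  x2 <- sample(c(NA, 5), 20, replace = TRUE)
  x3 <- sample(c(NA, 10, 12), 20, replace = TRUE)
  data.frame(x1, x2, x3)
}

# Two runs with the same (arbitrary) seed give the same percentages
pct_run1 <- colMeans(is.na(make_df1(42))) * 100
pct_run2 <- colMeans(is.na(make_df1(42))) * 100
identical(pct_run1, pct_run2)  # TRUE
```

The actual percentages depend on the seed and on the R version's sampling method, so they are not listed here.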
Example 2
The following snippet creates a sample data frame −
y1 <- sample(c(NA, rnorm(2)), 20, replace = TRUE)
y2 <- sample(c(NA, rnorm(2)), 20, replace = TRUE)
df2 <- data.frame(y1, y2)
df2
Output
The following data frame is created −
          y1          y2
1  -1.407410          NA
2  -1.771819          NA
3  -1.771819          NA
4         NA -0.05582021
5         NA          NA
6  -1.407410 -0.05582021
7         NA          NA
8         NA -0.05582021
9  -1.407410  1.19697209
10 -1.407410          NA
11 -1.771819 -0.05582021
12        NA          NA
13 -1.771819          NA
14 -1.771819 -0.05582021
15        NA -0.05582021
16 -1.407410  1.19697209
17 -1.771819 -0.05582021
18        NA          NA
19 -1.407410 -0.05582021
20 -1.407410  1.19697209
To find the percentage of NA in each column of df2, add the following code to the above snippet −
# df2 is the data frame created above; it is not sampled again here
colMeans(is.na(df2)) * 100
Output
For the data frame shown above (the sampled values vary between runs), the output is −
y1 y2
35 45
Example 3
The following snippet creates a sample data frame −
z1 <- sample(c(NA, round(runif(2, 1, 5), 2)), 20, replace = TRUE)
z2 <- sample(c(NA, round(runif(2, 2, 10), 2)), 20, replace = TRUE)
z3 <- sample(c(NA, round(runif(2, 5, 10), 2)), 20, replace = TRUE)
df3 <- data.frame(z1, z2, z3)
df3
Output
The following data frame is created −
     z1   z2   z3
1  1.69 2.76   NA
2    NA 7.59   NA
3    NA 2.76 9.13
4  4.24   NA 9.13
5  1.69   NA 9.13
6    NA 2.76 8.85
7    NA 7.59   NA
8    NA   NA 9.13
9    NA 7.59   NA
10 1.69 2.76   NA
11 4.24 7.59 8.85
12 1.69   NA 8.85
13 4.24   NA   NA
14   NA   NA 8.85
15 4.24 7.59 9.13
16 4.24 7.59   NA
17 1.69 2.76 9.13
18   NA   NA 9.13
19 4.24 2.76 8.85
20 4.24   NA   NA
To find the percentage of NA in each column of df3, add the following code to the above snippet −
# df3 is the data frame created above; it is not sampled again here
colMeans(is.na(df3)) * 100
Output
For the data frame shown above (the sampled values vary between runs), the output is −
z1 z2 z3
40 40 40
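The examples above happen to produce whole-number percentages because each column has 20 rows. When the row count does not divide evenly, it may help to round the result and sort the columns by how much data they are missing. The following is a small sketch under that assumption; the data frame df and its values are hypothetical.

```r
# Hypothetical three-row data frame, so the percentages are not whole numbers
df <- data.frame(
  p = c(1.5, NA, 2.5),
  q = c(NA, NA, 7.0),
  r = c(3.1, 4.2, 5.3)
)

# Percentage of NA per column, rounded to two decimal places
na_percent <- round(colMeans(is.na(df)) * 100, 2)

# Columns with the most missing values first; names are kept by sort()
na_sorted <- sort(na_percent, decreasing = TRUE)
print(na_sorted)
# q is 66.67, p is 33.33, r is 0
```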
