How to convert a column with missing values to binary with 0 for missing values in R?
To convert a column with missing values into a binary indicator, with 0 for missing values and 1 for non-missing values, we can apply the as.integer function to the result of complete.cases for that column. complete.cases returns TRUE for every entry that is not missing and FALSE for every NA, and as.integer turns those logical values into 1 and 0. For example, if we have a data frame called df with a column x that contains some missing values, then x can be converted to binary by using the command −
as.integer(complete.cases(df$x))
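As a minimal sketch of what this command returns, the snippet below uses a small hand-made data frame; its values are illustrative assumptions and are not taken from the examples in this article. It also compares the result with as.integer(!is.na(df$x)), which for a single atomic column should give the same indicator.

```r
# Hypothetical data frame with two missing values in column x
df <- data.frame(x = c(5, NA, 3, NA, 8))

# complete.cases() gives TRUE for non-missing entries and FALSE for NA;
# as.integer() maps TRUE/FALSE to 1/0
indicator <- as.integer(complete.cases(df$x))
print(indicator)

# For a single atomic vector, !is.na() yields the same logical vector
alternative <- as.integer(!is.na(df$x))
print(identical(indicator, alternative))
```

Either form should work for one column; complete.cases is the one used in the rest of this article.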
Example 1
Consider the following data frame −
> x1<-sample(c(NA,2),20,replace=TRUE)
> y1<-rpois(20,5)
> df1<-data.frame(x1,y1)
> df1
Output
   x1 y1
1  NA  2
2   2  5
3   2 10
4   2  2
5   2  4
6  NA  7
7  NA  5
8  NA  6
9   2  5
10  2  7
11  2  3
12  2  2
13 NA  2
14  2  5
15 NA  6
16 NA  5
17 NA  5
18  2  5
19  2  4
20  2 10
Converting column x1 to binary with 0 for missing values −
> df1$x1<-as.integer(complete.cases(df1$x1))
> df1
Output
   x1 y1
1   0  2
2   1  5
3   1 10
4   1  2
5   1  4
6   0  7
7   0  5
8   0  6
9   1  5
10  1  7
11  1  3
12  1  2
13  0  2
14  1  5
15  0  6
16  0  5
17  0  5
18  1  5
19  1  4
20  1 10
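Note that the assignment above overwrites the original values of x1. If those values are still needed, one option is to store the indicator in a separate column instead. The sketch below assumes a small hypothetical data frame df_demo with the same column names as df1; the column name x1_present is an illustrative choice, not something defined in this article.

```r
# Hypothetical data frame mirroring the structure of df1
df_demo <- data.frame(x1 = c(NA, 2, 2, NA, 2),
                      y1 = c(2, 5, 10, 2, 4))

# Keep x1 untouched and write the 0/1 indicator to a new column
# (x1_present is an illustrative name)
df_demo$x1_present <- as.integer(complete.cases(df_demo$x1))
print(df_demo)
```

Here x1 keeps its NA entries, while x1_present holds 0 where x1 is missing and 1 elsewhere.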
Example 2
> x2<-sample(c(NA,rnorm(2)),20,replace=TRUE)
> y2<-rnorm(20)
> df2<-data.frame(x2,y2)
> df2
Output
         x2          y2
1  0.226603  0.25344032
2  0.226603  1.29778682
3  0.545375 -0.66657868
4        NA -1.69272917
5        NA  0.82631979
6  0.545375 -0.12555785
7  0.545375  0.06530913
8  0.545375  0.28359006
9        NA -0.36156762
10 0.226603  0.50943088
11 0.545375 -0.03497627
12 0.545375  1.04488383
13 0.226603  0.55466746
14 0.545375  2.13492023
15       NA  1.18845284
16 0.545375 -0.32171987
17 0.545375 -0.04996223
18 0.226603 -0.41604823
19 0.226603 -1.11003170
20 0.545375  0.34924872
Converting column x2 to binary with 0 for missing values −
> df2$x2<-as.integer(complete.cases(df2$x2))
> df2
Output
   x2          y2
1   1  0.25344032
2   1  1.29778682
3   1 -0.66657868
4   0 -1.69272917
5   0  0.82631979
6   1 -0.12555785
7   1  0.06530913
8   1  0.28359006
9   0 -0.36156762
10  1  0.50943088
11  1 -0.03497627
12  1  1.04488383
13  1  0.55466746
14  1  2.13492023
15  0  1.18845284
16  1 -0.32171987
17  1 -0.04996223
18  1 -0.41604823
19  1 -1.11003170
20  1  0.34924872
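Both examples pass a single column to complete.cases. When several columns are passed at once, complete.cases works row by row and returns TRUE only for rows with no missing value in any of them, which is usually not what a per-column indicator needs. The sketch below, using a hypothetical data frame df_multi and an illustrative selection of columns, applies the same conversion to each column separately with lapply.

```r
# Hypothetical data frame with missing values in two columns
df_multi <- data.frame(a = c(1, NA, 3),
                       b = c(NA, NA, 6),
                       c = c(7, 8, 9))

# Columns to convert (an illustrative selection)
cols <- c("a", "b")

# Row-wise result for comparison: TRUE only where both a and b are present
row_wise <- complete.cases(df_multi[cols])
print(row_wise)

# Column-wise conversion: each selected column becomes its own 0/1 indicator
df_multi[cols] <- lapply(df_multi[cols],
                         function(col) as.integer(complete.cases(col)))
print(df_multi)
```

Column c is left unchanged because it is not listed in cols.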