
How to remove rows that contain coded missing values for all columns in an R data frame?
Sometimes missing values are coded with a placeholder number such as 1 or 99. If we perform an analysis without replacing those coded values, they are treated as real data and the result becomes difficult to interpret, especially for first-time readers.
Therefore, we might want to remove the rows that contain the coded missing value in every column. For this purpose, we can replace the coded missing values with NA and then remove the rows in which all columns are NA, as shown in the examples given below.
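The two steps described above can be sketched on a small hand-written data frame. This is only a minimal illustration: the data frame df, its columns a and b, and the missing-value code 99 are assumptions made for this sketch.

```r
# Hypothetical data: 99 is assumed to be the code for a missing value
df <- data.frame(a = c(1, 99, 3, 99), b = c(99, 99, 5, 6))

# Step 1: replace every occurrence of the coded value with NA
df[df == 99] <- NA

# Step 2: keep only rows whose NA count is smaller than the number of columns,
# that is, rows with at least one non-missing value
result <- df[rowSums(is.na(df)) < ncol(df), ]

# Row 2 held the code in both columns, so it is dropped; rows 1, 3 and 4 remain
print(result)
```

Rows that are only partly missing are kept on purpose, because the condition removes a row only when every column is NA.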
Example 1
The following snippet creates a data frame in which missing values are assumed to be coded as 1 −
x1<-rpois(20,1)
x2<-rpois(20,1)
df1<-data.frame(x1,x2)
df1
The following dataframe is created −
   x1 x2
1   1  0
2   1  2
3   1  3
4   1  1
5   0  1
6   0  1
7   1  0
8   0  1
9   2  1
10  1  2
11  0  3
12  1  0
13  1  2
14  2  2
15  0  0
16  2  3
17  1  1
18  2  0
19  0  0
20  1  1
To replace the coded missing value 1 with NA in every column of the data frame, add the following code to the above snippet −
x1<-rpois(20,1)
x2<-rpois(20,1)
df1<-data.frame(x1,x2)
df1[df1==1]<-NA
df1
Output
If you execute all the above snippets as a single program, it generates the following output −
   x1 x2
1  NA  0
2  NA  2
3  NA  3
4  NA NA
5   0 NA
6   0 NA
7  NA  0
8   0 NA
9   2 NA
10 NA  2
11  0  3
12 NA  0
13 NA  2
14  2  2
15  0  0
16  2  3
17 NA NA
18  2  0
19  0  0
20 NA NA
To remove the rows that contain the coded missing value (now NA) in all columns of the data frame, add the following code to the above snippet −
df1[rowSums(is.na(df1))<ncol(df1),]
Output
If you execute all the above snippets as a single program, it generates the following output −
   x1 x2
1  NA  0
2  NA  2
3  NA  3
5   0 NA
6   0 NA
7  NA  0
8   0 NA
9   2 NA
10 NA  2
11  0  3
12 NA  0
13 NA  2
14  2  2
15  0  0
16  2  3
18  2  0
19  0  0
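If the coded values in the remaining rows should stay as they are, a possible variant is to build a logical mask instead of overwriting the data with NA. The sketch below is not the method used in this example but an alternative under that assumption. It uses the first six rows of the data frame created above, typed in by hand so that the result is repeatable, and the names is_coded and kept are introduced only for illustration.

```r
# First six rows of the data frame shown above, entered by hand
x1 <- c(1, 1, 1, 1, 0, 0)
x2 <- c(0, 2, 3, 1, 1, 1)
df1 <- data.frame(x1, x2)

# Logical matrix: TRUE wherever a cell equals the assumed missing-value code 1
is_coded <- df1 == 1

# Keep rows in which fewer than all columns hold the code;
# na.rm = TRUE guards against cells that are already NA
kept <- df1[rowSums(is_coded, na.rm = TRUE) < ncol(df1), ]

# Row 4 (1 in both columns) is dropped; the other rows keep their original values
print(kept)
```

With this variant the data frame itself is never modified, so the original codes are still available for later inspection.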
Example 2
The following snippet creates a data frame in which missing values are assumed to be coded as 99 −
y1<-sample(c(1,99),20,replace=TRUE)
y2<-sample(c(5,99),20,replace=TRUE)
df2<-data.frame(y1,y2)
df2
The following dataframe is created −
   y1 y2
1  99  5
2  99  5
3  99  5
4   1 99
5   1 99
6   1  5
7   1 99
8  99 99
9  99 99
10 99 99
11 99 99
12 99  5
13  1 99
14 99  5
15 99  5
16 99 99
17 99  5
18 99 99
19 99 99
20 99  5
To replace the coded missing value 99 with NA in every column of the data frame, add the following code to the above snippet −
y1<-sample(c(1,99),20,replace=TRUE)
y2<-sample(c(5,99),20,replace=TRUE)
df2<-data.frame(y1,y2)
df2[df2==99]<-NA
df2
Output
If you execute all the above snippets as a single program, it generates the following output −
   y1 y2
1  NA  5
2  NA  5
3  NA  5
4   1 NA
5   1 NA
6   1  5
7   1 NA
8  NA NA
9  NA NA
10 NA NA
11 NA NA
12 NA  5
13  1 NA
14 NA  5
15 NA  5
16 NA NA
17 NA  5
18 NA NA
19 NA NA
20 NA  5
To remove the rows that contain the coded missing value (now NA) in all columns of the data frame, add the following code to the above snippet −
df2[rowSums(is.na(df2))<ncol(df2),]
Output
If you execute all the above snippets as a single program, it generates the following output −
   y1 y2
1  NA  5
2  NA  5
3  NA  5
4   1 NA
5   1 NA
6   1  5
7   1 NA
12 NA  5
13  1 NA
14 NA  5
15 NA  5
17 NA  5
20 NA  5
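Since both examples follow the same two steps, they could be wrapped in a small function. The sketch below assumes a helper named drop_all_coded, which is hypothetical and not part of base R. Because sample() draws random values, the output above changes from run to run, so the sketch also calls set.seed() with an arbitrary seed to make the draw repeatable.

```r
# Hypothetical helper (not part of base R): recode `code` as NA, then drop
# the rows in which every column is NA
drop_all_coded <- function(df, code) {
  df[df == code] <- NA
  # drop = FALSE keeps the result a data frame even with a single column
  df[rowSums(is.na(df)) < ncol(df), , drop = FALSE]
}

set.seed(42)  # arbitrary seed so that sample() returns the same draw on every run
y1 <- sample(c(1, 99), 20, replace = TRUE)
y2 <- sample(c(5, 99), 20, replace = TRUE)
df2 <- data.frame(y1, y2)

cleaned <- drop_all_coded(df2, 99)
print(cleaned)
```

The same call works for the first example as drop_all_coded(df1, 1), because the missing-value code is passed as an argument.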