- Trending Categories
Data Structure
Networking
RDBMS
Operating System
Java
MS Excel
iOS
HTML
CSS
Android
Python
C Programming
C++
C#
MongoDB
MySQL
Javascript
PHP
Physics
Chemistry
Biology
Mathematics
English
Economics
Psychology
Social Studies
Fashion Studies
Legal Studies
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
How to combine columns by excluding missing values in R?
If we have a data set that contains missing values at alternate places for each column then we might want to combine the columns by excluding those missing values, this will reduce the data set and the analysis is likely to become easier.
For this purpose, we can use na.exclude function along with apply function as shown in the below given examples.
Example 1
Following snippet creates a sample data frame −
x1<-rep(c(NA,2,10),times=c(5,10,5)) x2<-rep(c(1,3,5,NA),times=c(5,5,5,5)) x3<-rep(c(10,NA,3),times=c(5,10,5)) df1<-data.frame(x1,x2,x3) df1
The following dataframe is created −
x1 x2 x3 1 NA 1 10 2 NA 1 10 3 NA 1 10 4 NA 1 10 5 NA 1 10 6 2 3 NA 7 2 3 NA 8 2 3 NA 9 2 3 NA 10 2 3 NA 11 2 5 NA 12 2 5 NA 13 2 5 NA 14 2 5 NA 15 2 5 NA 16 10 NA 3 17 10 NA 3 18 10 NA 3 19 10 NA 3 20 10 NA 3
To exclude NA’s from df1 and combine the columns, add the following code to the above snippet −
x1<-rep(c(NA,2,10),times=c(5,10,5)) x2<-rep(c(1,3,5,NA),times=c(5,5,5,5)) x3<-rep(c(10,NA,3),times=c(5,10,5)) df1<-data.frame(x1,x2,x3) t(apply(df1,1,na.exclude))
Output
If you execute all the above given snippets as a single program, it generates the following output −
[,1][,2] [1,] 1 10 [2,] 1 10 [3,] 1 10 [4,] 1 10 [5,] 1 10 [6,] 2 3 [7,] 2 3 [8,] 2 3 [9,] 2 3 [10,] 2 3 [11,] 2 5 [12,] 2 5 [13,] 2 5 [14,] 2 5 [15,] 2 5 [16,] 10 3 [17,] 10 3 [18,] 10 3 [19,] 10 3 [20,] 10 3
Example 2
Following snippet creates a sample data frame −
y1<-rep(c(NA,rnorm(5)),times=c(5,2,3,3,3,4)) y2<-rep(c(rnorm(2),NA),times=c(10,5,5)) y3<-rep(c(rnorm(1),NA,rnorm(1)),times=c(5,10,5)) df2<-data.frame(y1,y2,y3) df2
The following dataframe is created −
y1 y2 y3 1 NA 0.1152603 -0.9838989 2 NA 0.1152603 -0.9838989 3 NA 0.1152603 -0.9838989 4 NA 0.1152603 -0.9838989 5 NA 0.1152603 -0.9838989 6 -0.74142593 0.1152603 NA 7 -0.74142593 0.1152603 NA 8 -1.88274271 0.1152603 NA 9 -1.88274271 0.1152603 NA 10 -1.88274271 0.1152603 NA 11 -0.09684216 -1.2886519 NA 12 -0.09684216 -1.2886519 NA 13 -0.09684216 -1.2886519 NA 14 -0.08528031 -1.2886519 NA 15 -0.08528031 -1.2886519 NA 16 -0.08528031 NA 0.1967864 17 -0.80126932 NA 0.1967864 18 -0.80126932 NA 0.1967864 19 -0.80126932 NA 0.1967864 20 -0.80126932 NA 0.1967864
To exclude NA’s from df2 and combine the columns, add the following code to the above snippet −
y1<-rep(c(NA,rnorm(5)),times=c(5,2,3,3,3,4)) y2<-rep(c(rnorm(2),NA),times=c(10,5,5)) y3<-rep(c(rnorm(1),NA,rnorm(1)),times=c(5,10,5)) df2<-data.frame(y1,y2,y3) t(apply(df2,1,na.exclude))
Output
If you execute all the above given snippets as a single program, it generates the following output −
[,1] [,2] [1,] 0.11526026 -0.9838989 [2,] 0.11526026 -0.9838989 [3,] 0.11526026 -0.9838989 [4,] 0.11526026 -0.9838989 [5,] 0.11526026 -0.9838989 [6,] -0.74142593 0.1152603 [7,] -0.74142593 0.1152603 [8,] -1.88274271 0.1152603 [9,] -1.88274271 0.1152603 [10,] -1.88274271 0.1152603 [11,] -0.09684216 -1.2886519 [12,] -0.09684216 -1.2886519 [13,] -0.09684216 -1.2886519 [14,] -0.08528031 -1.2886519 [15,] -0.08528031 -1.2886519 [16,] -0.08528031 0.1967864 [17,] -0.80126932 0.1967864 [18,] -0.80126932 0.1967864 [19,] -0.80126932 0.1967864 [20,] -0.80126932 0.1967864
- Related Articles
- Combine two columns by ignoring missing values if exists in one column in R data frame.
- How to find the number of unique values in a vector by excluding missing values in R?
- How to select columns in R without missing values?
- How to find the length of columns for missing values in R?
- How to add two columns if both contains missing values in R?
- How to find the row mean for columns in an R data frame by ignoring missing values?
- How to calculate row means by excluding NA values in an R data frame?
- How to filter rows by excluding a particular value in columns of the R data frame?
- How to create the plot of a vector that contains missing values by adding the missing values in base R?
- How to combine matrices having same number of columns in R?
- How to combine two columns of a data.table object in R?
- Combine values of two columns separated with hyphen in an R data frame.
- How to impute missing values by random value for a single column in R?
- How to convert a column with missing values to binary with 0 for missing values in R?
- How to divide data frame row values by maximum value in each row excluding 0 in R?
