How to add two columns if both contain missing values in R?


When we add two columns of an R data frame and both columns contain missing values, each row should be handled in one of the following ways −

  • Adding the two values if both are present.

  • Returning the non-missing value if only one of the columns has a missing value.

  • Returning NA if both columns have a missing value.

To do this, we can use the apply function together with the ifelse function, as shown in the examples given below.
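
Before moving to the random examples, the three cases can be checked on a small hand-made data frame. This is only a minimal sketch: the data frame df and its columns a and b are made up for illustration, and a plain if/else is used inside apply because each row yields a single value.

```r
# Hypothetical data covering every case: both present, one missing, both missing
df <- data.frame(a = c(1, 2, NA, NA), b = c(5, NA, 5, NA))

# Row by row: NA when every value is missing, otherwise the sum of the non-missing values
df$Sum <- apply(df[, c("a", "b")], 1, function(x) {
  if (all(is.na(x))) NA else sum(x, na.rm = TRUE)
})

df  # Sum holds 6, 2, 5 and NA
```

A check such as all(is.na(x)) is needed because sum(x, na.rm = TRUE) alone returns 0 for a row in which both values are missing.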

Example 1

The following snippet creates a sample data frame −

x1<-sample(c(NA,1,2),20,replace=TRUE)
x2<-sample(c(NA,5),20,replace=TRUE)
df1<-data.frame(x1,x2)
df1

The following data frame is created (the values will differ on each run because sample draws at random) −

    x1 x2
1   1  NA
2   1   5
3   2   5
4   1   5
5   2   5
6  NA  NA
7   2  NA
8   1  NA
9   1   5
10 NA  NA
11  1  NA
12  2  NA
13 NA  NA
14 NA  NA
15  1  NA
16  1   5
17 NA  NA
18  1  NA
19  2  NA
20 NA   5

To add columns of df1, add the following code to the above snippet −

# Row-wise sum that ignores NA but stays NA when both values are missing
df1$Sum<-apply(cbind(df1$x1,df1$x2),1,function(x) ifelse(all(is.na(x)),NA,sum(x,na.rm=TRUE)))
df1

Output

If you execute all the above given snippets as a single program, it generates the following output −

    x1  x2 Sum
1   1  NA   1
2   1   5   6
3   2   5   7
4   1   5   6
5   2   5   7
6  NA  NA  NA
7   2  NA   2
8   1  NA   1
9   1   5   6
10 NA  NA  NA
11  1  NA   1
12  2  NA   2
13 NA  NA  NA
14 NA  NA  NA
15  1  NA   1
16  1   5   6
17 NA  NA  NA
18  1  NA   1
19  2  NA   2
20 NA   5   5

Example 2

The following snippet creates a sample data frame −

y1<-sample(c(NA,rnorm(2)),20,replace=TRUE)
y2<-sample(c(NA,rnorm(2)),20,replace=TRUE)
df2<-data.frame(y1,y2)
df2

The following data frame is created (the values will differ on each run because sample and rnorm draw at random) −

      y1              y2
1   0.5109281     -0.6697566
2   NA            -0.1898259
3   NA            -0.1898259
4  -0.9540862     -0.6697566
5   NA            -0.1898259
6   NA            -0.1898259
7   NA             NA
8   0.5109281     -0.6697566
9   NA             NA
10  0.5109281      NA
11  0.5109281     -0.6697566
12 -0.9540862      NA
13 -0.9540862     -0.6697566
14  NA            -0.1898259
15  0.5109281     -0.1898259
16  NA            -0.1898259
17  0.5109281      NA
18  NA            -0.6697566
19  NA            -0.1898259
20  NA             NA

To add columns of df2, add the following code to the above snippet −

# Row-wise sum that ignores NA but stays NA when both values are missing
df2$Sum<-apply(cbind(df2$y1,df2$y2),1,function(x) ifelse(all(is.na(x)),NA,sum(x,na.rm=TRUE)))
df2

Output

If you execute all the above given snippets as a single program, it generates the following output −

       y1             y2          Sum
1   0.5109281      -0.6697566  -0.1588286
2   NA             -0.1898259  -0.1898259
3   NA             -0.1898259  -0.1898259
4  -0.9540862      -0.6697566  -1.6238429
5   NA             -0.1898259  -0.1898259
6   NA             -0.1898259  -0.1898259
7   NA              NA          NA
8   0.5109281      -0.6697566  -0.1588286
9   NA              NA          NA
10  0.5109281       NA          0.5109281
11  0.5109281      -0.6697566  -0.1588286
12 -0.9540862       NA         -0.9540862
13 -0.9540862      -0.6697566  -1.6238429
14  NA             -0.1898259  -0.1898259
15  0.5109281      -0.1898259   0.3211022
16  NA             -0.1898259  -0.1898259
17  0.5109281       NA          0.5109281
18  NA             -0.6697566  -0.6697566
19  NA             -0.1898259  -0.1898259
20  NA              NA          NA
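
The same result can also be obtained without a row-by-row apply call. The following is a possible alternative rather than the method used above: it relies on rowSums and a vectorized ifelse, and the data frame df, its values, and the helper names cols and all_missing are assumptions made for illustration.

```r
# Hypothetical data: both present, one missing, both missing
df <- data.frame(y1 = c(0.5, NA, -1, NA), y2 = c(-0.5, -0.25, NA, NA))
cols <- c("y1", "y2")

# TRUE for the rows in which every selected column is NA
all_missing <- rowSums(is.na(df[cols])) == length(cols)

# NA for fully missing rows, otherwise the sum of the non-missing values
df$Sum <- ifelse(all_missing, NA, rowSums(df[cols], na.rm = TRUE))

df  # Sum holds 0, -0.25, -1 and NA
```

Here ifelse is applied once to whole vectors, which is the kind of input it is designed for, and the approach extends to more than two columns by adding names to cols.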

Updated on: 10-Nov-2021
