How to find the sum of every n values if missing values exist in an R data frame?


To find the sum of every n values in the columns of an R data frame that contains missing values, we can use the rowsum function together with the rep function. rep builds a grouping vector that assigns each run of n consecutive rows to the same group, rowsum adds up each column within each group, and na.rm=TRUE ignores the missing values while summing (it does not drop entire rows). For example, if we have a data frame called df that contains 4 columns, each with twenty values, some of them missing, then we can find the column sums for every 4 rows (5 groups in total) by using the command rowsum(df,rep(1:5,each=4),na.rm=TRUE).
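As a minimal sketch of the same idea for an arbitrary block size, the grouping vector can also be built with ceiling(seq_len(nrow(df))/n), which still works when the number of rows is not a multiple of n. The data frame df, the block size n and the names grp and sums below are hypothetical and chosen only for illustration.

```r
# Hypothetical data frame with missing values (not from the article)
df <- data.frame(a = c(1, NA, 3, 4, 5, NA, 7),
                 b = c(NA, 2, 2, NA, 1, 1, 1))

n <- 3  # block size: sum every 3 rows

# Group index: rows 1-3 -> 1, rows 4-6 -> 2, row 7 -> 3 (incomplete last block)
grp <- ceiling(seq_len(nrow(df)) / n)

# Column sums within each block, ignoring missing values
sums <- rowsum(df, grp, na.rm = TRUE)
sums
```

When the number of rows is an exact multiple of n, rep(1:(nrow(df)/n), each=n) produces the same grouping vector.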

Example


x1<-sample(c(NA,rpois(2,5)),20,replace=TRUE)
x2<-sample(c(NA,rpois(2,10)),20,replace=TRUE)
df1<-data.frame(x1,x2)
df1

Output

   x1 x2
1   4 10
2   5 10
3  NA  7
4   4  7
5   4 NA
6   5  7
7   4 NA
8  NA NA
9   5 10
10  4 NA
11 NA NA
12  5 NA
13  4 NA
14  4 NA
15  5 NA
16 NA 10
17 NA NA
18  4  7
19  5  7
20 NA NA

Finding the column sums for every 4 rows in df1 (5 groups of 4 rows each) when missing values exist in the data frame:

Example

rowsum(df1,rep(1:5,each=4),na.rm=TRUE)

Output

   x1   x2
1  13   34
2  13    7
3  14   10
4  13   10
5   9   14
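The first row of this output can be checked by hand from rows 1 to 4 of df1 as printed above. The sketch below copies those four values of x1 and x2 into hypothetical vectors (x1_block1 and x2_block1 are names chosen here, not part of the original example) and sums them with and without na.rm.

```r
# Rows 1-4 of df1, copied from the printed output above
x1_block1 <- c(4, 5, NA, 4)
x2_block1 <- c(10, 10, 7, 7)

s1 <- sum(x1_block1, na.rm = TRUE)  # 13, matches x1 in group 1
s2 <- sum(x2_block1, na.rm = TRUE)  # 34, matches x2 in group 1

# Without na.rm = TRUE, a single NA makes the whole block sum NA
s1_raw <- sum(x1_block1)
```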

Example


y1<-sample(c(NA,rnorm(2)),20,replace=TRUE)
y2<-sample(c(NA,rnorm(2)),20,replace=TRUE)
y3<-sample(c(NA,rnorm(2)),20,replace=TRUE)
df2<-data.frame(y1,y2,y3)
df2

Output

          y1         y2        y3
1  0.9563337 -1.1412663 0.1873961
2  2.4693175  0.5661012 0.1873961
3         NA  0.5661012        NA
4  2.4693175         NA 0.4860115
5         NA         NA        NA
6  0.9563337         NA        NA
7  0.9563337 -1.1412663 0.1873961
8  2.4693175 -1.1412663        NA
9  0.9563337         NA 0.1873961
10 2.4693175  0.5661012 0.4860115
11        NA         NA 0.4860115
12        NA -1.1412663 0.1873961
13        NA -1.1412663        NA
14        NA         NA 0.1873961
15 0.9563337 -1.1412663 0.4860115
16 0.9563337 -1.1412663 0.1873961
17        NA         NA 0.4860115
18 0.9563337 -1.1412663        NA
19 0.9563337 -1.1412663        NA
20 2.4693175  0.5661012 0.4860115

Finding the column sums for every 4 rows in df2 (5 groups of 4 rows each) when missing values exist in the data frame:

Example

rowsum(df2,rep(1:5,each=4),na.rm=TRUE)

Output

     y1           y2          y3
1  5.894969  -0.009063996  0.8608037
2  4.381985  -2.282532593  0.1873961
3  3.425651  -0.575165146  1.3468152
4  1.912667  -3.423798889  0.8608037
5  4.381985  -1.716431443  0.9720230
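One caveat with na.rm=TRUE is that a block made up entirely of missing values sums to 0, which can be mistaken for a real total. The sketch below is one possible way to flag such blocks, assuming we would rather report NA for them. It counts the non-missing values per block and blanks out the sums where that count is zero. The names df, grp, sums0, n_obs and sums are hypothetical.

```r
# Hypothetical data: the first block of 2 rows is entirely missing
df <- data.frame(v = c(NA, NA, 1, 2))
grp <- rep(1:2, each = 2)

# With na.rm = TRUE the all-NA block sums to 0
sums0 <- rowsum(df, grp, na.rm = TRUE)

# Number of non-missing values in each block (numeric matrix)
n_obs <- rowsum((!is.na(df)) * 1, grp)

# Report NA instead of 0 where a block had no observed values
sums <- sums0
sums[n_obs == 0] <- NA
sums
```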

Updated on: 17-Mar-2021
