How to create a new column with a subset of row sums in an R data frame?


A common task in data analysis is adding a column that holds the row sums of only a subset of rows. Because a column needs one value per row, those sums are repeated until their count equals the number of rows in the data frame. The rowSums function combined with the rep function creates this kind of column.
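
As a minimal sketch of the mechanism, assume a small hypothetical data frame named toy (it is not part of the example that follows). The sums of its first two rows are repeated so that every one of its four rows receives a value:

```r
# Hypothetical toy data frame, used only to illustrate the mechanism
toy <- data.frame(a = 1:4, b = c(10, 20, 30, 40))

# Row sums of the first two rows only (unname drops the row labels)
first_two <- unname(rowSums(toy[1:2, ]))

# Repeat the two sums until there is one value per row
toy$total <- rep(first_two, times = nrow(toy) / length(first_two))
print(toy)
```

Computing times from nrow(toy) instead of typing a number keeps the sketch valid if the data frame grows, provided the row count stays a multiple of the number of sums.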

Example

Consider the data frame below:

> set.seed(99)
> x1<-rnorm(20,0.5)
> x2<-rpois(20,2)
> x3<-runif(20,2,10)
> x4<-rnorm(20,0.2)
> x5<-rpois(20,5)
> df<-data.frame(x1,x2,x3,x4,x5)
> df
x1 x2 x3 x4 x5
1 0.7139625 4 9.321058 0.33297863 4
2 0.9796581 2 4.298837 -1.47926432 11
3 0.5878287 3 7.389898 -0.07847958 5
4 0.9438585 4 7.873764 -1.35241100 6
5 0.1371621 2 5.534758 -1.17969925 4
6 0.6226740 4 8.786676 -1.15705659 5
7 -0.3638452 1 6.407712 -0.72113718 5
8 0.9896243 2 9.374095 -0.66681774 9
9 0.1358831 2 2.086996 1.85664439 3
10 -0.7942420 0 8.730721 0.04492028 3
11 -0.2457690 3 2.687042 -1.37655243 2
12 1.4215504 3 7.075115 0.82408260 4
13 1.2500544 3 5.373809 0.53022068 5
14 -2.0085540 5 5.287499 -0.19812226 12
15 -2.5409341 1 6.217131 -0.88139693 5
16 0.5002658 3 2.723290 0.12307794 6
17 0.1059810 0 6.288451 -0.32553662 4
18 -1.2450277 2 2.942365 0.59128965 5
19 0.9986315 4 7.012492 -0.48045326 6
20 0.7709538 1 7.801093 -0.54869693 5

Suppose we want a new column that holds the row sums of the first five rows, repeated until all 20 rows are filled. Five sums repeated four times give 20 values, so it can be done as follows:

> df$x6<-rep(rowSums(df[1:5,]),times=4)
> df
x1 x2 x3 x4 x5 x6
1 0.7139625 4 9.321058 0.33297863 4 18.36800
2 0.9796581 2 4.298837 -1.47926432 11 16.79923
3 0.5878287 3 7.389898 -0.07847958 5 15.89925
4 0.9438585 4 7.873764 -1.35241100 6 17.46521
5 0.1371621 2 5.534758 -1.17969925 4 10.49222
6 0.6226740 4 8.786676 -1.15705659 5 18.36800
7 -0.3638452 1 6.407712 -0.72113718 5 16.79923
8 0.9896243 2 9.374095 -0.66681774 9 15.89925
9 0.1358831 2 2.086996 1.85664439 3 17.46521
10 -0.7942420 0 8.730721 0.04492028 3 10.49222
11 -0.2457690 3 2.687042 -1.37655243 2 18.36800
12 1.4215504 3 7.075115 0.82408260 4 16.79923
13 1.2500544 3 5.373809 0.53022068 5 15.89925
14 -2.0085540 5 5.287499 -0.19812226 12 17.46521
15 -2.5409341 1 6.217131 -0.88139693 5 10.49222
16 0.5002658 3 2.723290 0.12307794 6 18.36800
17 0.1059810 0 6.288451 -0.32553662 4 16.79923
18 -1.2450277 2 2.942365 0.59128965 5 15.89925
19 0.9986315 4 7.012492 -0.48045326 6 17.46521
20 0.7709538 1 7.801093 -0.54869693 5 10.49222

In column x6, the five sums computed from rows 1 to 5 repeat in rows 6 to 20. The same approach can be used with column sums (the colSums function) instead of row sums.
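
The column-sum variant could look like the sketch below. It rebuilds the example data frame so that it runs on its own, and the column name col_sum is a hypothetical choice. It also uses the length.out argument of rep instead of times, which is an assumption about what is convenient here, not something the example above requires:

```r
# Rebuild the example data frame so the sketch runs on its own
set.seed(99)
df <- data.frame(
  x1 = rnorm(20, 0.5),
  x2 = rpois(20, 2),
  x3 = runif(20, 2, 10),
  x4 = rnorm(20, 0.2),
  x5 = rpois(20, 5)
)

# Column sums over the first five rows: one value per column (five here)
col_totals <- unname(colSums(df[1:5, ]))

# length.out recycles the sums to exactly nrow(df) values, even when
# nrow(df) is not a multiple of the number of sums
df$col_sum <- rep(col_totals, length.out = nrow(df))
head(df, 10)
```

The sums are computed before the new column is added. If the data frame already contained an extra column such as x6, colSums would return six values, and the repeated pattern would no longer line up with blocks of five rows.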

raja
Published on 11-Aug-2020 12:41:05