How to create a new column with a subset of row sums in an R data frame?
In data analysis, we sometimes need a new column that holds the row sums of only a subset of rows. Because every column of a data frame must have one value per row, these sums are repeated until their count equals the number of rows in the data frame. The rowSums function combined with the rep function creates this kind of column.
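As a quick illustration of the idea, here is a minimal sketch on a small, hypothetical data frame (the name d and its values are assumptions made up for this illustration, not part of the example that follows). It sums the first two rows and repeats those two sums to fill all six rows.

```r
# Hypothetical data frame, values chosen only for illustration
d <- data.frame(a = 1:6, b = c(10, 20, 30, 40, 50, 60))

# Row sums of the first two rows only: 1 + 10 and 2 + 20
s <- rowSums(d[1:2, ])

# Repeat the two sums so that there is one value per row (6 / 2 = 3 times)
d$total <- rep(s, times = nrow(d) / length(s))

print(d)
```

The times argument only works this way when the number of rows is an exact multiple of the number of sums.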
Example
Consider the below data frame −
> set.seed(99)
> x1<-rnorm(20,0.5)
> x2<-rpois(20,2)
> x3<-runif(20,2,10)
> x4<-rnorm(20,0.2)
> x5<-rpois(20,5)
> df<-data.frame(x1,x2,x3,x4,x5)
> df
           x1 x2       x3          x4 x5
1   0.7139625  4 9.321058  0.33297863  4
2   0.9796581  2 4.298837 -1.47926432 11
3   0.5878287  3 7.389898 -0.07847958  5
4   0.9438585  4 7.873764 -1.35241100  6
5   0.1371621  2 5.534758 -1.17969925  4
6   0.6226740  4 8.786676 -1.15705659  5
7  -0.3638452  1 6.407712 -0.72113718  5
8   0.9896243  2 9.374095 -0.66681774  9
9   0.1358831  2 2.086996  1.85664439  3
10 -0.7942420  0 8.730721  0.04492028  3
11 -0.2457690  3 2.687042 -1.37655243  2
12  1.4215504  3 7.075115  0.82408260  4
13  1.2500544  3 5.373809  0.53022068  5
14 -2.0085540  5 5.287499 -0.19812226 12
15 -2.5409341  1 6.217131 -0.88139693  5
16  0.5002658  3 2.723290  0.12307794  6
17  0.1059810  0 6.288451 -0.32553662  4
18 -1.2450277  2 2.942365  0.59128965  5
19  0.9986315  4 7.012492 -0.48045326  6
20  0.7709538  1 7.801093 -0.54869693  5
Suppose that we want to create a new column that holds the row sums of the first five rows, repeated to fill all 20 rows of the data frame. Since 20 / 5 = 4, the five sums must be repeated four times. It can be done as follows −
> df$x6<-rep(c(rowSums(df[1:5,])),times=4)
> df
           x1 x2       x3          x4 x5       x6
1   0.7139625  4 9.321058  0.33297863  4 18.36800
2   0.9796581  2 4.298837 -1.47926432 11 16.79923
3   0.5878287  3 7.389898 -0.07847958  5 15.89925
4   0.9438585  4 7.873764 -1.35241100  6 17.46521
5   0.1371621  2 5.534758 -1.17969925  4 10.49222
6   0.6226740  4 8.786676 -1.15705659  5 18.36800
7  -0.3638452  1 6.407712 -0.72113718  5 16.79923
8   0.9896243  2 9.374095 -0.66681774  9 15.89925
9   0.1358831  2 2.086996  1.85664439  3 17.46521
10 -0.7942420  0 8.730721  0.04492028  3 10.49222
11 -0.2457690  3 2.687042 -1.37655243  2 18.36800
12  1.4215504  3 7.075115  0.82408260  4 16.79923
13  1.2500544  3 5.373809  0.53022068  5 15.89925
14 -2.0085540  5 5.287499 -0.19812226 12 17.46521
15 -2.5409341  1 6.217131 -0.88139693  5 10.49222
16  0.5002658  3 2.723290  0.12307794  6 18.36800
17  0.1059810  0 6.288451 -0.32553662  4 16.79923
18 -1.2450277  2 2.942365  0.59128965  5 15.89925
19  0.9986315  4 7.012492 -0.48045326  6 17.46521
20  0.7709538  1 7.801093 -0.54869693  5 10.49222
In column x6, the five sums computed from rows 1 to 5 repeat from the sixth row onward. The same approach works for column sums: use colSums in place of rowSums.
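A hedged variation, sketched on a small hypothetical data frame (the names d, cols, row_part and col_part are assumptions for illustration only): the length.out argument of rep recycles the sums to exactly nrow(d) values, which also covers the case where the number of rows is not a multiple of the subset size. Selecting the columns by name keeps a previously added sum column out of later sums.

```r
# Hypothetical data frame with 7 rows, not a multiple of the subset size 3
d <- data.frame(a = 1:7, b = c(2, 4, 6, 8, 10, 12, 14))

# Fix the columns to sum up front so that new columns are never included
cols <- c("a", "b")

# Row sums of the first three rows, recycled to one value per row
d$row_part <- rep(rowSums(d[1:3, cols]), length.out = nrow(d))

# Column sums of the same three rows (one value per column), recycled likewise
d$col_part <- rep(colSums(d[1:3, cols]), length.out = nrow(d))

print(d)
```

With length.out the last repetition is simply cut short, so whether a partial cycle is acceptable depends on the analysis.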