How to standardize data.table object column by group in R?
To standardize a data.table column by group, we can apply the scale function to the column and pass the grouping column through the by argument.
For example, if we have a data.table object called DT with two columns, G and Num, where G is a grouping column and Num is a numerical column, then we can standardize Num within each level of G by using the command given below −
DT[,"Num":=as.vector(scale(Num)),by=G]
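As a minimal sketch of what this command computes: with its default arguments, scale subtracts the mean and divides by the sample standard deviation, so using it with by gives a z-score within each group. The toy values and the column names Manual and Scaled below are made up for illustration and are not part of the original example.

```r
library(data.table)

# Hypothetical toy data: G is the grouping column, Num the numeric column
DT <- data.table(G   = c("a", "a", "a", "b", "b", "b"),
                 Num = c(2, 4, 9, 10, 20, 60))

# Manual z-score within each group, kept in a separate column for comparison
DT[, Manual := (Num - mean(Num)) / sd(Num), by = G]

# Same computation through scale(); scale() returns a one-column matrix
# with attributes, and as.vector() turns it back into a plain numeric vector
DT[, Scaled := as.vector(scale(Num)), by = G]

# TRUE when both columns agree up to floating-point tolerance
same <- isTRUE(all.equal(DT$Manual, DT$Scaled))
same
```

Writing the result to a new column, as done here, leaves Num untouched; assigning back to "Num" as in the command above overwrites the original values.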
Example 1
Consider the below data.table object −
library(data.table)
Grp<-sample(c("Male","Female"),20,replace=TRUE)
Response<-round(rnorm(20,5,1.25),2)
DT1<-data.table(Grp,Response)
DT1
The following data.table is created −
       Grp Response
 1: Female     5.31
 2:   Male     5.20
 3: Female     6.38
 4:   Male     4.53
 5: Female     4.90
 6: Female     4.78
 7:   Male     3.73
 8: Female     6.19
 9:   Male     4.33
10:   Male     7.84
11:   Male     6.70
12: Female     5.11
13:   Male     6.80
14:   Male     3.76
15:   Male     3.56
16:   Male     5.51
17: Female     6.58
18: Female     7.59
19:   Male     4.62
20: Female     6.75
To standardize the Response column by the Grp column in DT1, extend the above snippet as follows −
library(data.table)
Grp<-sample(c("Male","Female"),20,replace=TRUE)
Response<-round(rnorm(20,5,1.25),2)
DT1<-data.table(Grp,Response)
DT1[,"Response":=as.vector(scale(Response)),by=Grp]
DT1
Output
If you execute all the above snippets as a single program, it generates output like the following (the exact values vary between runs because the data are sampled randomly without a fixed seed) −
       Grp    Response
 1: Female -0.66313371
 2:   Male  0.03955265
 3: Female  0.43789692
 4:   Male -0.43061348
 5: Female -1.08502396
 6: Female -1.20850403
 7:   Male -0.99200587
 8: Female  0.24238681
 9:   Male -0.57096158
10:   Male  1.89214752
11:   Male  1.09216337
12: Female -0.86893383
13:   Male  1.16233742
14:   Male -0.97095365
15:   Male -1.11130175
16:   Male  0.25709220
17: Female  0.64369704
18: Female  1.68298763
19:   Male -0.36745684
20: Female  0.81862714
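One way to confirm that the standardization was applied within each group is to summarise the standardized column by Grp: every group should have a mean of about 0 and a standard deviation of 1. The sketch below is a variation on Example 1 under a few assumptions that are not in the original: a fixed seed so the run is repeatable, a balanced shuffle so that both groups are guaranteed to have members, and a hypothetical new column Response_z so that the original Response values are kept.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed for repeatability (not in the original)

# Assumption: shuffle a balanced vector so each group has exactly 10 rows
Grp <- sample(rep(c("Male", "Female"), 10))
Response <- round(rnorm(20, 5, 1.25), 2)
DT1 <- data.table(Grp, Response)

# Store the z-scores in a new column (Response_z is a hypothetical name)
DT1[, Response_z := as.vector(scale(Response)), by = Grp]

# Per-group summary: Mean should be about 0 and SD should be 1
check <- DT1[, .(Mean = mean(Response_z), SD = sd(Response_z)), by = Grp]
check
```

The means are typically not printed as exact zeros because of floating-point rounding, so comparing them against a small tolerance is safer than testing for equality with 0.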
Example 2
The following snippet creates a sample data.table −
library(data.table)
Class<-sample(c("I","II","III"),20,replace=TRUE)
Rate<-round(rnorm(20,10,1.02),0)
DT2<-data.table(Class,Rate)
DT2
The following data.table is created −
    Class Rate
 1:    II   10
 2:   III    9
 3:    II   10
 4:    II   10
 5:   III   10
 6:   III    9
 7:   III    8
 8:    II   10
 9:    II   11
10:   III    9
11:     I    9
12:    II   11
13:   III   13
14:    II   10
15:   III   12
16:     I    8
17:    II    9
18:     I   10
19:   III    9
20:    II   10
To standardize the Rate column by the Class column in DT2, extend the above snippet as follows −
library(data.table)
Class<-sample(c("I","II","III"),20,replace=TRUE)
Rate<-round(rnorm(20,10,1.02),0)
DT2<-data.table(Class,Rate)
DT2[,"Rate":=as.vector(scale(Rate)),by=Class]
DT2
Output
If you execute all the above snippets as a single program, it generates output like the following (the exact values vary between runs because the data are sampled randomly without a fixed seed) −
    Class        Rate
 1:    II -0.18490007
 2:   III -0.50669175
 3:    II -0.18490007
 4:    II -0.18490007
 5:   III  0.07238454
 6:   III -0.50669175
 7:   III -1.08576803
 8:    II -0.18490007
 9:    II  1.47920052
10:   III -0.50669175
11:     I  0.00000000
12:    II  1.47920052
13:   III  1.80961338
14:    II -0.18490007
15:   III  1.23053710
16:     I -1.00000000
17:    II -1.84900065
18:     I  1.00000000
19:   III -0.50669175
20:    II -0.18490007
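The Class I rows are small enough to check by hand. In the data shown for this example, rows 11, 16 and 18 belong to Class I and have Rate values 9, 8 and 10, which have mean 9 and standard deviation 1, so their standardized values should be 0, -1 and 1, matching the output above. A minimal sketch of that check (the variable names rate_I and z are made up for illustration):

```r
# Rate values of the Class I rows (rows 11, 16 and 18 of the data shown above)
rate_I <- c(9, 8, 10)

# Standardize them on their own, exactly as the by = Class step does per group
z <- as.vector(scale(rate_I))
z  # 0 -1 1
```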