How to standardize data.table object column by group in R?
To standardize a column of a data.table object by group, we can apply the scale function to the column and supply the grouping column through the by argument. Within each group, scale subtracts the group mean and divides by the group standard deviation.
For example, if we have a data.table object called DT that contains two columns, G and Num, where G is a grouping column and Num is a numerical column, then we can standardize Num within each level of G by using the command given below −
DT[,"Num":=as.vector(scale(Num)),by=G]
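To make the command above more concrete, here is a minimal sketch on a small hand-made table. The six values in DT and the helper name Manual are assumptions chosen for illustration; only G, Num and the scale call come from the text. It shows why as.vector is used (scale returns a one-column matrix) and that the result agrees with computing (x - mean) / sd within each group.

```r
library(data.table)

# Hypothetical toy data; G and Num follow the naming used in the text
DT <- data.table(G   = c("a", "a", "a", "b", "b", "b"),
                 Num = c(1, 2, 3, 10, 20, 30))

# scale() returns a one-column matrix, which is why the command
# wraps it in as.vector() before assigning it back to the column
ScaleIsMatrix <- is.matrix(scale(DT$Num))

# Manual z-scores per group, computed before Num is overwritten
# (the groups are contiguous here, so the row order is preserved)
Manual <- DT[, .(Z = (Num - mean(Num)) / sd(Num)), by = G]$Z

# The command from the text: standardize Num in place, by group
DT[, "Num" := as.vector(scale(Num)), by = G]

# Both approaches should agree
all.equal(DT$Num, Manual)
```

Because each group is centered and scaled separately, groups a and b end up with the same standardized values even though their raw values differ by a factor of ten.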
Example 1
Consider the below data.table object −
library(data.table)
Grp<-sample(c("Male","Female"),20,replace=TRUE)
Response<-round(rnorm(20,5,1.25),2)
DT1<-data.table(Grp,Response)
DT1
The following data.table is created −
       Grp Response
 1: Female     5.31
 2:   Male     5.20
 3: Female     6.38
 4:   Male     4.53
 5: Female     4.90
 6: Female     4.78
 7:   Male     3.73
 8: Female     6.19
 9:   Male     4.33
10:   Male     7.84
11:   Male     6.70
12: Female     5.11
13:   Male     6.80
14:   Male     3.76
15:   Male     3.56
16:   Male     5.51
17: Female     6.58
18: Female     7.59
19:   Male     4.62
20: Female     6.75
To standardize the Response column by the Grp column in DT1, add the following code to the above snippet −
library(data.table)
Grp<-sample(c("Male","Female"),20,replace=TRUE)
Response<-round(rnorm(20,5,1.25),2)
DT1<-data.table(Grp,Response)
DT1[,"Response":=as.vector(scale(Response)),by=Grp]
DT1
Output
If you execute all the above snippets as a single program, it generates the following output −
       Grp    Response
 1: Female -0.66313371
 2:   Male  0.03955265
 3: Female  0.43789692
 4:   Male -0.43061348
 5: Female -1.08502396
 6: Female -1.20850403
 7:   Male -0.99200587
 8: Female  0.24238681
 9:   Male -0.57096158
10:   Male  1.89214752
11:   Male  1.09216337
12: Female -0.86893383
13:   Male  1.16233742
14:   Male -0.97095365
15:   Male -1.11130175
16:   Male  0.25709220
17: Female  0.64369704
18: Female  1.68298763
19:   Male -0.36745684
20: Female  0.81862714
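As a quick sanity check on this output, the sketch below re-enters the 20 Grp and Response values printed earlier in Example 1. Typing them in by hand instead of re-sampling is an assumption made so that the run is repeatable, since the original snippet sets no seed. The name Check is hypothetical. After standardizing, each group should have a mean of about 0 and a standard deviation of 1.

```r
library(data.table)

# Values copied from the Example 1 printout (not re-sampled)
Grp <- c("Female", "Male", "Female", "Male", "Female", "Female", "Male",
         "Female", "Male", "Male", "Male", "Female", "Male", "Male",
         "Male", "Male", "Female", "Female", "Male", "Female")
Response <- c(5.31, 5.20, 6.38, 4.53, 4.90, 4.78, 3.73, 6.19, 4.33, 7.84,
              6.70, 5.11, 6.80, 3.76, 3.56, 5.51, 6.58, 7.59, 4.62, 6.75)
DT1 <- data.table(Grp, Response)

# Same command as in Example 1
DT1[, "Response" := as.vector(scale(Response)), by = Grp]

# Per-group summary: Mean should be ~0 (up to floating point error)
# and SD should be 1 for both Female and Male
Check <- DT1[, .(Mean = mean(Response), SD = sd(Response)), by = Grp]
Check
```

If either group showed a mean far from 0 or a standard deviation other than 1, that would suggest the by argument was dropped and the column was standardized over all rows at once.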
Example 2
The following snippet creates a sample data.table object −
library(data.table)
Class<-sample(c("I","II","III"),20,replace=TRUE)
Rate<-round(rnorm(20,10,1.02),0)
DT2<-data.table(Class,Rate)
DT2
The following data.table is created −
    Class Rate
 1:    II   10
 2:   III    9
 3:    II   10
 4:    II   10
 5:   III   10
 6:   III    9
 7:   III    8
 8:    II   10
 9:    II   11
10:   III    9
11:     I    9
12:    II   11
13:   III   13
14:    II   10
15:   III   12
16:     I    8
17:    II    9
18:     I   10
19:   III    9
20:    II   10
To standardize the Rate column by the Class column in DT2, add the following code to the above snippet −
library(data.table)
Class<-sample(c("I","II","III"),20,replace=TRUE)
Rate<-round(rnorm(20,10,1.02),0)
DT2<-data.table(Class,Rate)
DT2[,"Rate":=as.vector(scale(Rate)),by=Class]
DT2
Output
If you execute all the above snippets as a single program, it generates the following output −
    Class        Rate
 1:    II -0.18490007
 2:   III -0.50669175
 3:    II -0.18490007
 4:    II -0.18490007
 5:   III  0.07238454
 6:   III -0.50669175
 7:   III -1.08576803
 8:    II -0.18490007
 9:    II  1.47920052
10:   III -0.50669175
11:     I  0.00000000
12:    II  1.47920052
13:   III  1.80961338
14:    II -0.18490007
15:   III  1.23053710
16:     I -1.00000000
17:    II -1.84900065
18:     I  1.00000000
19:   III -0.50669175
20:    II -0.18490007
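Both examples overwrite the original column, so the raw values are lost after the assignment. A possible variation, sketched below, writes the standardized values to a new column instead. The column name RateStd is a hypothetical choice, and the Class and Rate values are typed in from the Example 2 printout (an assumption made so the result is repeatable without a seed).

```r
library(data.table)

# Values copied from the Example 2 printout (not re-sampled)
Class <- c("II", "III", "II", "II", "III", "III", "III", "II", "II", "III",
           "I", "II", "III", "II", "III", "I", "II", "I", "III", "II")
Rate <- c(10, 9, 10, 10, 10, 9, 8, 10, 11, 9,
          9, 11, 13, 10, 12, 8, 9, 10, 9, 10)
DT2 <- data.table(Class, Rate)

# Keep Rate untouched and store the z-scores in a new column;
# RateStd is a hypothetical name, not one used in the article
DT2[, RateStd := as.vector(scale(Rate)), by = Class]

# Inspect one group: Class I has the raw rates 9, 8 and 10
ClassI <- DT2[Class == "I"]
ClassI
```

Keeping both columns side by side makes it easier to see how each raw rate maps to its standardized value within its class.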
