How to standardize a column in an R data frame?
Standardization means rescaling a vector or a column of an R data frame so that its mean becomes 0 and its standard deviation becomes 1. Each value is replaced by its z-score, (x - mean) / standard deviation. This changes the location and scale of the data but not the shape of its distribution, so a standardized column is normally distributed only if the original column was. In R, standardization can be done easily with the scale function, as the example below shows.
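As a minimal sketch of what standardization computes, the snippet below applies the z-score formula by hand to a small made-up vector and compares the result with scale. The vector `v` and the names `z_manual` and `z_scale` are illustrative assumptions and are not part of the data frame used in the rest of this article.

```r
# Illustrative vector; not part of the article's data frame
v <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Manual standardization: subtract the mean, divide by the sample standard deviation
z_manual <- (v - mean(v)) / sd(v)

# scale() returns a one-column matrix; as.numeric() turns it into a plain vector
z_scale <- as.numeric(scale(v))

# The two approaches should agree up to floating-point tolerance
all.equal(z_manual, z_scale)

# After standardization the mean is (numerically) 0 and the standard deviation is 1
mean(z_manual)
sd(z_manual)
```

Note that sd uses the sample standard deviation (denominator n - 1), which is also what scale uses by default when centering is on.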
Example
Consider the below data frame:
> set.seed(3665)
> x1<-rnorm(20,1,0.35)
> x2<-rnorm(20,50,1.25)
> x3<-rnorm(20,125,10.27)
> x4<-rpois(20,5)
> x5<-runif(20,1,5)
> x6<-rexp(20,1.35)
> df<-data.frame(x1,x2,x3,x4,x5,x6)
> df
Output
          x1       x2       x3 x4       x5         x6
1  1.3958185 49.39843 128.5224  3 4.183664 2.33406246
2  1.0467979 48.90103 120.5796  7 3.526731 0.02043217
3  0.9190516 50.74664 110.4765  6 2.145181 0.04268455
4  1.1196425 47.83063 126.3711  9 4.276084 0.87234197
5  1.0033896 51.31879 144.2594  5 3.308073 0.28540083
6  0.7571435 49.92559 109.9660  5 2.349070 0.09613835
7  0.8266129 48.93754 135.5895  3 2.479160 0.15018153
8  1.2786206 50.27384 122.8543  4 4.343062 1.26431542
9  0.8661156 50.36976 122.9482  7 3.517678 0.24045191
10 0.9237285 48.55069 121.6440  4 1.619902 0.72327013
11 0.8191029 49.27937 111.8696  3 4.760655 0.97199973
12 1.2619135 50.91131 129.0021  4 3.355301 1.42184615
13 1.5297983 49.38604 133.4756  1 2.977833 0.50042231
14 0.7858227 47.92899 142.0669  3 3.262058 0.37260602
15 0.5626517 51.22160 107.5586  2 3.194546 0.21176125
16 1.2106700 51.65911 132.4945  3 1.088987 0.78318970
17 1.5351378 48.60769 116.1427  7 3.423079 0.65904040
18 0.4134951 52.09415 125.1567  5 4.309763 0.03658430
19 1.0490230 49.87242 125.6695  6 2.255468 1.32173240
20 0.9521718 50.96409 131.8025  1 4.312514 0.26955446
Standardizing columns of data frame df:
Example
> scale(df$x1)
Output
             [,1]
 [1,]  1.29357316
 [2,]  0.11471258
 [3,] -0.31676647
 [4,]  0.36075419
 [5,] -0.03190435
 [6,] -0.86363126
 [7,] -0.62898965
 [8,]  0.89772243
 [9,] -0.49556454
[10,] -0.30096951
[11,] -0.65435564
[12,]  0.84129223
[13,]  1.74610668
[14,] -0.76676371
[15,] -1.52055135
[16,]  0.66821113
[17,]  1.76414130
[18,] -2.02434637
[19,]  0.12222816
[20,] -0.20489902
attr(,"scaled:center")
[1] 1.012835
attr(,"scaled:scale")
[1] 0.2960661
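The output above is a one-column matrix that carries the column mean ("scaled:center") and standard deviation ("scaled:scale") as attributes. A minimal sketch of storing the standardized values back in the data frame as an ordinary numeric column follows. The column name `x1_std` and the object `s` are hypothetical, and the data frame is rebuilt here with only x1 so that the snippet runs on its own.

```r
# Rebuild one column the same way as the article's x1 (values depend on the RNG)
set.seed(3665)
df <- data.frame(x1 = rnorm(20, 1, 0.35))

# scale() returns a 20 x 1 matrix with attributes; as.numeric() keeps only the values
df$x1_std <- as.numeric(scale(df$x1))

# The attributes record the mean and standard deviation that were used
s <- scale(df$x1)
attr(s, "scaled:center")   # matches mean(df$x1)
attr(s, "scaled:scale")    # matches sd(df$x1)

# Check the stored column: mean close to 0, standard deviation close to 1
mean(df$x1_std)
sd(df$x1_std)
```

Keeping the center and scale values is useful if the same transformation later has to be applied to new data or reversed.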
Example
> scale(df$x2)
Output
             [,1]
 [1,] -0.41047444
 [2,] -0.81044801
 [3,]  0.67365908
 [4,] -1.67118308
 [5,]  1.13374554
 [6,]  0.01343048
 [7,] -0.78108246
 [8,]  0.29346832
 [9,]  0.37059777
[10,] -1.09216479
[11,] -0.50620790
[12,]  0.80607839
[13,] -0.42043493
[14,] -1.59208513
[15,]  1.05558665
[16,]  1.40740638
[17,] -1.04632509
[18,]  1.75723305
[19,] -0.02932018
[20,]  0.84852032
attr(,"scaled:center")
[1] 49.90889
attr(,"scaled:scale")
[1] 1.243585
Example
> scale(df$x3)
Output
             [,1]
 [1,]  0.34992259
 [2,] -0.42214208
 [3,] -1.40419396
 [4,]  0.14081369
 [5,]  1.87961161
 [6,] -1.45381494
 [7,]  1.03687396
 [8,] -0.20103605
 [9,] -0.19190685
[10,] -0.31867565
[11,] -1.26878469
[12,]  0.39654942
[13,]  0.83139424
[14,]  1.66649197
[15,] -1.68782930
[16,]  0.73602324
[17,] -0.85342655
[18,]  0.02276361
[19,]  0.07260960
[20,]  0.66875614
attr(,"scaled:center")
[1] 124.9225
attr(,"scaled:scale")
[1] 10.28772
Example
> scale(df$x4)
Output
            [,1]
 [1,] -0.6550055
 [2,]  1.2164389
 [3,]  0.7485778
 [4,]  2.1521611
 [5,]  0.2807167
 [6,]  0.2807167
 [7,] -0.6550055
 [8,] -0.1871444
 [9,]  1.2164389
[10,] -0.1871444
[11,] -0.6550055
[12,] -0.1871444
[13,] -1.5907277
[14,] -0.6550055
[15,] -1.1228666
[16,] -0.6550055
[17,]  1.2164389
[18,]  0.2807167
[19,]  0.7485778
[20,] -1.5907277
attr(,"scaled:center")
[1] 4.4
attr(,"scaled:scale")
[1] 2.137387
Example
> scale(df$x5)
Output
             [,1]
 [1,]  0.95270619
 [2,]  0.29336256
 [3,] -1.09325500
 [4,]  1.04546450
 [5,]  0.07390313
 [6,] -0.88861840
 [7,] -0.75805159
 [8,]  1.11268817
 [9,]  0.28427648
[10,] -1.62046133
[11,]  1.53181336
[12,]  0.12130380
[13,] -0.25754850
[14,]  0.02771942
[15,] -0.04004115
[16,] -2.15332455
[17,]  0.18933035
[18,]  1.07926704
[19,] -0.98256346
[20,]  1.08202896
attr(,"scaled:center")
[1] 3.23444
attr(,"scaled:scale")
[1] 0.9963449
Example
> scale(df$x6)
Output
            [,1]
 [1,]  2.8372424
 [2,] -1.0124395
 [3,] -0.9754135
 [4,]  0.4050652
 [5,] -0.5715544
 [6,] -0.8864709
 [7,] -0.7965477
 [8,]  1.0572753
 [9,] -0.6463456
[10,]  0.1570224
[11,]  0.5708871
[12,]  1.3193929
[13,] -0.2137772
[14,] -0.4264525
[15,] -0.6940844
[16,]  0.2567235
[17,]  0.0501497
[18,] -0.9855638
[19,]  1.1528122
[20,] -0.5979214
attr(,"scaled:center")
[1] 0.6289008
attr(,"scaled:scale")
[1] 0.6009926
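Calling scale once per column, as above, can be shortened, because scale also accepts a whole data frame or matrix and works column by column. The sketch below is one possible way to standardize every numeric column in a single step while leaving other columns untouched. The small data frame and the names `num_cols` and `df_std` are hypothetical and chosen only for illustration.

```r
# Small illustrative data frame (hypothetical values, not the article's df)
df <- data.frame(a = c(1, 2, 3, 4, 5),
                 b = c(10, 20, 35, 40, 70),
                 g = c("p", "q", "p", "q", "p"))

# Identify the numeric columns so that non-numeric ones are left as they are
num_cols <- sapply(df, is.numeric)

# scale() standardizes each column of its input separately; assigning the
# result into df_std[num_cols] keeps df_std a data frame
df_std <- df
df_std[num_cols] <- scale(df[num_cols])

# Each standardized column should have mean close to 0 and standard deviation close to 1
colMeans(df_std[num_cols])
sapply(df_std[num_cols], sd)
```

Working on a copy (`df_std`) rather than overwriting `df` keeps the original values available for comparison.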