How to standardize only numerical columns in an R data frame if categorical columns also exist?


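A single numerical column can be standardized with the scale function, which subtracts the column mean and divides by the standard deviation. When a data frame also contains categorical columns, scale cannot be applied to the whole data frame directly, so the numerical columns must be selected first. The mutate_if function of the dplyr package does this by applying a function only to the columns that satisfy a predicate. For example, for a data frame df, the numerical columns can be standardized with df%>%mutate_if(is.numeric,scale)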
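To make clear what scale computes, the short sketch below compares it with the z-score formula written out by hand. The vector x is a hypothetical example, not data from this article, and as.numeric is used only to drop the matrix structure that scale returns.

```r
# Hypothetical numeric vector used only for illustration
x <- c(4, 1, 0, 2, 3)

# scale() centers on the mean and divides by the sample standard deviation;
# it returns a one-column matrix, so as.numeric() converts it to a plain vector
z_scale <- as.numeric(scale(x))

# The same z-scores computed by hand
z_manual <- (x - mean(x)) / sd(x)

# Compare the two results up to floating point tolerance
all.equal(z_scale, z_manual)
```

After standardization the values have mean 0 and standard deviation 1, which is the property the examples in this article rely on.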

Example 1

Consider the following data frame −


> x1<-sample(letters[1:4],20,replace=TRUE)
> x2<-rpois(20,2)
> df1<-data.frame(x1,x2)
> df1

Output

   x1 x2
1   c  4
2   c  1
3   a  4
4   a  1
5   b  0
6   c  4
7   c  2
8   a  1
9   c  2
10  d  2
11  b  0
12  b  3
13  c  0
14  d  1
15  a  2
16  d  1
17  a  2
18  d  2
19  c  1
20  a  3

Loading dplyr package and standardizing numerical columns in df1 −

> library(dplyr)
> df1%>%mutate_if(is.numeric,scale)

Output

   x1         x2
1   c  1.7168098
2   c -0.6242945
3   a  1.7168098
4   a -0.6242945
5   b -1.4046626
6   c  1.7168098
7   c  0.1560736
8   a -0.6242945
9   c  0.1560736
10  d  0.1560736
11  b -1.4046626
12  b  0.9364417
13  c -1.4046626
14  d -0.6242945
15  a  0.1560736
16  d -0.6242945
17  a  0.1560736
18  d  0.1560736
19  c -0.6242945
20  a  0.9364417
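In dplyr 1.0.0 and later, mutate_if is superseded by across, so the same step can also be written with mutate and across. The sketch below assumes dplyr 1.0.0 or newer is installed. The data frame df and the helper standardize are hypothetical names introduced for illustration. The helper wraps scale in as.numeric because scale returns a one-column matrix, which would otherwise be stored as a matrix column in the result.

```r
library(dplyr)

# Hypothetical data frame with the same structure as df1
df <- data.frame(
  x1 = c("a", "b", "c", "a", "d"),
  x2 = c(4, 1, 0, 2, 3)
)

# Hypothetical helper: standardize and return a plain numeric vector
# instead of the one-column matrix that scale() produces
standardize <- function(v) as.numeric(scale(v))

# Apply the helper only to columns for which is.numeric() is TRUE
df_std <- df %>%
  mutate(across(where(is.numeric), standardize))

# x1 is left unchanged, x2 is an ordinary numeric column
str(df_std)
```

Whether to keep the matrix returned by scale or convert it with as.numeric is a design choice. The conversion drops the scaled:center and scaled:scale attributes, so keep the raw scale output if those values are needed later.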

Example 2


> y1<-sample(c("S1","S2","S3"),20,replace=TRUE)
> y2<-rnorm(20,34,2.3)
> y3<-rnorm(20,500,47.1)
> df2<-data.frame(y1,y2,y3)
> df2

Output

   y1       y2       y3
1  S2 33.67237 511.9535
2  S2 30.47941 509.6286
3  S3 35.19967 605.8329
4  S2 27.82392 590.1114
5  S2 33.91328 485.1736
6  S1 38.26157 449.6714
7  S3 32.46148 495.2131
8  S3 32.06987 477.6192
9  S2 33.32162 448.6335
10 S2 37.55487 544.3631
11 S2 34.84706 462.9035
12 S1 34.59332 532.0554
13 S2 32.36337 501.9207
14 S2 32.26520 516.7858
15 S3 33.62168 530.5313
16 S3 33.06213 515.0878
17 S1 35.09752 454.7614
18 S3 31.79898 499.8527
19 S1 32.85342 509.8768
20 S3 33.72336 503.8084

Standardizing numerical columns in df2 −

> df2%>%mutate_if(is.numeric,scale)

Output

   y1          y2          y3
1  S2  0.09796633  0.11297890
2  S2 -1.30368623  0.05666468
3  S3  0.76842187  2.38692048
4  S2 -2.46939699  2.00611458
5  S2  0.20372057 -0.53568372
6  S1  2.11253906 -1.39561547
7  S3 -0.43359265 -0.29250727
8  S3 -0.60550146 -0.71866529
9  S2 -0.05600808 -1.42075459
10 S2  1.80231017  0.89800290
11 S2  0.61363310 -1.07510811
12 S1  0.50224659  0.59988493
13 S2 -0.47666141 -0.13003510
14 S2 -0.51975777  0.23002594
15 S3  0.07571152  0.56296787
16 S3 -0.16991946  0.18889687
17 S1  0.72358127 -1.27232444
18 S3 -0.72441871 -0.18012673
19 S1 -0.26153720  0.06267550
20 S3  0.12034948 -0.08431193
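If loading dplyr is not an option, a similar result can be obtained with base R alone. The following is a minimal sketch rather than the method used in this article. The data frame df and its values are hypothetical and only mirror the structure of df2.

```r
# Hypothetical data frame with one categorical and two numerical columns
df <- data.frame(
  y1 = c("S1", "S2", "S3", "S1", "S2"),
  y2 = c(33.7, 30.5, 35.2, 27.8, 33.9),
  y3 = c(512.0, 509.6, 605.8, 590.1, 485.2)
)

# Logical vector marking which columns are numeric
num_cols <- vapply(df, is.numeric, logical(1))

# Standardize only those columns; as.numeric() drops the matrix
# structure that scale() returns
df[num_cols] <- lapply(df[num_cols], function(v) as.numeric(scale(v)))

df
```

The categorical column y1 is untouched because it is excluded by num_cols, while y2 and y3 are each rescaled to mean 0 and standard deviation 1.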
raja
Published on 05-Mar-2021 07:21:06