How to recode factors in R?


Sometimes the levels of a factor can be combined, or we want to group several levels into a single level. This is mostly done when a level has only one observation or when some theoretical reason supports combining the levels. For example, if a data frame called df contains a factor column x with four categories A, B, C, and D, they can be grouped into two categories, A and B, as follows:

df$x[df$x %in% c("A","B")]<-"A"
df$x[df$x %in% c("C","D")]<-"B"
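
The assignment above behaves as expected when x is a character vector, or a factor whose levels already include the replacement labels. For a column stored as a true factor, the same grouping could be sketched with the replacement form of levels() in base R. The vector x below is made-up illustrative data, not taken from this article.

```r
# Hypothetical factor with four levels (illustrative data only)
x <- factor(c("A", "B", "C", "D", "A", "C"))

# Map old levels to new ones: A and B become "A", C and D become "B"
levels(x) <- list(A = c("A", "B"), B = c("C", "D"))

levels(x)   # the factor now carries only the two merged levels
table(x)    # counts per merged level
```

With this form the unused levels C and D are dropped in the same step, so no separate cleanup is needed.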

Example

Consider the following data frame:

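# Draw 20 group labels from A to D and 20 Poisson counts with mean 5
# (no seed is set, so the values differ from run to run)
factor<-sample(LETTERS[1:4],20,replace=TRUE)
response<-rpois(20,5)
df1<-data.frame(factor,response)
df1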

Output

   factor response
1  A      5
2  C      7
3  D      5
4  C     13
5  C      5
6  C      4
7  B      4
8  B     10
9  C      4
10 D      6
11 B      5
12 B      3
13 A      7
14 A      2
15 A      2
16 D      3
17 B      1
18 C      5
19 D      6
20 D      4

Recoding the levels in the factor column of df1:

df1$factor[df1$factor %in% c("A","B")]<-"A"
df1$factor[df1$factor %in% c("C","D")]<-"B"
df1

Output

   factor response
1  A      5
2  B      7
3  B      5
4  B     13
5  B      5
6  B      4
7  A      4
8  A     10
9  B      4
10 B      6
11 A      5
12 A      3
13 A      7
14 A      2
15 A      2
16 B      3
17 A      1
18 B      5
19 B      6
20 B      4
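
After a recoding like the one above, it can help to confirm that only the new categories remain. The following is a minimal sketch on a small made-up data frame d (a hypothetical name, with values that are not the output above); it assumes the column is stored as character and converts it to a factor once the recoding is done.

```r
# Small illustrative data frame (values are made up)
d <- data.frame(factor = c("A", "C", "D", "B", "C"),
                response = c(5, 7, 5, 4, 13),
                stringsAsFactors = FALSE)

# Same recoding steps as in the example
d$factor[d$factor %in% c("A", "B")] <- "A"
d$factor[d$factor %in% c("C", "D")] <- "B"

# Convert to a factor so the column carries exactly the two new levels
d$factor <- factor(d$factor)

levels(d$factor)   # the remaining levels
table(d$factor)    # number of rows in each recoded group
```

Converting after the recoding, rather than before, avoids leaving the old labels behind as empty levels.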

Example 2

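# Draw 20 group labels from G1 to G3 and 20 standard normal values
# (no seed is set, so the values differ from run to run)
grp<-sample(c("G1","G2","G3"),20,replace=TRUE)
Y<-rnorm(20)
df2<-data.frame(grp,Y)
df2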

Output

   grp     Y
1  G3  -0.39900138
2  G3   1.04085657
3  G1   1.46432790
4  G3  -0.90843955
5  G1  -0.15202516
6  G2   1.15456629
7  G2   1.24002828
8  G2  -1.10731484
9  G2   0.27423208
10 G3   1.06444903
11 G2  -0.21824650
12 G1   0.25843090
13 G1   0.07686889
14 G3  -0.21955611
15 G3  -0.05359245
16 G2   0.54630987
17 G3  -0.09808820
18 G1  -0.65171471
19 G2  -0.62371231
20 G2  -0.03319190

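Recoding the levels G1 and G2 in the grp column of df2 as Control. This direct assignment works because data.frame() keeps strings as character columns by default since R 4.0.0; on a true factor column, assigning a label that is not an existing level produces NA with a warning: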

df2$grp[df2$grp %in% c("G1","G2")]<-"Control"
df2

Output

   grp      Y
1  G3      -0.39900138
2  G3       1.04085657
3  Control  1.46432790
4  G3      -0.90843955
5  Control -0.15202516
6  Control  1.15456629
7  Control  1.24002828
8  Control -1.10731484
9  Control  0.27423208
10 G3       1.06444903
11 Control -0.21824650
12 Control  0.25843090
13 Control  0.07686889
14 G3      -0.21955611
15 G3      -0.05359245
16 Control  0.54630987
17 G3      -0.09808820
18 Control -0.65171471
19 Control -0.62371231
20 Control -0.03319190
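
When the grouping column is a true factor, a new label such as Control cannot be assigned element by element because it is not yet one of the levels. One possible approach, sketched here on a made-up factor grp rather than on df2, is to rename the levels themselves; levels that receive the same name are merged.

```r
# Hypothetical grouping column stored as a factor (illustrative data only)
grp <- factor(c("G3", "G1", "G2", "G3", "G2"))

# Rename the levels G1 and G2 to "Control"; identical level names are merged
levels(grp)[levels(grp) %in% c("G1", "G2")] <- "Control"

levels(grp)         # two levels remain: Control and G3
as.character(grp)   # the recoded values
```

This changes only the level labels, so the underlying observations keep their order.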

Updated on: 06-Mar-2021
