How to convert a data frame with categorical columns to numeric in R?

We might want to convert categorical columns to numeric, for example to apply parametric methods to ordinal or nominal data. If the categories are represented by letters or words, converting them through factor() numbers them by the sorted (alphabetical) order of the distinct values, comparing whole strings rather than only the first character. To understand the conversion, check out the below examples.
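
A minimal sketch of how factor() assigns the codes, using a hypothetical vector `taste` that is not part of the examples in this article. The levels are the sorted distinct values, and each code is the position of a value among those levels. "Cold" and "Cool" share a first letter but still receive different codes:

```r
# Hypothetical input, chosen so that two categories share a first letter
taste <- c("Cold", "Hot", "Bitter", "Cool")

f <- factor(taste)       # levels are the sorted distinct values
levels(f)                # "Bitter" "Cold" "Cool" "Hot"

codes <- as.numeric(f)   # position of each value among the levels
codes                    # 2 4 1 3
```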

Example 1

Consider the below data frame −

# Four character columns, each holding 20 values sampled from A to D
set.seed(100)
x1<-sample(LETTERS[1:4],20,replace=TRUE)
x2<-sample(LETTERS[1:4],20,replace=TRUE)
x3<-sample(LETTERS[1:4],20,replace=TRUE)
x4<-sample(LETTERS[1:4],20,replace=TRUE)
df1<-data.frame(x1,x2,x3,x4)
df1

Output

   x1 x2 x3 x4
1   B  C  C  B
2   C  D  A  A
3   B  B  D  A
4   D  A  C  A
5   C  D  D  B
6   A  C  B  D
7   B  C  B  C
8   B  D  A  C
9   D  B  A  C
10  C  A  B  A
11  D  B  B  A
12  B  C  A  B
13  B  D  C  D
14  D  D  C  B
15  C  B  A  C
16  B  D  C  A
17  B  D  A  B
18  C  D  D  D
19  C  A  C  C
20  C  C  C  B

Converting the columns in df1 to numeric (as.matrix() pools the values of all columns, so every column shares one set of factor levels) −

Example

df1[]<-as.numeric(factor(as.matrix(df1)))
df1

Output

   x1 x2 x3 x4
1   2  3  3  2
2   3  4  1  1
3   2  2  4  1
4   4  1  3  1
5   3  4  4  2
6   1  3  2  4
7   2  3  2  3
8   2  4  1  3
9   4  2  1  3
10  3  1  2  1
11  4  2  2  1
12  2  3  1  2
13  2  4  3  4
14  4  4  3  2
15  3  2  1  3
16  2  4  3  1
17  2  4  1  2
18  3  4  4  4
19  3  1  3  3
20  3  3  3  2
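
The as.matrix() approach pools the values of every column before calling factor(), so all columns share one set of levels. If each column should instead be numbered on its own, one possible alternative is to apply factor() column by column with lapply(). The sketch below uses a hypothetical data frame `df` whose columns hold different categories; the names `pooled` and `per_col` are assumptions for illustration:

```r
# Hypothetical data frame whose columns do not share the same categories
df <- data.frame(a = c("B", "C", "B"),
                 b = c("A", "B", "A"))

# Pooled: one set of levels (A, B, C) across the whole data frame
pooled <- df
pooled[] <- as.numeric(factor(as.matrix(df)))

# Per column: each column gets its own levels
per_col <- df
per_col[] <- lapply(df, function(col) as.numeric(factor(col)))

pooled$a    # 2 3 2  (B and C numbered among A, B, C)
per_col$a   # 1 2 1  (B and C numbered among B, C only)
```

The pooled codes are comparable across columns, whereas the per-column codes are not, so the choice depends on whether the columns describe the same set of categories.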

Example 2

# Three character columns, each holding 20 values sampled from three words
y1<-sample(c("Hot","Cold","Bitter"),20,replace=TRUE)
y2<-sample(c("Hot","Cold","Bitter"),20,replace=TRUE)
y3<-sample(c("Hot","Cold","Bitter"),20,replace=TRUE)
df2<-data.frame(y1,y2,y3)
df2

Output

       y1     y2     y3
1  Bitter    Hot   Cold
2  Bitter   Cold    Hot
3  Bitter Bitter   Cold
4    Cold    Hot Bitter
5  Bitter   Cold   Cold
6    Cold    Hot Bitter
7    Cold   Cold   Cold
8     Hot   Cold Bitter
9  Bitter Bitter Bitter
10 Bitter    Hot Bitter
11 Bitter   Cold   Cold
12 Bitter Bitter    Hot
13    Hot Bitter Bitter
14   Cold Bitter   Cold
15   Cold Bitter Bitter
16    Hot Bitter    Hot
17 Bitter   Cold   Cold
18    Hot   Cold Bitter
19    Hot    Hot   Cold
20    Hot Bitter   Cold

Converting the columns in df2 to numeric −

Example

df2[]<-as.numeric(factor(as.matrix(df2)))
df2

Output

   y1 y2 y3
1   1  3  2
2   1  2  3
3   1  1  2
4   2  3  1
5   1  2  2
6   2  3  1
7   2  2  2
8   3  2  1
9   1  1  1
10  1  3  1
11  1  2  2
12  1  1  3
13  3  1  1
14  2  1  2
15  2  1  1
16  3  1  3
17  1  2  2
18  3  2  1
19  3  3  2
20  3  1  2

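Here, the categories are numbered by the alphabetical order of their names (Bitter = 1, Cold = 2, Hot = 3). factor() sorts the whole strings, so categories that share a first letter still receive distinct codes.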
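
When the categories have a natural order that differs from alphabetical order, the levels can be passed to factor() explicitly. A minimal sketch with a hypothetical ordinal vector `size`; the `levels` argument is standard R, while the data and the names `default_codes` and `ordered_codes` are assumptions for illustration:

```r
# Hypothetical ordinal data
size <- c("Medium", "Low", "High", "Low")

# Default: alphabetical levels (High = 1, Low = 2, Medium = 3)
default_codes <- as.numeric(factor(size))
default_codes   # 3 2 1 2

# Explicit levels: Low = 1, Medium = 2, High = 3
ordered_codes <- as.numeric(factor(size, levels = c("Low", "Medium", "High")))
ordered_codes   # 2 1 3 1
```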

raja
Published on 09-Feb-2021 12:00:37