How to convert a data frame to a matrix in R if the data frame contains factor variables with string levels?


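All elements of a matrix must be of the same type, so when a data frame that mixes numeric columns with factor columns is converted to a numeric matrix, the factor levels are replaced by their integer codes. These codes follow the order of the factor levels, which by default is the sorted (alphabetical) order of the distinct strings, not just their first character: the level that sorts first gets 1, the next gets 2, and so on. To convert such a data frame, we unlist it, coerce the result to numeric, and reshape it into a matrix with as many rows as the data frame. Note that since R 4.0.0, data.frame() stores strings as character vectors rather than factors unless stringsAsFactors = TRUE is passed, and character columns turn into NA under as.numeric().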
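As a quick check of how the integer codes are assigned, the following minimal sketch builds a factor from three made-up strings (the vector name f is only illustrative). Two of the strings start with the same letter, which shows that the codes depend on the sorted order of the whole level, not on the first character alone.

```r
# Hypothetical strings; two of them share the first letter "E"
f <- factor(c("Education", "Ethnicity", "Age"))

# Levels are the sorted distinct strings
levels(f)       # "Age" "Education" "Ethnicity"

# Integer codes are positions in the level vector
as.integer(f)   # 2 3 1
```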

Example

Consider the data frame below −

x1<-1:10
x2<-10:1
x3<-letters[1:10]
x4<-LETTERS[1:10]
x5<-letters[10:1]
x6<-LETTERS[10:1]
x7<-rnorm(10)
x8<-rnorm(10,0.2)
x9<-rnorm(10,0.5)
x10<-rnorm(10,1)
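# stringsAsFactors=TRUE stores the string columns as factors (the default before R 4.0.0)
df<-data.frame(x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,stringsAsFactors=TRUE)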
str(df)

Output

'data.frame': 10 obs. of 10 variables:
$ x1 : int 1 2 3 4 5 6 7 8 9 10
$ x2 : int 10 9 8 7 6 5 4 3 2 1
$ x3 : Factor w/ 10 levels "a","b","c","d",..: 1 2 3 4 5 6 7 8 9 10
$ x4 : Factor w/ 10 levels "A","B","C","D",..: 1 2 3 4 5 6 7 8 9 10
$ x5 : Factor w/ 10 levels "a","b","c","d",..: 10 9 8 7 6 5 4 3 2 1
$ x6 : Factor w/ 10 levels "A","B","C","D",..: 10 9 8 7 6 5 4 3 2 1
$ x7 : num 0.526 -0.795 1.428 -1.467 -0.237 ...
$ x8 : num 0.0362 0.9085 -0.068 -1.2639 0.9444 ...
$ x9 : num 1.395 0.779 1.508 -1.573 1.69 ...
$ x10: num 1.482 1.758 -1.319 0.54 -0.105 ...

Example

df

Output

x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
1 1 10 a A j J 0.5264481 0.03624433 1.3949372 1.4824588
2 2 9 b B i I -0.7948444 0.90852210 0.7791520 1.7582138
3 3 8 c C h H 1.4277555 -0.06798055 1.5078658 -1.3193274
4 4 7 d D g G -1.4668197 -1.26392176 -1.5731065 0.5404952
5 5 6 e E f F -0.2366834 0.94443582 1.6898534 -0.1053837
6 6 5 f F e E -0.1933380 -1.21039018 -0.2243742 1.4029283
7 7 4 g G d D -0.8497547 0.66706761 0.6679838 1.5689349
8 8 3 h H c C 0.0584655 0.08067989 1.4203352 0.2939167
9 9 2 i I b B -0.8176704 0.66723896 -1.1716048 0.7099094
10 10 1 j J a A -2.0503078 0.69813556 0.9484691 -0.4838781

Converting the data frame df to a matrix −

Example

matrix(as.numeric(unlist(df)),nrow=nrow(df))

Output

[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 10 1 1 10 10 0.5264481 0.03624433 1.3949372
[2,] 2 9 2 2 9 9 -0.7948444 0.90852210 0.7791520
[3,] 3 8 3 3 8 8 1.4277555 -0.06798055 1.5078658
[4,] 4 7 4 4 7 7 -1.4668197 -1.26392176 -1.5731065
[5,] 5 6 5 5 6 6 -0.2366834 0.94443582 1.6898534
[6,] 6 5 6 6 5 5 -0.1933380 -1.21039018 -0.2243742
[7,] 7 4 7 7 4 4 -0.8497547 0.66706761 0.6679838
[8,] 8 3 8 8 3 3 0.0584655 0.08067989 1.4203352
[9,] 9 2 9 9 2 2 -0.8176704 0.66723896 -1.1716048
[10,] 10 1 10 10 1 1 -2.0503078 0.69813556 0.9484691
[,10]
[1,] 1.4824588
[2,] 1.7582138
[3,] -1.3193274
[4,] 0.5404952
[5,] -0.1053837
[6,] 1.4029283
[7,] 1.5689349
[8,] 0.2939167
[9,] 0.7099094
[10,] -0.4838781
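Base R also offers data.matrix(), which converts factor columns to their integer codes and keeps the column names. The sketch below compares the two approaches on a small made-up data frame (the names d, id, grp and val are assumptions for illustration, not part of the example above).

```r
# Small illustrative data frame with an integer, a factor and a numeric column
d <- data.frame(id = 1:3,
                grp = c("b", "a", "c"),
                val = c(0.5, 1.5, 2.5),
                stringsAsFactors = TRUE)

# Approach used in this article: unlist, coerce to numeric, reshape
m1 <- matrix(as.numeric(unlist(d)), nrow = nrow(d))

# Base R helper: factors become integer codes, column names are preserved
m2 <- data.matrix(d)

colnames(m2)                # "id" "grp" "val"
all.equal(m1, unname(m2))   # TRUE: the numeric values agree
```

The values are the same either way; data.matrix() may be more convenient when the column names are needed later.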

Let’s have a look at another example −

Example

y1<-c("Age","Sex","Salary","Education","Ethnicity")
y2<-1:5
y3<-c(24,15,48,72,29)
# stringsAsFactors=TRUE stores y1 as a factor (the default before R 4.0.0)
df_y<-data.frame(y1,y2,y3,stringsAsFactors=TRUE)
df_y

Output

y1 y2 y3
1 Age 1 24
2 Sex 2 15
3 Salary 3 48
4 Education 4 72
5 Ethnicity 5 29

Example

matrix(as.numeric(unlist(df_y)),nrow=5)

Output

[,1] [,2] [,3]
[1,] 1 1 24
[2,] 5 2 15
[3,] 4 3 48
[4,] 2 4 72
[5,] 3 5 29
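In the output above, Education and Ethnicity receive codes 2 and 3 because the levels are sorted alphabetically as whole strings. On R 4.0.0 or later, strings in a data frame are character vectors by default, and the same conversion then yields NA for those columns. The following sketch, using a made-up data frame (d_chr and the other names are hypothetical), shows the problem and one possible way around it: converting the character columns to factors first.

```r
# Made-up data; strings are kept as character here (the R >= 4.0.0 default)
d_chr <- data.frame(y1 = c("Age", "Sex", "Salary"), y2 = 1:3,
                    stringsAsFactors = FALSE)

# Character strings cannot be parsed as numbers, so the y1 column becomes NA
m_na <- suppressWarnings(matrix(as.numeric(unlist(d_chr)), nrow = nrow(d_chr)))

# Convert character columns to factors first, then apply the same approach
d_fac <- d_chr
d_fac[] <- lapply(d_fac, function(col) if (is.character(col)) factor(col) else col)
m_ok <- matrix(as.numeric(unlist(d_fac)), nrow = nrow(d_fac))

m_ok[, 1]   # 1 3 2  (Age = 1, Salary = 2, Sex = 3)
```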
raja
Published on 21-Aug-2020 11:49:54