 
How to convert multiple numerical variables to factor variables in R?
Sometimes a variable is read with the wrong data type. A common case is a factor variable that is read as numeric because its levels are coded as numbers. If the type is left unchanged, R treats the codes as measurements rather than categories, and any analysis that depends on the grouping can give misleading results. Such a variable should therefore be converted to a factor before analysis. To convert several variables at once, store their names in a character vector and use lapply to apply factor to those columns.
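To see why the type matters, here is a minimal sketch. The vector group and its values are made up for illustration and are not part of the example that follows. It compares what summary() reports for the same codes stored as numbers and as a factor.

```r
# Hypothetical group codes stored as numbers (illustrative data only)
group <- c(1, 2, 3, 1, 2, 3)

# As numeric, summary() reports quantiles and a mean,
# as if the codes were measurements
summary(group)

# As a factor, summary() reports how many observations
# fall in each level
summary(factor(group))
```

The same distinction carries over to modelling functions, which handle a numeric column and a factor column differently.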
Example
Consider the below data frame −
> set.seed(123)
> x1<-rep(c(1,2,3,4,5),times=4)
> x2<-rep(c(5,10,15,20),each=5)
> x3<-sample(1:10,20,replace=TRUE)
> x4<-sample(1:100,20)
> x5<-rep(c(LETTERS[1:5]),times=4)
> x6<-rnorm(20,1)
> x7<-runif(20,2,10)
> df<-data.frame(x1,x2,x3,x4,x5,x6,x7)
> df
   x1 x2 x3  x4 x5          x6       x7
1   1  5  3   9  A  0.18148428 3.501529
2   2  5  3  83  B  1.68493608 8.258354
3   3  5 10  36  C  0.67994358 2.748760
4   4  5  2  78  D -0.31152241 5.734232
5   5  5  6  81  E  0.40039167 6.092044
6   1 10  5  43  A  0.87058931 6.799912
7   2 10  4  76  B  1.88673615 4.662588
8   3 10  6  15  C  0.84860404 5.908904
9   4 10  9  32  D  1.32979120 9.635791
10  5 10 10   7  E -2.22732283 5.863219
11  1 15  5 100  A  0.22820823 9.122802
12  2 15  3  41  B  1.28654857 9.315505
13  3 15  9  74  C -0.22051198 6.869880
14  4 15  9  23  D  1.43455038 5.285518
15  5 15  9  27  E  1.80017687 3.176758
16  1 20  3  60  A  0.83606903 9.482398
17  2 20  8  53  B  2.24291877 4.409831
18  3 20 10  91  C  0.06561494 2.485765
19  4 20  7  84  D  1.39370865 9.581816
20  5 20 10  86  E  1.40363146 7.764770
> str(df)
'data.frame': 20 obs. of 7 variables:
 $ x1: num 1 2 3 4 5 1 2 3 4 5 ...
 $ x2: num 5 5 5 5 5 10 10 10 10 10 ...
 $ x3: int 3 3 10 2 6 5 4 6 9 10 ...
 $ x4: int 9 83 36 78 81 43 76 15 32 7 ...
 $ x5: Factor w/ 5 levels "A","B","C","D",..: 1 2 3 4 5 1 2 3 4 5 ...
 $ x6: num 0.181 1.685 0.68 -0.312 0.4 ...
 $ x7: num 3.5 8.26 2.75 5.73 6.09 ...
Here, only x5 is a factor. (In R 4.0.0 and later, data.frame() keeps character columns as character by default, so x5 would appear as chr unless stringsAsFactors = TRUE is set.) To convert x1, x2, and x3 to factors, proceed as follows −
> Factors<-c("x1","x2","x3")
> df[Factors]<-lapply(df[Factors],factor)
Checking whether x1, x2, and x3 are factor variables or not −
> str(df)
'data.frame': 20 obs. of 7 variables:
 $ x1: Factor w/ 5 levels "1","2","3","4",..: 1 2 3 4 5 1 2 3 4 5 ...
 $ x2: Factor w/ 4 levels "5","10","15",..: 1 1 1 1 1 2 2 2 2 2 ...
 $ x3: Factor w/ 9 levels "2","3","4","5",..: 2 2 9 1 5 4 3 5 8 9 ...
 $ x4: int 9 83 36 78 81 43 76 15 32 7 ...
 $ x5: Factor w/ 5 levels "A","B","C","D",..: 1 2 3 4 5 1 2 3 4 5 ...
 $ x6: num 0.181 1.685 0.68 -0.312 0.4 ...
 $ x7: num 3.5 8.26 2.75 5.73 6.09 ...
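As an alternative to reading the str() output, the conversion can also be checked programmatically. The sketch below is self-contained and rebuilds only the two deterministic columns from the example. The column x6 here is a hypothetical stand-in for a numeric column that is left unchanged, not the random x6 used above.

```r
# Rebuild the two deterministic columns so this runs on its own
df <- data.frame(
  x1 = rep(c(1, 2, 3, 4, 5), times = 4),
  x2 = rep(c(5, 10, 15, 20), each = 5),
  x6 = seq(0.5, 10, by = 0.5)  # hypothetical numeric column, left as-is
)

# Names of the columns to convert
Factors <- c("x1", "x2")
df[Factors] <- lapply(df[Factors], factor)

# TRUE for each converted column, FALSE for the untouched one
sapply(df, is.factor)

# Levels are stored as character labels, ordered by the original numeric values
levels(df$x2)
```

Assigning back into df[Factors] replaces only the named columns, so the rest of the data frame keeps its original types.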
