An R data frame can contain numeric as well as factor variables. Occasionally, though it is rare, the same category is recorded in raw data under different names, such as synonyms or words from another language. For example, a factor variable may have hot and cold as levels, but a native Hindi speaker might record hot as garam, the Hindi word for hot. In such cases we need to combine the equivalent levels into one so that the variable does not carry unnecessary factor levels.
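To make the problem concrete, here is a minimal sketch of the hot/garam situation described above. The vector `temp` and its values are made up for illustration; they do not come from any real data set.

```r
# Hypothetical readings; "garam" is the Hindi word for "hot"
temp <- factor(c("hot", "cold", "garam", "hot", "cold", "garam"))

# R treats "garam" and "hot" as two unrelated levels
levels(temp)   # "cold" "garam" "hot"
nlevels(temp)  # 3

# so the counts for the hot category are split across two levels
table(temp)    # cold 2, garam 2, hot 2
```

Any summary grouped by this factor would report the hot readings in two separate groups until the levels are combined.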

Consider the data frame below:

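    set.seed(109)
    x1 <- rep(c("Sweet", "Meetha", "Bitter", "Salty"), times = 5)
    x2 <- sample(1:100, 20)
    x3 <- rpois(20, 5)
    # stringsAsFactors = TRUE makes x1 a factor (the default is FALSE since R 4.0.0)
    df1 <- data.frame(x1, x2, x3, stringsAsFactors = TRUE)
    df1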

           x1 x2 x3
    1   Sweet  8  4
    2  Meetha 22  6
    3  Bitter 25  3
    4   Salty 85 10
    5   Sweet 90 13
    6  Meetha 10  0
    7  Bitter 55  7
    8   Salty 92  7
    9   Sweet 95  4
    10 Meetha 31  4
    11 Bitter  5  4
    12  Salty 56  6
    13  Sweet 32  4
    14 Meetha 78  6
    15 Bitter 16 10
    16  Salty 48  9
    17  Sweet 49  4
    18 Meetha 35  4
    19 Bitter 37  9
    20  Salty 11  8

Since Meetha is the Hindi word for Sweet, we can merge the Meetha level into Sweet by renaming it, as shown below:

    levels(df1$x1)[levels(df1$x1) == "Meetha"] <- "Sweet"
    df1

           x1 x2 x3
    1   Sweet  8  4
    2   Sweet 22  6
    3  Bitter 25  3
    4   Salty 85 10
    5   Sweet 90 13
    6   Sweet 10  0
    7  Bitter 55  7
    8   Salty 92  7
    9   Sweet 95  4
    10  Sweet 31  4
    11 Bitter  5  4
    12  Salty 56  6
    13  Sweet 32  4
    14  Sweet 78  6
    15 Bitter 16 10
    16  Salty 48  9
    17  Sweet 49  4
    18  Sweet 35  4
    19 Bitter 37  9
    20  Salty 11  8
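Printing the whole data frame is one way to check the result. A more compact check, sketched here on a standalone factor, is to compare `levels()` and `table()` before and after the rename. The name `taste` is hypothetical; it simply holds the same values as the x1 column.

```r
# Standalone factor with the same values as the x1 column
taste <- factor(rep(c("Sweet", "Meetha", "Bitter", "Salty"), times = 5))

levels(taste)  # "Bitter" "Meetha" "Salty" "Sweet"
table(taste)   # five observations in each of the four levels

# Renaming a level to a name that already exists merges the two levels
levels(taste)[levels(taste) == "Meetha"] <- "Sweet"

levels(taste)  # "Bitter" "Sweet" "Salty"
table(taste)   # Bitter 5, Sweet 10, Salty 5
```

The merged level takes the position of the first level that now carries the name, which is why Sweet moves ahead of Salty in the level order.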

Let's look at another example:

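    ID <- 1:20
    Class <- rep(c("First", "Second", "Third", "Fourth", "One"), each = 4)
    # stringsAsFactors = TRUE makes Class a factor (the default is FALSE since R 4.0.0)
    df2 <- data.frame(ID, Class, stringsAsFactors = TRUE)
    df2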

       ID  Class
    1   1  First
    2   2  First
    3   3  First
    4   4  First
    5   5 Second
    6   6 Second
    7   7 Second
    8   8 Second
    9   9  Third
    10 10  Third
    11 11  Third
    12 12  Third
    13 13 Fourth
    14 14 Fourth
    15 15 Fourth
    16 16 Fourth
    17 17    One
    18 18    One
    19 19    One
    20 20    One

Here One is just another label for First, so we merge it into First:

    levels(df2$Class)[levels(df2$Class) == "One"] <- "First"
    df2

       ID  Class
    1   1  First
    2   2  First
    3   3  First
    4   4  First
    5   5 Second
    6   6 Second
    7   7 Second
    8   8 Second
    9   9  Third
    10 10  Third
    11 11  Third
    12 12  Third
    13 13 Fourth
    14 14 Fourth
    15 15 Fourth
    16 16 Fourth
    17 17  First
    18 18  First
    19 19  First
    20 20  First
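When several groups of levels need merging, or the level order should be set at the same time, base R also accepts a named list on the right-hand side of `levels<-`. The following is a minimal sketch of that alternative on a standalone factor built from the same values as the Class column; it is not the method used in the examples above.

```r
# Standalone factor with the same values as the Class column
Class <- factor(rep(c("First", "Second", "Third", "Fourth", "One"), each = 4))

# Each list name is a new level; its value lists the old levels it absorbs.
# Old levels that are not listed anywhere become NA, so every level is named.
levels(Class) <- list(
  First  = c("First", "One"),
  Second = "Second",
  Third  = "Third",
  Fourth = "Fourth"
)

levels(Class)  # "First" "Second" "Third" "Fourth"
table(Class)   # First 8, Second 4, Third 4, Fourth 4
```

The list form also fixes the level order to the order of the list names, which can be convenient when the levels have a natural sequence.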
