How to subset factor columns in an R data frame?
To subset the factor columns of an R data frame, first build a logical vector that flags them by applying is.factor to every column with sapply, then pass that vector to the single-square-bracket subsetting operator. For example, if a data frame df has three columns x, y, and z, of which x and y are factors, then Factors<-sapply(df,is.factor) followed by df[,Factors] returns only the columns x and y.
Example
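Consider the following data frame. The values come from sample(), so they will differ from run to run unless a seed is set:
# Two factor columns (x1, x2) and one integer column (x3)
x1 <- as.factor(sample(LETTERS[1:3], 20, replace = TRUE))
x2 <- as.factor(sample(c("GRP1", "GRP2", "GRP3", "GRP4", "GRP5"), 20, replace = TRUE))
x3 <- sample(1:10, 20, replace = TRUE)
df1 <- data.frame(x1, x2, x3)
df1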
Output
   x1   x2 x3
1   A GRP1  2
2   B GRP1  7
3   B GRP3  1
4   A GRP4  8
5   B GRP2  8
6   A GRP3  6
7   C GRP1  8
8   B GRP3  9
9   B GRP5  1
10  C GRP3  8
11  A GRP3  1
12  C GRP1  1
13  B GRP1 10
14  C GRP1  7
15  C GRP3 10
16  C GRP2  4
17  C GRP2  1
18  B GRP1  2
19  C GRP3 10
20  A GRP2  3
Create a logical vector with sapply that flags the factor columns, for use inside single square brackets:
Example
Factors <- sapply(df1, is.factor)
Factors
Output
   x1    x2    x3
 TRUE  TRUE FALSE
Extract the factor columns by using the logical vector as the column index:
Example
Factors_df1 <- df1[, Factors]
Factors_df1
Output
   x1   x2
1   A GRP1
2   B GRP1
3   B GRP3
4   A GRP4
5   B GRP2
6   A GRP3
7   C GRP1
8   B GRP3
9   B GRP5
10  C GRP3
11  A GRP3
12  C GRP1
13  B GRP1
14  C GRP1
15  C GRP3
16  C GRP2
17  C GRP2
18  B GRP1
19  C GRP3
20  A GRP2
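One caveat with the bracket form: when the logical vector selects exactly one column, df[, index] simplifies the result to a plain vector instead of a data frame. The sketch below illustrates this with a small made-up data frame (df_one, grp, val and is_fac are hypothetical names, not part of the example above) and shows how drop = FALSE keeps the data frame structure.

```r
# Hypothetical data frame with exactly one factor column
df_one <- data.frame(
  grp = factor(c("A", "B", "A")),
  val = c(1.5, 2.5, 3.5)
)

# Logical vector flagging the factor columns
is_fac <- sapply(df_one, is.factor)

# With a single selected column, [ , ] simplifies to a factor vector
as_vector <- df_one[, is_fac]
class(as_vector)   # "factor"

# drop = FALSE keeps the result as a one-column data frame
as_df <- df_one[, is_fac, drop = FALSE]
class(as_df)       # "data.frame"
```

If later code expects a data frame (for example, it calls names() or ncol() on the result), adding drop = FALSE is usually the safer default.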
Let's have a look at another example:
Example
# Two factor columns (Salary_Grp, Gender) and one integer column (Rating)
Salary_Grp <- as.factor(sample(c("20-30", "31-40", "41-50"), 20, replace = TRUE))
Gender <- as.factor(sample(c("Male", "Female"), 20, replace = TRUE))
Rating <- sample(0:10, 20, replace = TRUE)
df2 <- data.frame(Salary_Grp, Gender, Rating)
df2
Output
   Salary_Grp Gender Rating
1       20-30   Male      7
2       20-30 Female      8
3       31-40   Male      5
4       41-50   Male      7
5       41-50   Male      6
6       20-30   Male      7
7       20-30 Female      0
8       20-30   Male      5
9       31-40 Female      2
10      20-30   Male      7
11      31-40   Male      8
12      31-40 Female      4
13      20-30   Male      9
14      20-30 Female      5
15      31-40   Male      0
16      20-30 Female      9
17      41-50 Female     10
18      31-40 Female      1
19      31-40   Male      5
20      20-30 Female      3
Example
Factors_df2 <- sapply(df2, is.factor)
Factors_df2
Output
Salary_Grp     Gender     Rating
      TRUE       TRUE      FALSE
Example
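# Note: this reuses the name Factors_df2, replacing the logical vector with the subsetted data frame
Factors_df2 <- df2[, Factors_df2]
Factors_df2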
Output
   Salary_Grp Gender
1       20-30   Male
2       20-30 Female
3       31-40   Male
4       41-50   Male
5       41-50   Male
6       20-30   Male
7       20-30 Female
8       20-30   Male
9       31-40 Female
10      20-30   Male
11      31-40   Male
12      31-40 Female
13      20-30   Male
14      20-30 Female
15      31-40   Male
16      20-30 Female
17      41-50 Female
18      31-40 Female
19      31-40   Male
20      20-30 Female
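As an alternative to the two-step sapply approach, base R's Filter() can select the factor columns in a single call, since a data frame is a list of columns. This is a minimal sketch rather than the method used in this article; the data frame df_demo and its fixed values are made up for illustration so that the result is reproducible.

```r
# Hypothetical data frame mirroring the structure of df2, with fixed values
df_demo <- data.frame(
  Salary_Grp = factor(c("20-30", "31-40", "41-50")),
  Gender     = factor(c("Male", "Female", "Male")),
  Rating     = c(7L, 8L, 5L)
)

# Filter() keeps the columns for which the predicate returns TRUE
factor_cols <- Filter(is.factor, df_demo)
names(factor_cols)   # "Salary_Grp" "Gender"

# Bracket form for comparison; vapply() enforces one logical value per column
keep <- vapply(df_demo, is.factor, logical(1))
same <- isTRUE(all.equal(factor_cols, df_demo[, keep, drop = FALSE]))
```

Filter() always returns a data frame here, even when only one column qualifies, so it avoids the need for drop = FALSE.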