
How to subset factor columns in an R data frame?
Factor columns can be subset in two steps. First, apply is.factor to every column with sapply, which returns a logical vector that is TRUE for each factor column. Then pass that vector to the single square bracket subsetting operator. For example, if a data frame df contains three columns x, y and z, of which x and y are factors, then Factors<-sapply(df,is.factor) followed by df[,Factors] returns only the factor columns of df.
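The description above can be tried on a small, fixed data frame before moving to the randomly generated examples. This is a minimal sketch: the column values are made up for illustration, and only the names df, x, y, z and Factors come from the text.

```r
# Hypothetical data: x and y are factors, z is numeric (values are illustrative).
df <- data.frame(
  x = factor(c("A", "B", "A", "C")),
  y = factor(c("GRP1", "GRP2", "GRP1", "GRP3")),
  z = c(2, 7, 1, 8)
)

# One logical value per column: TRUE where the column is a factor.
Factors <- sapply(df, is.factor)

# Keep only the columns flagged TRUE.
factor_cols <- df[, Factors]

# Inspect the structure of the result.
str(factor_cols)
```

Because the data are fixed rather than sampled, the result is the same on every run, which makes it easier to check that only x and y are kept.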
Example
Consider the following data frame:
# sample() draws at random, so the values differ between runs unless set.seed() is called first
x1<-as.factor(sample(LETTERS[1:3],20,replace=TRUE))
x2<-as.factor(sample(c("GRP1","GRP2","GRP3","GRP4","GRP5"),20,replace=TRUE))
x3<-sample(1:10,20,replace=TRUE)
df1<-data.frame(x1,x2,x3)
df1
Output
   x1   x2 x3
1   A GRP1  2
2   B GRP1  7
3   B GRP3  1
4   A GRP4  8
5   B GRP2  8
6   A GRP3  6
7   C GRP1  8
8   B GRP3  9
9   B GRP5  1
10  C GRP3  8
11  A GRP3  1
12  C GRP1  1
13  B GRP1 10
14  C GRP1  7
15  C GRP3 10
16  C GRP2  4
17  C GRP2  1
18  B GRP1  2
19  C GRP3 10
20  A GRP2  3
Use sapply with is.factor to create a logical vector that marks which columns are factors:
Example
Factors<-sapply(df1,is.factor)
Factors
Output
   x1    x2    x3
 TRUE  TRUE FALSE
Extracting factor columns −
Example
Factors_df1<-df1[,Factors]
Factors_df1
Output
   x1   x2
1   A GRP1
2   B GRP1
3   B GRP3
4   A GRP4
5   B GRP2
6   A GRP3
7   C GRP1
8   B GRP3
9   B GRP5
10  C GRP3
11  A GRP3
12  C GRP1
13  B GRP1
14  C GRP1
15  C GRP3
16  C GRP2
17  C GRP2
18  B GRP1
19  C GRP3
20  A GRP2
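One detail worth checking when using the df[,logical] form: if only one column is a factor, single square brackets with a comma simplify the result to a plain factor vector instead of a data frame. The sketch below illustrates this with a hypothetical data frame df_one (not part of the original example) and shows the drop=FALSE argument, which keeps the data frame structure.

```r
# Hypothetical data frame with exactly one factor column.
df_one <- data.frame(
  grp = factor(c("A", "B", "A")),
  val = c(1.5, 2.5, 3.5)
)

# Logical vector marking the factor columns.
is_fac <- sapply(df_one, is.factor)

# With a single selected column the result is simplified to a vector.
as_vector <- df_one[, is_fac]

# drop = FALSE keeps a one-column data frame.
as_frame <- df_one[, is_fac, drop = FALSE]

class(as_vector)
class(as_frame)
```

If later code expects a data frame (for example, it calls names() or ncol() on the result), passing drop=FALSE is the safer choice.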
Let’s have a look at another example −
Example
Salary_Grp<-as.factor(sample(c("20-30","31-40","41-50"),20,replace=TRUE))
Gender<-as.factor(sample(c("Male","Female"),20,replace=TRUE))
Rating<-sample(0:10,20,replace=TRUE)
df2<-data.frame(Salary_Grp,Gender,Rating)
df2
Output
   Salary_Grp Gender Rating
1       20-30   Male      7
2       20-30 Female      8
3       31-40   Male      5
4       41-50   Male      7
5       41-50   Male      6
6       20-30   Male      7
7       20-30 Female      0
8       20-30   Male      5
9       31-40 Female      2
10      20-30   Male      7
11      31-40   Male      8
12      31-40 Female      4
13      20-30   Male      9
14      20-30 Female      5
15      31-40   Male      0
16      20-30 Female      9
17      41-50 Female     10
18      31-40 Female      1
19      31-40   Male      5
20      20-30 Female      3
Example
Factors_df2<-sapply(df2,is.factor)
Factors_df2
Output
Salary_Grp     Gender     Rating
      TRUE       TRUE      FALSE
Example
Factors_df2<-df2[,Factors_df2]
Factors_df2
Output
   Salary_Grp Gender
1       20-30   Male
2       20-30 Female
3       31-40   Male
4       41-50   Male
5       41-50   Male
6       20-30   Male
7       20-30 Female
8       20-30   Male
9       31-40 Female
10      20-30   Male
11      31-40   Male
12      31-40 Female
13      20-30   Male
14      20-30 Female
15      31-40   Male
16      20-30 Female
17      41-50 Female
18      31-40 Female
19      31-40   Male
20      20-30 Female
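The two-step approach can also be written as a single expression. The sketch below is an alternative, not the method used in the examples above: it shows list-style indexing (no comma) and base R's Filter function on a small, fixed stand-in for df2 (the three rows are made up for illustration).

```r
# Hypothetical three-row stand-in for df2, with fixed values.
df_small <- data.frame(
  Salary_Grp = factor(c("20-30", "31-40", "41-50")),
  Gender = factor(c("Male", "Female", "Male")),
  Rating = c(7, 8, 5)
)

# List-style indexing: no comma, so the result is always a data frame.
via_list <- df_small[sapply(df_small, is.factor)]

# Filter() keeps the columns for which is.factor returns TRUE.
via_filter <- Filter(is.factor, df_small)

names(via_list)
names(via_filter)
```

Both forms select columns without the comma, so they return a data frame even when only one column is a factor.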