How to drop factor levels in a subset of a data frame in R?


There are two ways to drop unused factor levels in a subset of a data frame: the first uses the factor function, and the second uses lapply to apply factor to every factor column.

Example

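> # stringsAsFactors=TRUE is required in R 4.0.0 and later, where strings are no longer converted to factors by default
> df <- data.frame(alphabets=letters[1:10], numbers=1:10, stringsAsFactors=TRUE)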
> levels(df$alphabets)
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
> subdf <- subset(df, numbers <= 6)
> subdf
  alphabets numbers
1         a       1
2         b       2
3         c       3
4         d       4
5         e       5
6         f       6
> levels(subdf$alphabets)
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
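
The leftover levels are not just cosmetic: functions such as table() report every level, including those with no remaining rows. The following is a minimal sketch of that effect, rebuilding the same data so the block runs on its own; the variable name counts is introduced here only for illustration.

```r
# Rebuild the example data; stringsAsFactors = TRUE makes alphabets a factor
df <- data.frame(alphabets = letters[1:10], numbers = 1:10,
                 stringsAsFactors = TRUE)
subdf <- subset(df, numbers <= 6)

# table() lists all original levels; "g" to "j" appear with a count of zero
counts <- table(subdf$alphabets)
print(counts)

# The number of levels and the number of distinct values present now differ
nlevels(subdf$alphabets)
length(unique(subdf$alphabets))
```

Zero-count entries like these can also show up as empty groups in summaries and plots, which is usually the reason for dropping the unused levels.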

Although we have created a subset, the factor variable alphabets still has all 10 levels. The unused levels can be dropped in either of the following ways.

Using factor function

> subdf$alphabets <- factor(subdf$alphabets)
> levels(subdf$alphabets)
[1] "a" "b" "c" "d" "e" "f"

Using lapply

> subdf[] <- lapply(subdf, function(x) if(is.factor(x)) factor(x) else x)
> levels(subdf$alphabets)
[1] "a" "b" "c" "d" "e" "f"
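
Beyond the two approaches above, base R also provides droplevels(), which accepts either a single factor or a whole data frame. The following is a minimal sketch, assuming the same example data (rebuilt here so the block is self-contained); subdf_clean is a name chosen only for this illustration.

```r
# Rebuild the example data; stringsAsFactors = TRUE makes alphabets a factor
df <- data.frame(alphabets = letters[1:10], numbers = 1:10,
                 stringsAsFactors = TRUE)
subdf <- subset(df, numbers <= 6)

# On a data frame, droplevels() drops unused levels from every factor column
# and leaves non-factor columns unchanged
subdf_clean <- droplevels(subdf)
levels(subdf_clean$alphabets)

# On a single factor, it returns that factor with unused levels removed
levels(droplevels(subdf$alphabets))
```

Because droplevels() returns a new object rather than modifying its argument, subdf keeps its original 10 levels until the result is assigned back to it.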

Updated on: 06-Jul-2020
