How to create a subset for a factor level in an R data frame?


In data analysis, we often work with factor variables, each of which has a set of levels. Sometimes we want to subset an R data frame to one specific factor level so that we can analyze only the rows that belong to that level. This can be done with the subset() function.

Example

Consider the following data frame:

> set.seed(99)
> Factor<-rep(c("India","China","USA","UK","Canada"),times=4)
> Percentage<-sample(1:100,20)
> df<-data.frame(Factor,Percentage)
> df
   Factor Percentage
1   India         48
2   China         33
3     USA         44
4      UK         22
5  Canada         62
6   India         32
7   China         13
8     USA         20
9      UK         31
10 Canada         68
11  India          9
12  China         82
13    USA         88
14     UK         30
15 Canada         86
16  India         84
17  China         95
18    USA         14
19     UK          4
20 Canada         78
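
Before subsetting, it can help to confirm how the column is stored and which levels it has. The following is a minimal sketch rather than part of the original example. It assumes R 4.0.0 or later, where data.frame() keeps strings as character vectors by default, so the column is converted explicitly with factor().

```r
# Rebuild the example data frame from above
set.seed(99)
Factor <- rep(c("India", "China", "USA", "UK", "Canada"), times = 4)
Percentage <- sample(1:100, 20)
df <- data.frame(Factor, Percentage)

# Assumption: convert explicitly so the column is a factor in any R version
df$Factor <- factor(df$Factor)

is.factor(df$Factor)  # TRUE after the conversion
levels(df$Factor)     # the distinct levels, in sorted order
nlevels(df$Factor)    # how many levels there are
table(df$Factor)      # how many rows belong to each level
```

The comparison Factor=="India" used with subset() works the same way whether the column is a character vector or a factor, so the conversion only matters when level-aware functions such as levels() are needed.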

Here, the variable Factor has five levels (India, China, USA, UK and Canada). To create a subset of the data frame for each of these levels, pass a condition on Factor to subset() as shown below:

> India<-subset(df,Factor=="India")
> India
   Factor Percentage
1   India         48
6   India         32
11  India          9
16  India         84
> UK<-subset(df,Factor=="UK")
> UK
   Factor Percentage
4      UK         22
9      UK         31
14     UK         30
19     UK          4
> China<-subset(df,Factor=="China")
> China
   Factor Percentage
2   China         33
7   China         13
12  China         82
17  China         95
> USA<-subset(df,Factor=="USA")
> USA
   Factor Percentage
3     USA         44
8     USA         20
13    USA         88
18    USA         14
> Canada<-subset(df,Factor=="Canada")
> Canada
   Factor Percentage
5  Canada         62
10 Canada         68
15 Canada         86
20 Canada         78
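
Writing one subset() call per level becomes repetitive as the number of levels grows. As a hedged alternative, not taken from the example above, base R's split() can produce all per-level subsets at once, and droplevels() can remove the levels that a subset no longer uses. The names by_level, India_only and asia are illustrative.

```r
# Rebuild the example data frame, storing Factor as a factor explicitly
set.seed(99)
df <- data.frame(
  Factor = factor(rep(c("India", "China", "USA", "UK", "Canada"), times = 4)),
  Percentage = sample(1:100, 20)
)

# One data frame per level, returned as a named list
by_level <- split(df, df$Factor)
by_level$India

# subset() keeps all five levels on the result; droplevels() removes unused ones
India_only <- droplevels(subset(df, Factor == "India"))
levels(India_only$Factor)

# Several levels at once with %in%; select keeps only the Percentage column
asia <- subset(df, Factor %in% c("India", "China"), select = Percentage)
nrow(asia)
```

Each element of by_level holds the same rows as the corresponding subset() call, so by_level$UK can stand in for the UK data frame created earlier.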

Updated on: 12-Aug-2020
