How to create a subset based on levels of a character column in R?

In R, a column that holds string values is usually stored as either the character type or the factor type. For example, a column Group with four unique values A, B, C, and D can be a character column or a factor with four levels. To keep only the rows whose value in such a column matches a chosen set of values, use the subset function together with the %in% operator. Check out the examples below.

Consider the below data frame −

Example

set.seed(888)
Grp<-sample(c("A","B","C"),20,replace=TRUE)
Age<-sample(21:50,20)
df1<-data.frame(Grp,Age)
df1

Output

str(df1)
'data.frame': 20 obs. of 2 variables:
$ Grp: chr "A" "C" "C" "C" ...
$ Age: int 35 40 48 46 36 33 47 45 43 37 ...
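
Before subsetting, it can help to list the distinct values that the column actually contains. The following is a minimal sketch on a small hypothetical data frame named demo (an assumption for illustration, not the article's df1). It shows that unique() works directly on a character column, while levels() returns NULL until the column is converted to a factor.

```r
# Hypothetical demo data (assumption for illustration, not the article's df1)
demo <- data.frame(Grp = c("A", "C", "C", "B", "A"),
                   Age = c(35L, 40L, 48L, 46L, 36L),
                   stringsAsFactors = FALSE)

# The column is stored as character, not factor
grp_class <- class(demo$Grp)

# Distinct values of a character column, sorted for readability
grp_values <- sort(unique(demo$Grp))

# A character vector has no levels attribute, so levels() gives NULL
chr_levels <- levels(demo$Grp)

# After conversion to factor, levels() lists the distinct values
fct_levels <- levels(factor(demo$Grp))

print(grp_class)
print(grp_values)
print(fct_levels)
```

Either grp_values or fct_levels can then serve as the list of candidates to pass to %in%.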

Taking subset of df1 based on Grp column values A and C −

Example

subset(df1, Grp %in% c("A","C"))

Output

Let’s have a look at another example −

Example

Class<-sample(c("First","Second","Third","Fourth"),20,replace=TRUE)
Score<-sample(1:10,20,replace=TRUE)
df2<-data.frame(Class,Score)
df2

Output

   Class  Score
1  First   10
2  First   3
3  First   1
4  First   7
5  First   1
6  Third   4
7  First   3
8  First   3
9  Second  2
10 First   8
11 Fourth  1
12 Third   6
13 First   6
14 Second  1
15 First   8
16 Fourth  4
17 Third   7
18 Fourth  4
19 Third   7
20 Fourth  1

str(df2)
'data.frame': 20 obs. of 2 variables:
$ Class: chr "First" "Third" "Second" "First" ...
$ Score: int 1 4 9 8 9 10 2 8 5 8 ...

Taking subset of df2 based on Class column values First and Fourth −

Example

subset(df2, Class %in% c("First","Fourth"))

Output

   Class  Score
1  First   1
4  First   8
5  First   9
6  Fourth  10
7  Fourth  2
9  Fourth  5
10 Fourth  8
11 Fourth  8
13 Fourth  7
14 Fourth  10
15 First   7
16 Fourth  10
17 Fourth  4
19 First   2
20 First   10
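
The same filter can also be written with logical indexing, and it can be negated to exclude values instead of keeping them. The sketch below uses a small hypothetical data frame named scores (an assumption for illustration, not the article's df2). It also shows one point worth knowing when the column is a factor rather than character: the subset keeps the unused levels until droplevels() is called.

```r
# Hypothetical demo data (assumption for illustration, not the article's df2)
scores <- data.frame(Class = c("First", "Third", "Second", "Fourth", "First"),
                     Score = c(1L, 4L, 9L, 8L, 9L),
                     stringsAsFactors = FALSE)

keep <- c("First", "Fourth")

# Keep rows whose Class is in 'keep', written two equivalent ways
by_subset  <- subset(scores, Class %in% keep)
by_bracket <- scores[scores$Class %in% keep, ]

# Negate the condition to drop those rows instead
excluded <- subset(scores, !(Class %in% keep))

# With a factor column, the subset still carries all original levels
scores_f <- transform(scores, Class = factor(Class))
sub_f    <- subset(scores_f, Class %in% keep)
levels_before <- nlevels(sub_f$Class)              # 4
levels_after  <- nlevels(droplevels(sub_f)$Class)  # 2

print(by_subset)
print(excluded)
```

Dropping unused levels matters mainly for later steps such as table() or plotting, where empty levels would otherwise still appear.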

Nizamuddin Siddiqui

Updated on: 2020-10-09T15:22:03+05:30
