How to filter data frame by categorical variable in R?


To filter data frame by categorical variable in R, we can follow the below steps −

  • Use inbuilt data sets or create a new data set and look at top few rows in the data set.

  • Then, look at the bottom few rows in the data set.

  • Check the data structure.

  • Filter the data by categorical column using split function.

Use inbuilt data set

Let’s consider CO2 data set in base R −

 Live Demo

data(CO2)
head(CO2,10)

On executing, the above script generates the below output(this output will vary on your system due to randomization) −

Grouped Data: uptake ~ conc | Plant
  Plant Type Treatment conc uptake
1 Qn1 Quebec nonchilled 95 16.0
2 Qn1 Quebec nonchilled 175 30.4
3 Qn1 Quebec nonchilled 250 34.8
4 Qn1 Quebec nonchilled 350 37.2
5 Qn1 Quebec nonchilled 500 35.3
6 Qn1 Quebec nonchilled 675 39.2
7 Qn1 Quebec nonchilled 1000 39.7
8 Qn2 Quebec nonchilled 95 13.6
9 Qn2 Quebec nonchilled 175 27.3
10 Qn2 Quebec nonchilled 250 37.1

Look at few bottom rows

Use tail function to look at some bottom rows in CO2 data −

 Live Demo

data(CO2)
tail(CO2,10)

Output

Grouped Data: uptake ~ conc | Plant
    Plant  Type      Treatment conc uptake
75  Mc2 Mississippi chilled    500   12.5
76  Mc2 Mississippi chilled    675   13.7
77  Mc2 Mississippi chilled    1000  14.4
78  Mc3 Mississippi chilled    95    10.6
79  Mc3 Mississippi chilled    175   18.0
80  Mc3 Mississippi chilled    250   17.9
81  Mc3 Mississippi chilled    350   17.9
82  Mc3 Mississippi chilled    500   17.9
83  Mc3 Mississippi chilled    675   18.9
84  Mc3 Mississippi chilled    1000  19.9

Check the data structure

Use str function to check the data structure of data in CO2 −

Example

 Live Demo

data(CO2)
str(CO2)

Output

Classes ‘nfnGroupedData’, ‘nfGroupedData’, ‘groupedData’ and 'data.frame': 84 obs.
of 5 variables:
$ Plant    : Ord.factor w/ 12 levels "Qn1"<"Qn2"<"Qn3"<..: 1 1 1 1 1 1 1 2 2 2 ...
$ Type     : Factor w/ 2 levels "Quebec","Mississippi": 1 1 1 1 1 1 1 1 1 1 ...
$ Treatment: Factor w/ 2 levels "nonchilled","chilled": 1 1 1 1 1 1 1 1 1 1 ...
$ conc     : num 95 175 250 350 500 675 1000 95 175 250 ...
$ uptake   : num 16 30.4 34.8 37.2 35.3 39.2 39.7 13.6 27.3 37.1 ...
- attr(*, "formula")=Class 'formula' language uptake ~ conc | Plant
.. ..- attr(*, ".Environment")=<environment: R_EmptyEnv>
- attr(*, "outer")=Class 'formula' language ~Treatment * Type
.. ..- attr(*, ".Environment")=<environment: R_EmptyEnv>
- attr(*, "labels")=List of 2
..$ x: chr "Ambient carbon dioxide concentration"
..$ y: chr "CO2 uptake rate"
- attr(*, "units")=List of 2
..$ x: chr "(uL/L)"
..$ y: chr "(umol/m^2 s)"

Filter data frame by categorical column

Using split function to filter the data frame CO2 based on Type column −

 Live Demo

data(CO2)
split(CO2,CO2$Type)

Output

$Quebec
Grouped Data: uptake ~ conc | Plant
    Plant Type Treatment conc uptake
1  Qn1 Quebec nonchilled 95    16.0
2  Qn1 Quebec nonchilled 175   30.4
3  Qn1 Quebec nonchilled 250   34.8
4  Qn1 Quebec nonchilled 350   37.2
5  Qn1 Quebec nonchilled 500   35.3
6  Qn1 Quebec nonchilled 675   39.2
7  Qn1 Quebec nonchilled 1000  39.7
8  Qn2 Quebec nonchilled 95    13.6
9  Qn2 Quebec nonchilled 175   27.3
10 Qn2 Quebec nonchilled 250   37.1
11 Qn2 Quebec nonchilled 350   41.8
12 Qn2 Quebec nonchilled 500   40.6
13 Qn2 Quebec nonchilled 675   41.4
14 Qn2 Quebec nonchilled 1000  44.3
15 Qn3 Quebec nonchilled 95    16.2
16 Qn3 Quebec nonchilled 175   32.4
17 Qn3 Quebec nonchilled 250   40.3
18 Qn3 Quebec nonchilled 350   42.1
19 Qn3 Quebec nonchilled 500   42.9
20 Qn3 Quebec nonchilled 675   43.9
21 Qn3 Quebec nonchilled 1000  45.5
22 Qc1 Quebec  chilled     95  14.2
23 Qc1 Quebec  chilled    175  24.1
24 Qc1 Quebec  chilled    250  30.3
25 Qc1 Quebec  chilled    350  34.6
26 Qc1 Quebec  chilled    500  32.5
27 Qc1 Quebec  chilled    675  35.4
28 Qc1 Quebec  chilled   1000  38.7
29 Qc2 Quebec  chilled    95   9.3
30 Qc2 Quebec  chilled   175   27.3
31 Qc2 Quebec  chilled   250   35.0
32 Qc2 Quebec  chilled   350   38.8
33 Qc2 Quebec  chilled   500   38.6
34 Qc2 Quebec  chilled   675   37.5
35 Qc2 Quebec  chilled   1000  42.4
36 Qc3 Quebec  chilled   95    15.1
37 Qc3 Quebec  chilled   175   21.0
38 Qc3 Quebec  chilled   250   38.1
39 Qc3 Quebec  chilled   350   34.0
40 Qc3 Quebec  chilled   500   38.9
41 Qc3 Quebec  chilled   675   39.6
42 Qc3 Quebec  chilled   1000  41.4

$Mississippi
Grouped Data: uptake ~ conc | Plant
    Plant  Type     Treatment  conc uptake
43  Mn1 Mississippi nonchilled 95    10.6
44  Mn1 Mississippi nonchilled 175   19.2
45  Mn1 Mississippi nonchilled 250   26.2
46  Mn1 Mississippi nonchilled 350   30.0
47  Mn1 Mississippi nonchilled 500   30.9
48  Mn1 Mississippi nonchilled 675   32.4
49  Mn1 Mississippi nonchilled 1000  35.5
50  Mn2 Mississippi nonchilled 95    12.0
51  Mn2 Mississippi nonchilled 175   22.0
52  Mn2 Mississippi nonchilled 250   30.6
53  Mn2 Mississippi nonchilled 350   31.8
54  Mn2 Mississippi nonchilled 500   32.4
55  Mn2 Mississippi nonchilled 675   31.1
56  Mn2 Mississippi nonchilled 1000  31.5
57  Mn3 Mississippi nonchilled 95    11.3
58  Mn3 Mississippi nonchilled 175   19.4
59  Mn3 Mississippi nonchilled 250   25.8
60  Mn3 Mississippi nonchilled 350   27.9
61  Mn3 Mississippi nonchilled 500   28.5
62  Mn3 Mississippi nonchilled 675   28.1
63  Mn3 Mississippi nonchilled 1000   27.8
64  Mc1 Mississippi chilled    95     10.5
65  Mc1 Mississippi chilled    175    14.9
66  Mc1 Mississippi chilled    250    18.1
67  Mc1 Mississippi chilled    350    18.9
68  Mc1 Mississippi chilled    500    19.5
69  Mc1 Mississippi chilled    675    22.2
70  Mc1 Mississippi chilled    1000   21.9
71  Mc2 Mississippi chilled    95     7.7
72  Mc2 Mississippi chilled    175    11.4
73  Mc2 Mississippi chilled    250    12.3
74  Mc2 Mississippi chilled    350    13.0
75  Mc2 Mississippi chilled    500    12.5
76  Mc2 Mississippi chilled    675    13.7
77  Mc2 Mississippi chilled   1000    14.4
78  Mc3 Mississippi chilled    95     10.6
79  Mc3 Mississippi chilled    175    18.0
80  Mc3 Mississippi chilled    250    17.9
81  Mc3 Mississippi chilled    350    17.9
82  Mc3 Mississippi chilled    500    17.9
83  Mc3 Mississippi chilled    675    18.9
84  Mc3 Mississippi chilled    1000 19.9

Updated on: 14-Aug-2021

3K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements