How to use a column index instead of a column name with group_by of dplyr in R?


The group_by function of the dplyr package takes the name(s) of the grouping column(s), which are usually categorical. To refer to the same column(s) by position instead of by name, use the group_by_at function, which accepts the column index (or a vector of indices) as its argument.
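As a minimal sketch of the idea, the snippet below groups a small data frame once by name and once by position and compares the two results. The data frame `toy` and its columns `grp` and `val` are hypothetical, made up for illustration, and the `.groups` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical toy data, fixed so the result is reproducible
toy <- data.frame(
  grp = c("A", "B", "A", "C", "B", "A"),
  val = c(10, 20, 30, 40, 50, 60)
)

# Grouping by column name
by_name <- toy %>%
  group_by(grp) %>%
  summarise(n = n(), .groups = "drop")

# Grouping by column position: column 1 is grp
by_index <- toy %>%
  group_by_at(1) %>%
  summarise(n = n(), .groups = "drop")

# Compare the two summaries after converting to plain data frames
print(all.equal(as.data.frame(by_name), as.data.frame(by_index)))
print(by_index)
```

Both calls should give the same counts (3 for A, 2 for B, 1 for C), since position 1 and the name grp refer to the same column.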

Example 1

Consider the following data frame:

x1 <- sample(LETTERS[1:4], 20, replace = TRUE)
x2 <- rpois(20, 2)
df1 <- data.frame(x1, x2)
df1

Output

   x1 x2
1   D  4
2   D  5
3   B  2
4   D  3
5   C  1
6   C  3
7   D  1
8   D  3
9   B  3
10  B  2
11  C  0
12  C  1
13  A  2
14  B  2
15  B  2
16  C  4
17  D  2
18  A  0
19  D  0
20  B  2

Load the dplyr package and group by column index instead of column name:

Example

library(dplyr)
df1 %>% group_by_at(1) %>% summarise(n = n())

Output

`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 4 x 2
  x1        n
  <chr> <int>
1 A         2
2 B         6
3 C         5
4 D         7
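In dplyr 1.0.0 and later, group_by_at is marked as superseded in favour of across() used inside group_by. The sketch below shows that alternative on the same data. It rebuilds df1 from the values printed above rather than calling sample() and rpois(), so the counts are reproducible; the name `counts` is an arbitrary choice, and `.groups = "drop"` is used only to silence the ungrouping message.

```r
library(dplyr)

# df1 rebuilt from the printed output above (fixed values, no random draws)
x1 <- c("D", "D", "B", "D", "C", "C", "D", "D", "B", "B",
        "C", "C", "A", "B", "B", "C", "D", "A", "D", "B")
x2 <- c(4, 5, 2, 3, 1, 3, 1, 3, 3, 2,
        0, 1, 2, 2, 2, 4, 2, 0, 0, 2)
df1 <- data.frame(x1, x2)

# across(1) selects the first column by position inside group_by
counts <- df1 %>%
  group_by(across(1)) %>%
  summarise(n = n(), .groups = "drop")

print(counts)
```

On this data the counts match the group_by_at(1) result: 2 for A, 6 for B, 5 for C and 7 for D.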

Example 2

y1 <- sample(c("Male", "Female"), 20, replace = TRUE)
y2 <- sample(21:50, 20)
df2 <- data.frame(y1, y2)
df2

Output

       y1 y2
1  Female 29
2    Male 43
3  Female 34
4    Male 49
5    Male 28
6  Female 23
7  Female 27
8  Female 31
9  Female 36
10 Female 41
11   Male 25
12 Female 24
13   Male 30
14 Female 22
15 Female 37
16   Male 42
17 Female 47
18   Male 35
19 Female 32
20 Female 21

Group by column index instead of column name to count the rows for each value of y1:

Example

df2 %>% group_by_at(1) %>% summarise(n = n())

Output

`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 2 x 2
  y1         n
  <chr>  <int>
1 Female    13
2 Male       7
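The same function also accepts a vector of positions when more than one grouping column is needed. The following is a sketch under stated assumptions: the data frame `people` and its columns `gender`, `city` and `age` are hypothetical and not part of the examples above, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical data with two categorical columns and one numeric column
people <- data.frame(
  gender = c("Male", "Female", "Female", "Male", "Female", "Male"),
  city   = c("X", "X", "Y", "Y", "X", "X"),
  age    = c(25, 31, 42, 37, 29, 48)
)

# Group by the first two columns, referenced by position,
# then count rows and average the third column within each group
res <- people %>%
  group_by_at(c(1, 2)) %>%
  summarise(n = n(), mean_age = mean(age), .groups = "drop")

print(res)
```

Each distinct combination of the first and second columns becomes one row of the summary, so this toy data yields four groups.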

Updated on: 09-Feb-2021
