How to create group names for consecutively duplicate values in an R data frame column?


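A column often contains runs of the same value in consecutive rows, and each such run can be treated as one group. If no value is repeated in the next row, every row ends up in its own group and grouping adds little; it becomes useful when values repeat consecutively. For this purpose, we can use the rleid function of the data.table package, which assigns an increasing integer id to each run, and paste a prefix onto that id to form the group name. Note that a value which reappears after a different value starts a new group rather than rejoining its earlier one. This is shown in the below examples.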
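To see what rleid returns on its own, here is a minimal sketch on a small fixed vector. The vector `v` and the names `ids` and `labels` are made up for illustration, and the data.table package is assumed to be installed.

```r
library(data.table)  # rleid() is provided by the data.table package

# A small fixed vector (hypothetical values chosen for illustration)
v <- c(2, 2, 1, 1, 1, 0, 2)

# rleid() gives every run of consecutive equal values its own integer id
ids <- rleid(v)
ids
# [1] 1 1 2 2 2 3 4

# Pasting a prefix onto the ids turns them into group names
labels <- paste0("Grp", ids)
labels
# [1] "Grp1" "Grp1" "Grp2" "Grp2" "Grp2" "Grp3" "Grp4"
```

The last value, 2, gets its own group (Grp4) even though 2 already appeared in Grp1, because the two runs are not adjacent.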

Example1

Consider the below data frame (the values are sampled at random, so your output will differ) −

> library(data.table)
> x<-sample(0:2,20,replace=TRUE)
> df1<-data.frame(x)
> df1

Output

   x
1  2
2  1
3  2
4  2
5  1
6  0
7  1
8  1
9  1
10 1
11 0
12 0
13 1
14 2
15 1
16 0
17 1
18 0
19 1
20 2

Creating the groups for values in x −

> df1$Grp<-paste0("Grp",rleid(df1$x))
> df1

Output

   x Grp
1  2 Grp1
2  1 Grp2
3  2 Grp3
4  2 Grp3
5  1 Grp4
6  0 Grp5
7  1 Grp6
8  1 Grp6
9  1 Grp6
10 1 Grp6
11 0 Grp7
12 0 Grp7
13 1 Grp8
14 2 Grp9
15 1 Grp10
16 0 Grp11
17 1 Grp12
18 0 Grp13
19 1 Grp14
20 2 Grp15
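If data.table is not available, the same run ids can probably be built with base R's rle function. This is a sketch of an alternative, not the method used in this article; the vector below copies the first six values of x from the output above, and the names `runs`, `ids` and `grp` are illustrative.

```r
# First six values of x from the example output above
x <- c(2, 1, 2, 2, 1, 0)

# rle() returns the length and value of each run of consecutive equal values
runs <- rle(x)

# Repeat each run's index as many times as the run is long
ids <- rep(seq_along(runs$lengths), times = runs$lengths)

# Build the group names in the same way as with rleid()
grp <- paste0("Grp", ids)
grp
# [1] "Grp1" "Grp2" "Grp3" "Grp3" "Grp4" "Grp5"
```

These names match the first six rows of the Grp column above.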

Example2

> y<-sample(0:1,20,replace=TRUE)
> df2<-data.frame(y)
> df2

Output

   y
1  0
2  1
3  0
4  1
5  1
6  1
7  0
8  0
9  0
10 1
11 0
12 0
13 0
14 0
15 0
16 1
17 1
18 1
19 1
20 0

Creating the groups for values in y −

> df2$Category<-paste0("Category#",rleid(df2$y))
> df2

Output

   y Category
1  0 Category#1
2  1 Category#2
3  0 Category#3
4  1 Category#4
5  1 Category#4
6  1 Category#4
7  0 Category#5
8  0 Category#5
9  0 Category#5
10 1 Category#6
11 0 Category#7
12 0 Category#7
13 0 Category#7
14 0 Category#7
15 0 Category#7
16 1 Category#8
17 1 Category#8
18 1 Category#8
19 1 Category#8
20 0 Category#9

Updated on: 05-Mar-2021
