How to count the number of duplicate rows in an R data frame?


To count how many times each distinct row occurs in an R data frame, we can convert the data frame into a data.table object with setDT and then group by every column, using the special symbol .N to count the rows in each group. For example, if we have a data frame called df, the command is: setDT(df)[,list(Count=.N),names(df)]. A Count greater than 1 means that the row is duplicated. Note that setDT converts df to a data.table by reference, that is, in place.
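As a minimal sketch of the command above, the following uses a small hand-made data frame so that the counts can be checked by eye. The data frame df and its columns a and b are hypothetical and chosen only for illustration; the by argument is named explicitly here, which is equivalent to passing names(df) in the third position.

```r
library(data.table)

# Hypothetical toy data with known duplicates (not from the examples below)
df <- data.frame(a = c(1, 1, 2, 2, 2, 3),
                 b = c("x", "x", "y", "y", "y", "z"))

# setDT() converts df to a data.table in place;
# .N is the number of rows in each group defined by all columns
counts <- setDT(df)[, list(Count = .N), by = names(df)]
print(counts)
```

Here the row (1, "x") gets a Count of 2, (2, "y") a Count of 3 and (3, "z") a Count of 1, with groups listed in order of first appearance.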

Example 1

Consider the following data frame:

x1<-rpois(20,2)
x2<-rpois(20,2)
df1<-data.frame(x1,x2)
df1

Loading the data.table package:

Example

library(data.table)

Counting the occurrences of each row:

Example

setDT(df1)[,list(Count=.N),names(df1)]

Output

    x1 x2 Count
 1:  4  3     2
 2:  3  3     1
 3:  3  0     2
 4:  2  0     2
 5:  0  4     1
 6:  1  1     3
 7:  0  1     1
 8:  3  2     2
 9:  5  3     1
10:  1  3     1
11:  2  2     1
12:  3  1     2
13:  5  1     1
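
To turn a table of counts like this into a single number, we can keep only the groups with Count greater than 1, or sum the surplus copies. The sketch below assumes a small hypothetical data frame called toy (not df1, whose values are random), and the names repeated and extra_rows are illustrative. The base R function duplicated is used only as a cross-check.

```r
library(data.table)

# Hypothetical toy data: (1, 1) appears three times, (2, 0) twice, (0, 2) once
toy <- data.frame(x1 = c(1, 1, 1, 2, 2, 0),
                  x2 = c(1, 1, 1, 0, 0, 2))

counts <- setDT(toy)[, list(Count = .N), by = names(toy)]

# Distinct row patterns that occur more than once
repeated <- counts[Count > 1]

# Surplus copies: rows beyond the first occurrence of each pattern
extra_rows <- sum(counts$Count - 1)

# Cross-check: duplicated() flags every repeat after the first occurrence
stopifnot(extra_rows == sum(duplicated(toy)))

print(repeated)
print(extra_rows)
```

Which of the two numbers counts as "the number of duplicate rows" depends on whether we want the repeated patterns (2 here) or the extra copies (3 here).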

Example 2

y1<-sample(0:2,20,replace=TRUE)
y2<-sample(0:2,20,replace=TRUE)
df2<-data.frame(y1,y2)
df2

Counting the occurrences of each row:

Example

setDT(df2)[,list(Count=.N),names(df2)]

Output

   y1 y2 Count
1:  2  1     4
2:  2  2     3
3:  0  0     3
4:  0  2     3
5:  1  0     3
6:  1  2     1
7:  1  1     1
8:  0  1     1
9:  2  0     1

Nizamuddin Siddiqui

Updated on: 2021-02-09T11:43:02+05:30
