How to find the count of duplicate rows that occur more than n times in an R data frame?


To find the count of duplicate rows that occur more than n times in an R data frame, we can follow the steps below −

First, create a data frame.
Then, count the duplicate rows that occur more than a certain number of times using the group_by_all, count, and filter functions of the dplyr package.

Create the data frame

Let's create a data frame as shown below −

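# Two columns of 30 random draws each from a Poisson distribution with mean 1
x <- rpois(30, 1)
y <- rpois(30, 1)
df <- data.frame(x, y)
# Print the data frame
df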

On executing, the above script prints a data frame with 30 rows and two columns, x and y (the values will vary on your system due to randomization).
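If a repeatable example is needed, the random number generator can be seeded before the values are drawn. The following is a minimal sketch of that idea; the seed value 123 is an arbitrary assumption and does not come from the original script.

```r
# Seed the random number generator so the same values are drawn on every run.
# The seed value 123 is arbitrary (an assumption, not part of the original script).
set.seed(123)
x <- rpois(30, 1)
y <- rpois(30, 1)
df <- data.frame(x, y)

# Inspect the first few rows and the dimensions of the data frame
head(df)
dim(df)
```

Any integer works as the seed; using the same one again reproduces the same data frame.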

Count the duplicate rows that occur more than a certain number of times

Load the dplyr package and use the group_by_all, count, and filter functions to count the duplicate rows that occur more than 2 times −

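library(dplyr)

# Two columns of 30 random Poisson draws each; the values differ on every run
x <- rpois(30, 1)
y <- rpois(30, 1)
df <- data.frame(x, y)

# Group by every column, count the rows in each group, keep groups with more than 2 rows
df %>% group_by_all() %>% count() %>% filter(n > 2)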

Output

# A tibble: 7 x 3
# Groups:   x, y [7]
      x     y     n
  <int> <int> <int>
1     0     0     4
2     0     1     3
3     0     2     3
4     1     0     4
5     1     1     3
6     1     2     4
7     2     1     3
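Because the random data changes on every run, the result is hard to verify by eye. The sketch below applies the same idea to a small hand-built data frame so the counts can be checked by hand. The data values and the min_count variable are hypothetical, introduced only for illustration. The sketch also uses count(across(everything())) in place of group_by_all(), since group_by_all() is marked as superseded in dplyr 1.0.0 and later; this assumes such a dplyr version is installed.

```r
library(dplyr)

# Hand-built data (hypothetical values) so the result is the same on every run:
# (0, 1) occurs 3 times, (1, 0) twice, (2, 3) 4 times, (1, 1) once
df <- data.frame(
  x = c(0, 0, 0, 1, 1, 2, 2, 2, 2, 1),
  y = c(1, 1, 1, 0, 0, 3, 3, 3, 3, 1)
)

# Threshold (hypothetical name): keep row patterns that occur more than this many times
min_count <- 2

dup_counts <- df %>%
  count(across(everything())) %>%  # one row per distinct (x, y) pair, count in column n
  filter(n > min_count)            # keep pairs seen more than min_count times

dup_counts
```

With this data only the pairs (0, 1) and (2, 3) remain, with counts 3 and 4. Changing min_count adjusts the threshold without touching the rest of the pipeline.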

Nizamuddin Siddiqui

Updated on: 2021-08-14T07:51:33+05:30
