How to count the number of duplicate rows in an R data frame?
To count how many times each row occurs in an R data frame, convert the data frame to a data.table with setDT and group by every column, using the special symbol .N to count the rows in each group. Any row combination with a count greater than 1 is duplicated. For example, for a data frame called df, the command is: setDT(df)[,list(Count=.N),names(df)]. Here Count is only the name given to the result column, not a function, and setDT converts df to a data.table in place.
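As a minimal sketch of the idiom on a small, deterministic input, the snippet below uses a hypothetical five-row data frame (the column names id and grp and their values are assumptions made for illustration, not data from this article). It assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical toy data: rows (1, "a") and (2, "b") each appear twice.
df <- data.frame(id = c(1, 2, 1, 3, 2), grp = c("a", "b", "a", "c", "b"))

# setDT() converts df to a data.table by reference, so df itself changes class.
# Grouping by every column and taking .N gives the number of occurrences of each row.
counts <- setDT(df)[, list(Count = .N), by = names(df)]
print(counts)
```

The result has one line per distinct row, in order of first appearance, so the counts here are 2, 2 and 1.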
Example 1
Consider the data frame below:
x1<-rpois(20,2)
x2<-rpois(20,2)
df1<-data.frame(x1,x2)
df1
Output
   x1 x2
1   4  3
2   3  3
3   3  0
4   3  0
5   2  0
6   2  0
7   0  4
8   1  1
9   4  3
10  0  1
11  3  2
12  5  3
13  1  1
14  3  2
15  1  3
16  2  2
17  3  1
18  1  1
19  5  1
20  3  1
Loading the data.table package:
Example
library(data.table)
Counting how many times each row occurs:
Example
setDT(df1)[,list(Count=.N),names(df1)]
Output
    x1 x2 Count
 1:  4  3     2
 2:  3  3     1
 3:  3  0     2
 4:  2  0     2
 5:  0  4     1
 6:  1  1     3
 7:  0  1     1
 8:  3  2     2
 9:  5  3     1
10:  1  3     1
11:  2  2     1
12:  3  1     2
13:  5  1     1
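The count table lists every distinct row, including those that occur only once. One possible way to keep only the rows that are actually duplicated is to filter on Count > 1, as sketched below. The values of x1 and x2 are typed in from the Example 1 output, because rpois draws different numbers on each run; the names counts1, repeated1 and extra_rows are introduced here for illustration.

```r
library(data.table)

# Values copied from the Example 1 output so the result is reproducible.
x1 <- c(4, 3, 3, 3, 2, 2, 0, 1, 4, 0, 3, 5, 1, 3, 1, 2, 3, 1, 5, 3)
x2 <- c(3, 3, 0, 0, 0, 0, 4, 1, 3, 1, 2, 3, 1, 2, 3, 2, 1, 1, 1, 1)
df1 <- data.frame(x1, x2)

# Occurrences of each distinct (x1, x2) combination.
counts1 <- setDT(df1)[, list(Count = .N), by = names(df1)]

# Keep only the combinations that occur more than once.
repeated1 <- counts1[Count > 1]
print(repeated1)

# Number of rows beyond the first occurrence of each repeated combination.
extra_rows <- sum(repeated1$Count - 1)
print(extra_rows)
```

For this data, six combinations are repeated and seven rows are copies of an earlier row.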
Example 2
y1<-sample(0:2,20,replace=TRUE)
y2<-sample(0:2,20,replace=TRUE)
df2<-data.frame(y1,y2)
df2
Output
   y1 y2
1   2  1
2   2  2
3   0  0
4   2  2
5   0  2
6   2  2
7   1  0
8   0  2
9   1  0
10  2  1
11  1  2
12  0  2
13  1  0
14  0  0
15  2  1
16  1  1
17  0  0
18  0  1
19  2  1
20  2  0
Counting how many times each row occurs:
Example
setDT(df2)[,list(Count=.N),names(df2)]
Output
   y1 y2 Count
1:  2  1     4
2:  2  2     3
3:  0  0     3
4:  0  2     3
5:  1  0     3
6:  1  2     1
7:  1  1     1
8:  0  1     1
9:  2  0     1
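If only a single total is needed rather than a per-row table, base R's duplicated function is an alternative to the data.table approach used in this article. The sketch below types in the Example 2 values, since sample is random; n_dup and n_unique are names chosen here for illustration.

```r
# Values copied from the Example 2 output so the result is reproducible.
y1 <- c(2, 2, 0, 2, 0, 2, 1, 0, 1, 2, 1, 0, 1, 0, 2, 1, 0, 0, 2, 2)
y2 <- c(1, 2, 0, 2, 2, 2, 0, 2, 0, 1, 2, 2, 0, 0, 1, 1, 0, 1, 1, 0)
df2 <- data.frame(y1, y2)

# duplicated() marks every row that repeats an earlier row,
# so the sum is the number of rows beyond each first occurrence.
n_dup <- sum(duplicated(df2))
print(n_dup)

# Number of distinct rows, which matches the number of lines in the count table.
n_unique <- nrow(unique(df2))
print(n_unique)
```

Here n_dup is 11 and n_unique is 9, which together account for all 20 rows.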