How to remove rows of an R data frame that are duplicated in two columns?


A value can repeat many times within a single column, but a row counts as a duplicate in two columns only when the same combination of values in both columns has already appeared in an earlier row. To remove such rows from an R data frame, we can apply the duplicated function to the two columns and keep the rows for which it returns FALSE, as shown in the examples below.

Consider the following data frame:

Example

x1<-sample(LETTERS[1:4],20,replace=TRUE)
x2<-sample(LETTERS[1:4],20,replace=TRUE)
df1<-data.frame(x1,x2)
df1

Output

   x1  x2
1  B   B
2  C   D
3  A   A
4  C   D
5  B   C
6  D   D
7  D   A
8  A   B
9  B   A
10 D   B
11 A   B
12 B   B
13 D   A
14 A   C
15 C   A
16 A   B
17 A   B
18 A   C
19 D   A
20 B   B
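
Because sample draws values at random, running the code above will generally produce different letters from the ones shown. The following is a minimal sketch of how the data could be made reproducible with set.seed; the seed value 123 and the helper name make_df are arbitrary choices for illustration and are not part of the original example.

```r
# Hypothetical helper: builds a two-column data frame from a given seed
make_df <- function(seed) {
  set.seed(seed)  # fix the random number generator state
  x1 <- sample(LETTERS[1:4], 20, replace = TRUE)
  x2 <- sample(LETTERS[1:4], 20, replace = TRUE)
  data.frame(x1, x2)
}

df_a <- make_df(123)  # 123 is an arbitrary seed
df_b <- make_df(123)

# The same seed gives the same data frame on every run
print(identical(df_a, df_b))
```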

Removing the rows of df1 that are duplicated in columns x1 and x2:

Example

df1[!duplicated(df1[c("x1","x2")]),]

Output

   x1 x2
1  B  B
2  C  D
3  A  A
5  B  C
6  D  D
7  D  A
8  A  B
9  B  A
10 D  B
14 A  C
15 C  A
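
To see why this works, it can help to look at the logical vector that duplicated returns before it is negated. The following is a minimal sketch with a small hand-written data frame; the name demo and its values are hypothetical, chosen so that the result does not depend on random sampling.

```r
# Hypothetical data frame with fixed values (not the df1 from the example)
demo <- data.frame(x1 = c("B", "C", "A", "C", "B"),
                   x2 = c("B", "D", "A", "D", "C"))

# TRUE marks a row whose (x1, x2) pair already appeared in an earlier row
dup <- duplicated(demo[c("x1", "x2")])
print(dup)

# Negating the vector keeps only the first occurrence of each pair
demo_unique <- demo[!dup, ]
print(demo_unique)
```

Here only the fourth row repeats an earlier pair (C, D), so it is the only row dropped; the original row names are preserved, which is why the row numbers in the output above have gaps.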

Example

y1<-rpois(20,1)
y2<-rpois(20,1)
y3<-rpois(20,1)
df2<-data.frame(y1,y2,y3)
df2

Output

   y1 y2 y3
1  0  2  1
2  1  1  0
3  0  1  0
4  0  2  2
5  0  2  0
6  0  0  1
7  0  0  0
8  1  0  1
9  3  0  0
10 0  2  0
11 0  2  1
12 1  2  1
13 0  0  1
14 2  2  0
15 3  3  3
16 0  1  1
17 0  0  1
18 1  0  0
19 0  1  1
20 0  1  3

Removing the rows of df2 that are duplicated in columns y1 and y2 (y3 is not compared, so the first row of each y1, y2 pair is kept):

Example

df2[!duplicated(df2[c("y1","y2")]),]

Output

   y1 y2 y3
1  0  2  1
2  1  1  0
3  0  1  0
6  0  0  1
8  1  0  1
9  3  0  0
12 1  2  1
14 2  2  0
15 3  3  3
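
Since duplicated keeps the first occurrence, the y3 value that survives for each y1, y2 pair depends on row order. If the last occurrence is wanted instead, the fromLast argument of duplicated can be set to TRUE. A small sketch, assuming a hypothetical data frame demo2 with fixed values:

```r
# Hypothetical data frame with fixed values (not the df2 from the example)
demo2 <- data.frame(y1 = c(0, 1, 0, 0),
                    y2 = c(2, 1, 2, 2),
                    y3 = c(1, 0, 2, 0))

# Default: keep the first row of each (y1, y2) pair
first_kept <- demo2[!duplicated(demo2[c("y1", "y2")]), ]
print(first_kept)

# fromLast = TRUE scans from the bottom, so the last row of each pair is kept
last_kept <- demo2[!duplicated(demo2[c("y1", "y2")], fromLast = TRUE), ]
print(last_kept)
```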

Updated on: 08-Feb-2021
