How to check if a data frame column contains duplicate values in R?

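To check whether a data frame column contains duplicate values, we can combine the duplicated function with any. The duplicated function returns a logical vector that is TRUE for every element that repeats an earlier one, and any returns TRUE if at least one element of that vector is TRUE. For example, if we have a data frame called df that contains a column ID, we can check whether ID contains duplicate values with the command: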

any(duplicated(df$ID))
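
As a small illustration with fixed values, the sketch below builds a hypothetical data frame df (the ID values are invented for this example, not taken from the article). It also shows base R's anyDuplicated, which returns the position of the first duplicate, or 0 if there is none.

```r
# Hypothetical data frame; the ID values are invented for illustration
df <- data.frame(ID = c(101, 102, 103, 102, 104))

# duplicated() flags each element that repeats an earlier one
dup_flags <- duplicated(df$ID)

# any() collapses the flags into a single TRUE/FALSE answer
has_dups <- any(dup_flags)

# anyDuplicated() gives the position of the first duplicate, or 0 if none
first_dup <- anyDuplicated(df$ID)

print(dup_flags)
print(has_dups)
print(first_dup)
```

Here only the fourth element is flagged, because 102 already appeared in the second position.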

Example 1

Consider the following data frame:

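ID <- 1:20                 # unique identifiers 1 to 20
x <- rpois(20, 1)          # 20 random Poisson draws with mean 1
df1 <- data.frame(ID, x)   # combine both vectors into a data frame
df1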

Output

Checking whether x contains any duplicates:

any(duplicated(df1$x))

[1] TRUE
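
Because rpois draws random values, the exact data frame changes from run to run. The following sketch fixes the seed (the value 1 is an arbitrary choice made here, not something from the original example) and uses table to see which values of x repeat. It also runs the same check on ID, which has no repeats.

```r
set.seed(1)  # arbitrary seed, an assumption made for reproducibility
df1 <- data.frame(ID = 1:20, x = rpois(20, 1))

# Count how often each value of x occurs
counts <- table(df1$x)

# Keep only the values that occur more than once
repeated <- counts[counts > 1]
print(repeated)

# ID runs from 1 to 20 without repeats, so this check returns FALSE
print(any(duplicated(df1$ID)))
```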

Example 2

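S.No <- 1:20                     # serial numbers 1 to 20
y <- round(rnorm(20, 5, 3), 1)   # 20 normal draws (mean 5, sd 3), rounded to 1 decimal
df2 <- data.frame(S.No, y)       # combine both vectors into a data frame
df2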

Output

   S.No    y
1     1  5.1
2     2  5.8
3     3  4.4
4     4 10.1
5     5  3.3
6     6  6.1
7     7  4.8
8     8 12.6
9     9  6.4
10   10  8.7
11   11  1.5
12   12  2.5
13   13  2.1
14   14  8.7
15   15  5.5
16   16  2.0
17   17  2.1
18   18  5.5
19   19  5.4
20   20  3.4

Checking whether y contains any duplicates:

any(duplicated(df2$y))

[1] TRUE
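
To see where the duplicates are rather than only whether they exist, one possible approach is sketched below. It re-enters the y values from the printed output above so the result is reproducible. The names later, in_group and dup_rows are introduced here for illustration only.

```r
# y values copied from the printed output above
y <- c(5.1, 5.8, 4.4, 10.1, 3.3, 6.1, 4.8, 12.6, 6.4, 8.7,
       1.5, 2.5, 2.1, 8.7, 5.5, 2.0, 2.1, 5.5, 5.4, 3.4)
df2 <- data.frame(S.No = 1:20, y)

# Positions of the second and later occurrences of a value
later <- which(duplicated(df2$y))
print(later)  # 14 17 18

# Flag every member of a duplicate group, including the first occurrence
in_group <- duplicated(df2$y) | duplicated(df2$y, fromLast = TRUE)
dup_rows <- df2[in_group, ]
print(dup_rows)
```

The repeated values are 8.7, 2.1 and 5.5, each appearing twice. Scanning with fromLast = TRUE as well picks up the first occurrence of each, which duplicated alone leaves unflagged.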

Nizamuddin Siddiqui

Updated on: 16-Mar-2021
