How to subset non-duplicate values from an R data frame column?

By default, R treats only the second and later occurrences of a value as duplicates. The first occurrence, however, is just as much a duplicate of the ones that follow it, so we might want to exclude it as well and keep only the values that occur exactly once.

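Values that occur only once in an R data frame column can be extracted with the duplicated function and the negation operator (!). The function is called twice, once in the default direction and once with fromLast=TRUE, and the combined result is negated, as shown in the examples below.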
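Before moving to random data, the logic can be traced on a small fixed vector. The sketch below is only illustrative: the vector v and the names repeated and unique_once are assumptions introduced here, not part of the examples that follow.

```r
# Fixed sample values (an assumption for illustration)
v <- c(3, 7, 3, 9, 7, 5)

# duplicated() flags only the second and later occurrences
duplicated(v)
# [1] FALSE FALSE  TRUE FALSE  TRUE FALSE

# Scanning from the end flags the earlier occurrences instead
duplicated(v, fromLast = TRUE)
# [1]  TRUE  TRUE FALSE FALSE FALSE FALSE

# Combining both flags every value that occurs more than once
repeated <- duplicated(v) | duplicated(v, fromLast = TRUE)

# Negating the combined flag keeps the values that occur exactly once
unique_once <- v[!repeated]
unique_once
# [1] 9 5
```

Only 9 and 5 survive, because 3 and 7 each appear twice and both of their occurrences are flagged.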

Example 1

Following snippet creates a sample data frame −

# Draw 20 random values from a Poisson distribution with mean 10
x <- rpois(20, 10)
df1 <- data.frame(x)
df1

Printing df1 displays the data frame. Because rpois generates random values, its contents differ on each run.

To subset the values of x that occur only once, excluding the first occurrence of every repeated value as well, add the following code to the above snippet −

# Reuses df1 from the snippet above; calling rpois() again would draw new values
df1$x[!(duplicated(df1$x) | duplicated(df1$x, fromLast = TRUE))]

Output

Executing all the above snippets as a single program generates output such as the following −

[1] 5 17 6 14 13 15 4 9
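Since the data is random, the values above will not reappear on another run. One way to make the example repeatable is to fix the seed first, and the result can then be cross-checked against a frequency table. This is a sketch under assumptions: the seed 123 is arbitrary, and the names once, counts and check are introduced here for illustration.

```r
# set.seed() makes the random draw repeatable; 123 is an arbitrary choice
set.seed(123)
x <- rpois(20, 10)
df1 <- data.frame(x)

# Keep the values of x that occur exactly once
once <- df1$x[!(duplicated(df1$x) | duplicated(df1$x, fromLast = TRUE))]

# Cross-check with table(): the values whose count is 1
counts <- table(df1$x)
check <- as.numeric(names(counts)[counts == 1])

# Both approaches should agree once the order is ignored
setequal(once, check)
```

The duplicated approach keeps the values in their original order, whereas table sorts them, which is why the comparison uses setequal rather than identical.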

Example 2

Following snippet creates a sample data frame −

# Sample 20 values from 1 to 10 with replacement
y <- sample(1:10, 20, replace = TRUE)
df2 <- data.frame(y)
df2

Printing df2 displays the data frame. Because sample draws values at random, its contents differ on each run.

To subset the values of y that occur only once, excluding the first occurrence of every repeated value as well, add the following code to the above snippet −

# Reuses df2 from the snippet above; calling sample() again would draw new values
df2$y[!(duplicated(df2$y) | duplicated(df2$y, fromLast = TRUE))]

Output

Executing all the above snippets as a single program generates output such as the following −

[1] 8 4

Example 3

Following snippet creates a sample data frame −

# Sample 20 values from 501 to 510 with replacement
z <- sample(501:510, 20, replace = TRUE)
df3 <- data.frame(z)
df3

Printing df3 displays the data frame. Because sample draws values at random, its contents differ on each run.

To subset the values of z that occur only once, excluding the first occurrence of every repeated value as well, add the following code to the above snippet −

# Reuses df3 from the snippet above; calling sample() again would draw new values
df3$z[!(duplicated(df3$z) | duplicated(df3$z, fromLast = TRUE))]

Output

Executing all the above snippets as a single program generates output such as the following −

[1] 509 504 503
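The three examples return a plain vector. If the whole rows are needed instead, the same condition could be wrapped in a small function. The sketch below is one possible way to do this: keep_single_occurrence is a hypothetical helper, not a base R function, and the data frame df with its id column is made up for illustration.

```r
# Hypothetical helper (not part of base R): keep the rows of `data` whose
# value in column `col` occurs exactly once
keep_single_occurrence <- function(data, col) {
  values <- data[[col]]
  repeated <- duplicated(values) | duplicated(values, fromLast = TRUE)
  # drop = FALSE keeps the result a data frame even with a single column
  data[!repeated, , drop = FALSE]
}

# Small fixed data frame, assumed for illustration
df <- data.frame(z = c(501, 504, 501, 509, 504, 503),
                 id = 1:6)

result <- keep_single_occurrence(df, "z")
result
```

Here result holds rows 4 and 6, whose z values 509 and 503 each appear only once, together with their id values.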

Nizamuddin Siddiqui

Updated on: 01-Nov-2021
