# How to subset non-duplicate values from an R data frame column?

By default, the duplicated function flags only the second and later occurrences of a value. The first occurrence is not flagged, even though it is also a duplicate of the remaining ones. To keep only the values that occur exactly once, we need to exclude that first occurrence as well.

This can be done by combining two calls to the duplicated function, one scanning from the start and one with fromLast=TRUE, and applying the negation operator to the result, as shown in the examples below.
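As a minimal sketch of this idea, the snippet below uses a small hand-written vector (`v` and `once` are hypothetical names, not part of the examples that follow). It shows what each `duplicated()` call flags and why both directions are needed to drop every occurrence of a repeated value.

```r
# Small illustrative vector (values chosen by hand for this sketch)
v <- c(3, 5, 3, 7, 5, 9)

# duplicated() flags only the second and later occurrences
duplicated(v)
# [1] FALSE FALSE  TRUE FALSE  TRUE FALSE

# fromLast = TRUE scans from the end, so the earlier occurrences are flagged
duplicated(v, fromLast = TRUE)
# [1]  TRUE  TRUE FALSE FALSE FALSE FALSE

# Negating duplicated() alone still keeps one copy of each repeated value
v[!duplicated(v)]
# [1] 3 5 7 9

# Combining both directions drops every occurrence of a repeated value
once <- v[!(duplicated(v) | duplicated(v, fromLast = TRUE))]
once
# [1] 7 9
```

Only 7 and 9 occur exactly once in `v`, so they are the only values that survive the combined filter.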

## Example 1

Following snippet creates a sample data frame −

x<-rpois(20,10)
df1<-data.frame(x)
df1


The following data frame is created:

    x
1  16
2   5
3  17
4   7
5   6
6   7
7  14
8  10
9   7
10 13
11 11
12 15
13  4
14 10
15 16
16 11
17 10
18 11
19  9
20 11

To subset the non-duplicate values from x, excluding the first occurrence of each duplicated value as well, add the following code to the above snippet:

x<-rpois(20,10)
df1<-data.frame(x)
df1$x[!(duplicated(df1$x)|duplicated(df1$x,fromLast=TRUE))]

## Output

For the data frame shown above, this code generates the following output (rpois draws new random values on each run, so rerunning the snippet gives a different result):

[1]  5 17  6 14 13 15  4  9
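The sketch below hard-codes the values printed in the Example 1 table, so the result can be checked by hand. Storing the logical filter in `keep` and keeping whole rows with `drop = FALSE` are additions for illustration, not part of the original example.

```r
# Values copied from the data frame printed in Example 1
x <- c(16, 5, 17, 7, 6, 7, 14, 10, 7, 13,
       11, 15, 4, 10, 16, 11, 10, 11, 9, 11)
df1 <- data.frame(x)

# TRUE for rows whose value of x occurs exactly once
keep <- !(duplicated(df1$x) | duplicated(df1$x, fromLast = TRUE))

# Subset the column as a vector
df1$x[keep]
# [1]  5 17  6 14 13 15  4  9

# Or keep the matching rows as a data frame (drop = FALSE avoids
# collapsing the single-column result to a vector)
kept <- df1[keep, , drop = FALSE]
rownames(kept)
# [1] "2"  "3"  "5"  "7"  "10" "12" "13" "19"
```

Row subsetting is handy when the data frame has other columns that should travel with the non-duplicated values.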


## Example 3

Following snippet creates a sample data frame −

z<-sample(501:510,20,replace=TRUE)
df3<-data.frame(z)
df3

The following data frame is created:

     z
1  509
2  507
3  504
4  508
5  502
6  510
7  508
8  506
9  503
10 508
11 507
12 508
13 502
14 508
15 506
16 510
17 505
18 510
19 510
20 505

To subset the non-duplicate values from z, excluding the first occurrence of each duplicated value as well, add the following code to the above snippet:

z<-sample(501:510,20,replace=TRUE)
df3<-data.frame(z)
df3$z[!(duplicated(df3$z)|duplicated(df3$z,fromLast=TRUE))]


## Output

For the data frame shown above, this code generates the following output (sample draws new random values on each run, so rerunning the snippet gives a different result):

[1] 509 504 503
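As a cross-check, the sketch below hard-codes the values printed in the Example 3 table and compares the result with a count-based approach using `table()`. The `table()` route is an alternative shown here for verification, not the method used in the example. The names `res`, `counts` and `check` are hypothetical.

```r
# Values copied from the data frame printed in Example 3
z <- c(509, 507, 504, 508, 502, 510, 508, 506, 503, 508,
       507, 508, 502, 508, 506, 510, 505, 510, 510, 505)
df3 <- data.frame(z)

# Keep only the values that occur exactly once
res <- df3$z[!(duplicated(df3$z) | duplicated(df3$z, fromLast = TRUE))]
res
# [1] 509 504 503

# Cross-check: count each value and keep those with a count of 1
counts <- table(df3$z)
check <- as.numeric(names(counts)[counts == 1])
check  # sorted by value, unlike res, which keeps the original order
# [1] 503 504 509
```

Both approaches agree on the set of values. The `duplicated()` version preserves the order in which the values appear in the column, while `table()` returns them sorted.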

Updated on: 01-Nov-2021
