How to remove duplicate rows and sort based on a numerical column in an R data frame?


If an R data frame contains duplicate rows, we can remove them by passing the data frame to the unique function. To sort such a data frame by a numerical column, first find the unique rows, then use the order function on that column to index the rows, as shown in the examples below.

Example

Consider the below data frame −

x1<-rep(c(2,7,1,5),5)
x2<-rep(LETTERS[1:4],5)
df1<-data.frame(x1,x2)
df1

Output

  x1 x2
1  2 A
2  7 B
3  1 C
4  5 D
5  2 A
6  7 B
7  1 C
8  5 D
9  2 A
10 7 B
11 1 C
12 5 D
13 2 A
14 7 B
15 1 C
16 5 D
17 2 A
18 7 B
19 1 C
20 5 D

Finding unique rows of df1 −

Example

df1<-unique(df1)
df1

Output

 x1 x2
1 2 A
2 7 B
3 1 C
4 5 D
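
To see which rows unique drops, it can help to look at the base R duplicated function, which flags every row that repeats an earlier one. The following is a minimal sketch, not part of the original example; the names df_dup, is_dup and df_first are illustrative only.

```r
# Rebuild the example data frame so this block runs on its own
x1 <- rep(c(2, 7, 1, 5), 5)
x2 <- rep(LETTERS[1:4], 5)
df_dup <- data.frame(x1, x2)

# duplicated() returns TRUE for each row that repeats an earlier row
is_dup <- duplicated(df_dup)

# Keeping only the rows that are not flagged retains the first occurrence of each row
df_first <- df_dup[!is_dup, ]

# Compare with the result of unique()
all.equal(df_first, unique(df_dup))
```

Both approaches keep the first occurrence of each distinct row, which is why the row names 1 to 4 survive in the output above.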

Ordering df1 based on x1 −

Example

df1[order(df1$x1),]

Output

 x1 x2
3 1 C
1 2 A
4 5 D
2 7 B
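
Note that the sorted output keeps the original row names (3, 1, 4, 2). If a descending sort or renumbered rows are preferred, something like the sketch below could be used. It relies on the decreasing argument of order and on assigning NULL to rownames; the names df_unique and df_desc are illustrative only.

```r
# Rebuild the de-duplicated data frame so this block runs on its own
x1 <- rep(c(2, 7, 1, 5), 5)
x2 <- rep(LETTERS[1:4], 5)
df_unique <- unique(data.frame(x1, x2))

# Sort from the largest to the smallest value of x1
df_desc <- df_unique[order(df_unique$x1, decreasing = TRUE), ]

# Renumber the rows from 1 to n instead of keeping the original row names
rownames(df_desc) <- NULL
df_desc
```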

Example

y1<-rep(c(501,278,357,615),5)
y2<-rep(c("G1","G2","G3","G4"),5)
df2<-data.frame(y1,y2)
df2

Output

    y1 y2
1  501 G1
2  278 G2
3  357 G3
4  615 G4
5  501 G1
6  278 G2
7  357 G3
8  615 G4
9  501 G1
10 278 G2
11 357 G3
12 615 G4
13 501 G1
14 278 G2
15 357 G3
16 615 G4
17 501 G1
18 278 G2
19 357 G3
20 615 G4

Finding unique rows of df2 −

Example

df2<-unique(df2)
df2

Output

   y1 y2
1 501 G1
2 278 G2
3 357 G3
4 615 G4

Ordering df2 based on y1 −

Example

df2[order(df2$y1),]

Output

   y1 y2
2 278 G2
3 357 G3
1 501 G1
4 615 G4
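
The two steps used in both examples can also be wrapped in a small function. This is only a sketch under stated assumptions: dedupe_and_sort is a hypothetical helper name, not a base R function, and it sorts in ascending order by a single column given by name.

```r
# Hypothetical helper (not part of base R):
# remove duplicate rows, then sort the result by one column
dedupe_and_sort <- function(df, column) {
  unique_rows <- unique(df)
  # drop = FALSE keeps the result a data frame even with a single column
  unique_rows[order(unique_rows[[column]]), , drop = FALSE]
}

# Same data as the second example
y1 <- rep(c(501, 278, 357, 615), 5)
y2 <- rep(c("G1", "G2", "G3", "G4"), 5)
df_groups <- data.frame(y1, y2)

result <- dedupe_and_sort(df_groups, "y1")
result
```

Passing the column as a string and indexing with [[ ]] avoids hard-coding the column name, so the same helper works for df1 with "x1".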

Updated on: 07-Dec-2020
