How to remove consecutive repeated values in an R data frame column?
A column often contains runs of the same value repeated in consecutive rows, and we might want to remove those repeats if doing so is unlikely to bias the analysis. For example, if a column records the output of a process and five consecutive measurements return the same value, we might want to keep only one of them. In base R, rle() (run-length encoding) reports the length of each run, and cumsum() of those lengths gives the row index at which each run ends, so indexing the data frame with those positions keeps one row per run.
Example
ID <- 1:20
z <- sample(11:13, 20, replace = TRUE)  # random draw, so values differ between runs
df3 <- data.frame(ID, z)
df3
Output
   ID  z
1   1 12
2   2 13
3   3 13
4   4 13
5   5 11
6   6 12
7   7 12
8   8 13
9   9 12
10 10 13
11 11 13
12 12 12
13 13 12
14 14 13
15 15 13
16 16 13
17 17 12
18 18 12
19 19 12
20 20 13
Removing consecutive repeated values in column z of df3, keeping the last row of each run:
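# rle() finds runs of equal consecutive values; cumsum() of the run
# lengths gives the row index where each run ends
Repeated3 <- cumsum(rle(as.character(df3$z))$lengths)
df3[Repeated3, ]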
Output
   ID  z
1   1 12
4   4 13
5   5 11
7   7 12
8   8 13
9   9 12
11 11 13
13 13 12
16 16 13
19 19 12
20 20 13
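The approach above keeps the last row of each run. If the first row of each run is wanted instead, one possible variation (a sketch, not part of the original example) shifts each run-end position back by the run length. The data frame demo and the names runs, last_idx, and first_idx are hypothetical, and the values are fixed so the result is reproducible.

```r
# Hypothetical fixed data so the result does not depend on sample()
demo <- data.frame(ID = 1:8, z = c(12, 13, 13, 13, 11, 12, 12, 13))

runs <- rle(demo$z)                       # runs of equal consecutive values
last_idx <- cumsum(runs$lengths)          # row where each run ends
first_idx <- last_idx - runs$lengths + 1  # row where each run starts

demo[last_idx, ]   # last row of each run, as in the example above
demo[first_idx, ]  # first row of each run
```

With these values, first_idx is 1, 2, 5, 6, 8 and last_idx is 1, 4, 5, 7, 8. Either way, the z column of the result is 12, 13, 11, 12, 13; only the ID that survives from each run differs.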