How to remove everything up to and including the underscore from column values of an R data frame?
If a column in an R data frame contains string values that share a common prefix followed by an underscore (for example ID_1, ID_2 and so on), the prefix and the underscore only lengthen the values without adding information. Removing them from all the values at once makes the data easier to read and to analyze. For this purpose, we can use the gsub function with the pattern "^.*_", which matches everything from the start of the string up to and including the last underscore.
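As a quick illustration of how gsub behaves on a plain character vector before it is applied to a data frame, here is a minimal sketch. The vector x is a made-up example and is not part of the data used below.

```r
# Hypothetical values, used only for illustration
x <- c("ID_1", "ID_12", "GRP_3")

# "^.*_" matches from the start of the string through the last underscore,
# because ".*" is greedy; replacing the match with "" keeps only what follows.
result <- gsub("^.*_", "", x)
result
```

The underscore is an ordinary character in a regular expression, so it does not need a backslash in front of it.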
Consider the below data frame −
Example
set.seed(191)
ID <- c("ID_1", "ID_2", "ID_3", "ID_4", "ID_5",
        "ID_6", "ID_7", "ID_8", "ID_9", "ID_10",
        "ID_11", "ID_12", "ID_13", "ID_14", "ID_15",
        "ID_16", "ID_17", "ID_18", "ID_19", "ID_20")
Salary <- sample(20000:50000, 20)
df1 <- data.frame(ID, Salary)
df1
Output
      ID Salary
1   ID_1  33170
2   ID_2  22747
3   ID_3  42886
4   ID_4  22031
5   ID_5  45668
6   ID_6  32584
7   ID_7  34779
8   ID_8  20471
9   ID_9  38689
10 ID_10  29660
11 ID_11  49664
12 ID_12  24284
13 ID_13  36537
14 ID_14  37693
15 ID_15  30265
16 ID_16  36004
17 ID_17  48247
18 ID_18  20750
19 ID_19  27400
20 ID_20  20553
Removing everything up to and including the underscore from the values in column ID −
Example
# The underscore needs no escaping; "\_" is an unrecognized escape in an R string
df1$ID <- gsub("^.*_", "", df1$ID)
df1
Output
   ID Salary
1   1  33170
2   2  22747
3   3  42886
4   4  22031
5   5  45668
6   6  32584
7   7  34779
8   8  20471
9   9  38689
10 10  29660
11 11  49664
12 12  24284
13 13  36537
14 14  37693
15 15  30265
16 16  36004
17 17  48247
18 18  20750
19 19  27400
20 20  20553
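Note that gsub returns character strings, so the ID column still holds text after the prefix is removed. If the remaining part is purely numeric, it can be converted when numeric sorting or arithmetic is needed. The following is a minimal sketch on a made-up vector ids, not the author's data.

```r
# Hypothetical sample values, used only for illustration
ids <- c("ID_1", "ID_2", "ID_10")

# Strip the prefix; the result is still of class "character"
stripped <- gsub("^.*_", "", ids)
class(stripped)

# Convert to integer so that 10 sorts after 2 rather than before it
ids_num <- as.integer(stripped)
sort(ids_num)
```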
Let’s have a look at another example −
Example
Group <- c("GRP_1", "GRP_2", "GRP_3", "GRP_4", "GRP_5",
           "GRP_6", "GRP_7", "GRP_8", "GRP_9", "GRP_10",
           "GRP_11", "GRP_12", "GRP_13", "GRP_14", "GRP_15",
           "GRP_16", "GRP_17", "GRP_18", "GRP_19", "GRP_20")
Ratings <- sample(0:10, 20, replace = TRUE)
df2 <- data.frame(Group, Ratings)
df2
Output
    Group Ratings
1   GRP_1       6
2   GRP_2       9
3   GRP_3       7
4   GRP_4      10
5   GRP_5      10
6   GRP_6       9
7   GRP_7       9
8   GRP_8       3
9   GRP_9       2
10 GRP_10       0
11 GRP_11       3
12 GRP_12       7
13 GRP_13       6
14 GRP_14      10
15 GRP_15       1
16 GRP_16       3
17 GRP_17      10
18 GRP_18       2
19 GRP_19       9
20 GRP_20       0
Removing everything up to and including the underscore from the values in column Group −
Example
# Same pattern as before, with the underscore left unescaped
df2$Group <- gsub("^.*_", "", df2$Group)
df2
Output
   Group Ratings
1      1       6
2      2       9
3      3       7
4      4      10
5      5      10
6      6       9
7      7       9
8      8       3
9      9       2
10    10       0
11    11       3
12    12       7
13    13       6
14    14      10
15    15       1
16    16       3
17    17      10
18    18       2
19    19       9
20    20       0
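The values in both examples contain a single underscore. When a value has more than one, the greedy pattern removes everything through the last underscore. If only the part up to the first underscore should go, one possible alternative is sub with a negated character class. This is a sketch on a made-up vector y, not something taken from the examples above.

```r
# Hypothetical values with two underscores each
y <- c("GRP_A_1", "GRP_B_12")

# Greedy ".*" runs to the last underscore, so only the final piece remains
after_last <- gsub("^.*_", "", y)

# "[^_]*" cannot cross an underscore, so the match stops at the first one
after_first <- sub("^[^_]*_", "", y)

after_last
after_first
```

Here after_last keeps only the trailing numbers, while after_first keeps "A_1" and "B_12".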