How to replace space in a string value for some elements in a column of an R data frame?


String data is often messy and has to be cleaned before analysis. A common case is a string column in which a few values contain an unwanted space, so they no longer match the otherwise identical values in the rest of the column. To remove these spaces, we can apply the gsub function to the relevant columns with lapply.
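As a minimal sketch of what gsub does on its own, before it is applied to a data frame, the snippet below removes spaces from a small character vector. The vector name labels is hypothetical and used only for illustration.

```r
# Hypothetical vector, used only for illustration
labels <- c("A 1", "A2", "A 3")

# fixed = TRUE matches " " as a literal string instead of a regular expression
cleaned <- gsub(pattern = " ", replacement = "", x = labels, fixed = TRUE)
print(cleaned)
# [1] "A1" "A2" "A3"
```

Values without a space, such as "A2", are returned unchanged.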

Example

Consider the below data frame −

x1<-rep(c("A 1","A2","A 3","A4","A5"),times=4)
x2<-rep(c("#1","# 2","#3","#4"),each=5)
x3<-rep(c(5,7,8,12,15,18,22,24,31,39),times=2)
df<-data.frame(x1,x2,x3)
df
    x1  x2 x3
1  A 1  #1  5
2   A2  #1  7
3  A 3  #1  8
4   A4  #1 12
5   A5  #1 15
6  A 1 # 2 18
7   A2 # 2 22
8  A 3 # 2 24
9   A4 # 2 31
10  A5 # 2 39
11 A 1  #3  5
12  A2  #3  7
13 A 3  #3  8
14  A4  #3 12
15  A5  #3 15
16 A 1  #4 18
17  A2  #4 22
18 A 3  #4 24
19  A4  #4 31
20  A5  #4 39

Replacing spaces only in column 2 (df[-c(1,3)] drops columns 1 and 3, leaving only x2) −

df[-c(1,3)] <- lapply(df[-c(1,3)], gsub, pattern = " ", replacement = "", fixed = TRUE)
df
    x1 x2 x3
1  A 1 #1  5
2   A2 #1  7
3  A 3 #1  8
4   A4 #1 12
5   A5 #1 15
6  A 1 #2 18
7   A2 #2 22
8  A 3 #2 24
9   A4 #2 31
10  A5 #2 39
11 A 1 #3  5
12  A2 #3  7
13 A 3 #3  8
14  A4 #3 12
15  A5 #3 15
16 A 1 #4 18
17  A2 #4 22
18 A 3 #4 24
19  A4 #4 31
20  A5 #4 39
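When only one column needs cleaning, lapply is not strictly necessary. The following sketch assigns the result of gsub straight back to that column. The data frame df2 is a hypothetical two-row stand-in for df, and stringsAsFactors = FALSE is set explicitly so the columns are character regardless of R version.

```r
# Hypothetical two-row data frame, used only for illustration
df2 <- data.frame(x1 = c("A 1", "A2"),
                  x2 = c("# 2", "#3"),
                  x3 = c(5, 7),
                  stringsAsFactors = FALSE)

# Clean a single column by name; x1 and x3 are left untouched
df2$x2 <- gsub(" ", "", df2$x2, fixed = TRUE)
print(df2)
```

Selecting the column by name can be easier to read than a negative index such as -c(1,3), and it keeps working if columns are reordered.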

Replacing spaces in all columns (gsub returns character vectors, so the numeric column x3 is converted to character as well) −

df[] <- lapply(df, gsub, pattern = " ", replacement = "", fixed = TRUE)
df
   x1 x2 x3
1  A1 #1  5
2  A2 #1  7
3  A3 #1  8
4  A4 #1 12
5  A5 #1 15
6  A1 #2 18
7  A2 #2 22
8  A3 #2 24
9  A4 #2 31
10 A5 #2 39
11 A1 #3  5
12 A2 #3  7
13 A3 #3  8
14 A4 #3 12
15 A5 #3 15
16 A1 #4 18
17 A2 #4 22
18 A3 #4 24
19 A4 #4 31
20 A5 #4 39
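Because gsub always returns character vectors, the all-columns call above turns numeric columns into character columns. One possible way to avoid that, sketched here under the assumption that the string columns are stored as character rather than factor, is to apply gsub only to the character columns. The names df3 and is_chr are hypothetical.

```r
# Hypothetical two-row data frame, used only for illustration
df3 <- data.frame(x1 = c("A 1", "A2"),
                  x2 = c("# 2", "#3"),
                  x3 = c(5, 7),
                  stringsAsFactors = FALSE)

# Logical vector marking which columns are character
is_chr <- vapply(df3, is.character, logical(1))

# Remove spaces in the character columns only; x3 stays numeric
df3[is_chr] <- lapply(df3[is_chr], gsub,
                      pattern = " ", replacement = "", fixed = TRUE)
str(df3)
```

If the string columns are factors (the default for data.frame in R versions before 4.0.0), is.character would not select them, so the test would need to be adapted.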

Updated on: 29-Aug-2020
