How to replace $ sign combined with some specific values in an R data frame?

Sometimes we get very dirty data, which is one reason data analysis is a difficult task. Most data scientists want clean data, but it is rarely available because data warehouses often focus on data availability rather than data quality. One head-scratching situation is an unnecessary character placed at random positions in the values; the $ sign is one such character. We can remove it from every column of an R data frame by applying the gsub function to each column with lapply.
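The pattern needs some care because $ is a regular-expression metacharacter that anchors a match to the end of a string. A minimal sketch of the difference, using a small hypothetical vector v (not part of this article's data):

```r
# Hypothetical sample values, for illustration only
v <- c("A", "$B", "1$")

# Unescaped: "$" matches the empty end of each string, so nothing is removed
unescaped <- gsub("$", "", v)

# Escaped: "\\$" matches a literal dollar sign
escaped <- gsub("\\$", "", v)

# Alternative: treat the pattern as a plain string instead of a regex
literal <- gsub("$", "", v, fixed = TRUE)

print(unescaped)
print(escaped)
print(literal)
```

Either the escaped pattern or fixed = TRUE should work here; the escaped form is the one used in the examples that follow.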

Example

Consider the following data frame:

> x<-sample(c("A","$B","C"),20,replace=TRUE)
> y<-sample(c("I","II","$II"),20,replace=TRUE)
> df1<-data.frame(x,y)
> df1

Output

    x   y
1   C $II
2   C  II
3   A   I
4  $B $II
5  $B $II
6   A   I
7   A $II
8   C   I
9  $B  II
10 $B  II
11  C $II
12  A  II
13 $B  II
14  C   I
15  C $II
16  C   I
17  C  II
18 $B   I
19 $B  II
20  C $II

Removing the $ sign from every value in df1 (lapply returns a list, so df1 is printed as a list of character vectors):

Example

> df1<-lapply(df1,gsub,pattern='\\$',replacement='')
> df1

Output

$x
[1] "C" "C" "A" "B" "B" "A" "A" "C" "B" "B" "C" "A" "B" "C" "C" "C" "C" "B" "B"
[20] "C"

$y
[1] "II" "II" "I" "II" "II" "I" "II" "I" "II" "II" "II" "II" "II" "I" "II"
[16] "I" "II" "I" "II" "II"
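Because the call above replaces df1 with a plain list, a common way to keep the data frame structure is to assign the result back with empty brackets. The sketch below uses a small fixed data frame named df_demo, a hypothetical stand-in for df1, so that the result is reproducible:

```r
# Hypothetical stand-in for df1 with fixed values
df_demo <- data.frame(x = c("A", "$B", "C"),
                      y = c("I", "$II", "II"))

# Assigning into df_demo[] replaces the columns in place,
# so the object stays a data frame instead of becoming a list
df_demo[] <- lapply(df_demo, gsub, pattern = "\\$", replacement = "")

print(df_demo)
print(class(df_demo))
```

The cleaned columns are character vectors, since gsub always returns character values.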

Let’s have a look at another example:

Example

> Price<-sample(c("1$","2$","3$","4$"),20,replace=TRUE)
> Group<-sample(c("$First","$Second","Third"),20,replace=TRUE)
> df2<-data.frame(Price,Group)
> df2

Output

   Price   Group
1     3$ $Second
2     2$   Third
3     1$   Third
4     2$ $Second
5     2$  $First
6     4$  $First
7     2$  $First
8     3$  $First
9     2$   Third
10    4$   Third
11    3$  $First
12    3$   Third
13    3$ $Second
14    2$  $First
15    4$   Third
16    3$  $First
17    4$   Third
18    2$  $First
19    2$ $Second
20    3$   Third

Removing the $ sign from every value in df2:

Example

> df2<-lapply(df2,gsub,pattern='\\$',replacement='')
> df2

Output

$Price
[1] "3" "2" "1" "2" "2" "4" "2" "3" "2" "4" "3" "3" "3" "2" "4" "3" "4" "2" "2"
[20] "3"

$Group
[1] "Second" "Third" "Third" "Second" "First" "First" "First" "First"
[9] "Third" "Third" "First" "Third" "Second" "First" "Third" "First"
[17] "Third" "First" "Second" "Third"
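
After cleaning, the Price values are still character strings such as "3". If numeric prices are needed, the cleaned strings can be converted with as.numeric. A minimal sketch, assuming a small hypothetical data frame df_prices in place of df2:

```r
# Hypothetical stand-in for df2 with fixed values
df_prices <- data.frame(Price = c("3$", "2$", "4$"),
                        Group = c("$Second", "Third", "$First"))

# Remove the literal "$" from every column, keeping the data frame structure
df_prices[] <- lapply(df_prices, gsub, pattern = "\\$", replacement = "")

# Convert the cleaned Price column from character to numeric
df_prices$Price <- as.numeric(df_prices$Price)

str(df_prices)
```

With numeric prices, functions such as mean or sum can then be applied to the Price column directly.
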
raja
Published on 19-Nov-2020 08:13:36