How to separate strings in R that are joined with special characters?

Text data is often messy, and one of the most basic problems is that several values arrive joined into a single string by a special character such as a hyphen, colon, or slash. The strsplit function handles this: it takes the string and a pattern that describes the separator, and returns the separated pieces as a list. In the examples below, each separator is wrapped in square brackets (for example "[-]") so that it is matched as a literal character.

Example

x1<-"A-B-C-D-E-F-G-H-I-J-K-L-M-N-O-P-Q-R-S-T-U-V-W-X-Y-Z"
x1

Output

[1] "A-B-C-D-E-F-G-H-I-J-K-L-M-N-O-P-Q-R-S-T-U-V-W-X-Y-Z"

Example

strsplit(x1,"[-]")

Output

[[1]]
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
[20] "T" "U" "V" "W" "X" "Y" "Z"
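
Because strsplit returns a list even for a single input string, the pieces usually have to be pulled out of that list before further use. A minimal sketch, assuming the same hyphen-separated alphabet as x1 (rebuilt here with paste so the block runs on its own; the name parts is only illustrative):

```r
# Rebuild the example string "A-B-...-Z" from the built-in LETTERS constant
x1 <- paste(LETTERS, collapse = "-")

# strsplit() returns a list with one element per input string;
# [[1]] extracts the character vector for the first (and only) string
parts <- strsplit(x1, "[-]")[[1]]

length(parts)  # number of pieces
parts[1:3]     # first three pieces
```

Calling unlist on the result would give the same vector here, since there is only one input string.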

Example

x2<-"AK:AL:AR:AS:AZ:CA:CO:CT:DC:DE:FL:GA:GU:HI:IA:ID:IL:IN:KS:KY:LA:MA:MD:ME:MI:MN:MO:MP:MS:MT:NC:ND:NE:NH:NJ:NM:NV:NY:OH:OK:OR:PA:PR:RI:SC:SD:TN:TX:UM:UT:VA:VI:VT:WA:WI:WV:WY"
x2

Output

[1] "AK:AL:AR:AS:AZ:CA:CO:CT:DC:DE:FL:GA:GU:HI:IA:ID:IL:IN:KS:KY:LA:MA:MD:ME:MI:MN:MO:MP:MS:MT:NC:ND:NE:NH:NJ:NM:NV:NY:OH:OK:OR:PA:PR:RI:SC:SD:TN:TX:UM:UT:VA:VI:VT:WA:WI:WV:WY"

Example

strsplit(x2,"[:]")

Output

[[1]]
 [1] "AK" "AL" "AR" "AS" "AZ" "CA" "CO" "CT" "DC" "DE" "FL" "GA" "GU" "HI" "IA"
[16] "ID" "IL" "IN" "KS" "KY" "LA" "MA" "MD" "ME" "MI" "MN" "MO" "MP" "MS" "MT"
[31] "NC" "ND" "NE" "NH" "NJ" "NM" "NV" "NY" "OH" "OK" "OR" "PA" "PR" "RI" "SC"
[46] "SD" "TN" "TX" "UM" "UT" "VA" "VI" "VT" "WA" "WI" "WV" "WY"

Example

x3<-"AK/AL/AR/AS/AZ/CA/CO/CT/DC/DE/FL/GA/GU/HI/IA/ID/IL/IN/KS/KY/LA/MA/MD/ME/MI/MN/MO/MP/MS/MT/NC/ND/NE/NH/NJ/NM/NV/NY/OH/OK/OR/PA/PR/RI/SC/SD/TN/TX/UM/UT/VA/VI/VT/WA/WI/WV/WY"
x3

Output

[1] "AK/AL/AR/AS/AZ/CA/CO/CT/DC/DE/FL/GA/GU/HI/IA/ID/IL/IN/KS/KY/LA/MA/MD/ME/MI/MN/MO/MP/MS/MT/NC/ND/NE/NH/NJ/NM/NV/NY/OH/OK/OR/PA/PR/RI/SC/SD/TN/TX/UM/UT/VA/VI/VT/WA/WI/WV/WY"

Example

strsplit(x3,"[/]")

Output

[[1]]
 [1] "AK" "AL" "AR" "AS" "AZ" "CA" "CO" "CT" "DC" "DE" "FL" "GA" "GU" "HI" "IA"
[16] "ID" "IL" "IN" "KS" "KY" "LA" "MA" "MD" "ME" "MI" "MN" "MO" "MP" "MS" "MT"
[31] "NC" "ND" "NE" "NH" "NJ" "NM" "NV" "NY" "OH" "OK" "OR" "PA" "PR" "RI" "SC"
[46] "SD" "TN" "TX" "UM" "UT" "VA" "VI" "VT" "WA" "WI" "WV" "WY"

Example

x4<-"AK~AL~AR~AS~AZ~CA~CO~CT~DC~DE~FL~GA~GU~HI~IA~ID~IL~IN~KS~KY~LA~MA~MD~ME~MI~MN~MO~MP~MS~MT~NC~ND~NE~NH~NJ~NM~NV~NY~OH~OK~OR~PA~PR~RI~SC~SD~TN~TX~UM~UT~VA~VI~VT~WA~WI~WV~WY"
x4

Output

[1] "AK~AL~AR~AS~AZ~CA~CO~CT~DC~DE~FL~GA~GU~HI~IA~ID~IL~IN~KS~KY~LA~MA~MD~ME~MI~MN~MO~MP~MS~MT~NC~ND~NE~NH~NJ~NM~NV~NY~OH~OK~OR~PA~PR~RI~SC~SD~TN~TX~UM~UT~VA~VI~VT~WA~WI~WV~WY"

Example

strsplit(x4,"[~]")

Output

[[1]]
 [1] "AK" "AL" "AR" "AS" "AZ" "CA" "CO" "CT" "DC" "DE" "FL" "GA" "GU" "HI" "IA"
[16] "ID" "IL" "IN" "KS" "KY" "LA" "MA" "MD" "ME" "MI" "MN" "MO" "MP" "MS" "MT"
[31] "NC" "ND" "NE" "NH" "NJ" "NM" "NV" "NY" "OH" "OK" "OR" "PA" "PR" "RI" "SC"
[46] "SD" "TN" "TX" "UM" "UT" "VA" "VI" "VT" "WA" "WI" "WV" "WY"
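
The bracket notation used in these examples is a regular-expression character class, so it can list more than one separator at once. A small sketch, assuming a hypothetical string x_mixed that mixes several of the separators shown in this article (the hyphen is placed first inside the brackets so that it is read literally rather than as a range):

```r
# Hypothetical input that mixes five different separators
x_mixed <- "AK-AL:AR/AS~AZ*CA"

# "[-:/~*]" matches any single one of the listed characters
mixed_parts <- strsplit(x_mixed, "[-:/~*]")[[1]]

mixed_parts
```

This is only useful when none of the listed characters can also occur inside a value.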

Example

x5<-"AK*AL*AR*AS*AZ*CA*CO*CT*DC*DE*FL*GA*GU*HI*IA*ID*IL*IN*KS*KY*LA*MA*MD*ME*MI*MN*MO*MP*MS*MT*NC*ND*NE*NH*NJ*NM*NV*NY*OH*OK*OR*PA*PR*RI*SC*SD*TN*TX*UM*UT*VA*VI*VT*WA*WI*WV*WY"
x5

Output

[1] "AK*AL*AR*AS*AZ*CA*CO*CT*DC*DE*FL*GA*GU*HI*IA*ID*IL*IN*KS*KY*LA*MA*MD*ME*MI*MN*MO*MP*MS*MT*NC*ND*NE*NH*NJ*NM*NV*NY*OH*OK*OR*PA*PR*RI*SC*SD*TN*TX*UM*UT*VA*VI*VT*WA*WI*WV*WY"

Example

strsplit(x5,"[*]")

Output

[[1]]
 [1] "AK" "AL" "AR" "AS" "AZ" "CA" "CO" "CT" "DC" "DE" "FL" "GA" "GU" "HI" "IA"
[16] "ID" "IL" "IN" "KS" "KY" "LA" "MA" "MD" "ME" "MI" "MN" "MO" "MP" "MS" "MT"
[31] "NC" "ND" "NE" "NH" "NJ" "NM" "NV" "NY" "OH" "OK" "OR" "PA" "PR" "RI" "SC"
[46] "SD" "TN" "TX" "UM" "UT" "VA" "VI" "VT" "WA" "WI" "WV" "WY"
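
The asterisk has a special meaning in regular expressions, which is presumably why the example wraps it in brackets. Two other spellings should give the same result: escaping the character, or passing fixed = TRUE so that the separator is treated as plain text. A short sketch, assuming a shortened stand-in for x5 (the variable names are illustrative):

```r
# Shortened stand-in for the x5 string above
x5_short <- "AK*AL*AR*AS"

# 1. Character class, as used in the article
by_class <- strsplit(x5_short, "[*]")[[1]]

# 2. Escaped metacharacter (the backslash itself is doubled inside an R string)
by_escape <- strsplit(x5_short, "\\*")[[1]]

# 3. fixed = TRUE: the separator is matched literally, not as a regex
by_fixed <- strsplit(x5_short, "*", fixed = TRUE)[[1]]

identical(by_class, by_escape) && identical(by_class, by_fixed)
```

When the separator is a single known character, fixed = TRUE avoids having to remember which characters need escaping.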

Example

x6<-c("AK*AL*AR*AS*AZ*CA","CO*CT*DC*DE*FL*GA","GU*HI*IA*ID*IL*IN*KS","KY*LA*MA*MD*ME*MI","MN*MO*MP*MS*MT*NC","ND*NE*NH*NJ*NM*NV","NY*OH*OK*OR*PA*PR","RI*SC*SD*TN*TX*UM","UT*VA*VI*VT","WA*WI*WV*WY")
x6

Output

[1] "AK*AL*AR*AS*AZ*CA" "CO*CT*DC*DE*FL*GA" "GU*HI*IA*ID*IL*IN*KS"
[4] "KY*LA*MA*MD*ME*MI" "MN*MO*MP*MS*MT*NC" "ND*NE*NH*NJ*NM*NV"
[7] "NY*OH*OK*OR*PA*PR" "RI*SC*SD*TN*TX*UM" "UT*VA*VI*VT"
[10] "WA*WI*WV*WY"

Example

strsplit(x6,"[*]")

Output

[[1]]
[1] "AK" "AL" "AR" "AS" "AZ" "CA"

[[2]]
[1] "CO" "CT" "DC" "DE" "FL" "GA"

[[3]]
[1] "GU" "HI" "IA" "ID" "IL" "IN" "KS"

[[4]]
[1] "KY" "LA" "MA" "MD" "ME" "MI"

[[5]]
[1] "MN" "MO" "MP" "MS" "MT" "NC"

[[6]]
[1] "ND" "NE" "NH" "NJ" "NM" "NV"

[[7]]
[1] "NY" "OH" "OK" "OR" "PA" "PR"

[[8]]
[1] "RI" "SC" "SD" "TN" "TX" "UM"

[[9]]
[1] "UT" "VA" "VI" "VT"

[[10]]
[1] "WA" "WI" "WV" "WY"

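When the input is a vector of strings, the result is a list with one character vector per input string, and those vectors may differ in length. A minimal sketch of a few common follow-up steps, assuming a shortened stand-in for x6 (all variable names here are illustrative):

```r
# Shortened stand-in for the x6 vector above
x6_short <- c("AK*AL*AR", "CO*CT", "GU*HI*IA*ID")

# One list element per input string
pieces <- strsplit(x6_short, "[*]")

# Number of values found in each string
counts <- lengths(pieces)

# Flatten everything into a single character vector
all_codes <- unlist(pieces)

# First value of each string, returned as a character vector
first_codes <- vapply(pieces, function(p) p[1], character(1))
```
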
raja
Published on 16-Oct-2020 14:48:02