How to count special characters in an R vector?


Special characters are generally treated as string values, and they can be counted with the str_count function of the stringr package. For example, if a vector x contains $, #, %, ^, &, *, @, ! or any other special character, we can use str_count(x,"\\$") to count the number of $ in each element of x. The pattern is a regular expression, so a character with a special regex meaning, such as $ or *, must be escaped, and the backslash itself has to be doubled inside an R string. The same can be done for each special character separately.
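As a minimal sketch of the escaping rule described above, the snippet below counts $ in a small vector in two ways. The vector x here is a hypothetical sample, not one of the examples that follow, and the code assumes the stringr package is installed.

```r
library(stringr)

# Hypothetical sample vector, used only for illustration
x <- c("a$b$c", "no dollars here", "$")

# "$" is a regex anchor, so it is escaped; the backslash is doubled in an R string
str_count(x, "\\$")
# [1] 2 0 1

# fixed() treats the pattern as a literal string, so no escaping is needed
str_count(x, fixed("$"))
# [1] 2 0 1
```

Wrapping the pattern in fixed() is often the simpler choice when the character is meant literally, since it sidesteps the question of which characters are regex metacharacters.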

Example1

library(stringr)
x1<-c("Alabama$Alaska$American Samoa$Arizona$Arkansas$California$Colorado$Connecticut$Delaware$District of Columbia$Florida$Georgia$Guam$Hawaii$Idaho$Illinois$Indiana$Iowa$Kansas$Kentucky$Louisiana$Maine$Maryland$Massachusetts$Michigan$Minnesota$Minor Outlying Islands$Mississippi$Missouri$Montana$Nebraska$Nevada$New Hampshire$New Jersey$New Mexico$New York$North Carolina$North Dakota$Northern Mariana Islands$Ohio$Oklahoma$Oregon$Pennsylvania$Puerto Rico$Rhode Island$South Carolina$South Dakota$Tennessee$Texas$U.S. Virgin Islands$Utah$Vermont$Virginia$Washington$West Virginia$Wisconsin$Wyoming")
x1
[1] "Alabama$Alaska$American Samoa$Arizona$Arkansas$California$Colorado$Connecticut$Delaware$District of Columbia$Florida$Georgia$Guam$Hawaii$Idaho$Illinois$Indiana$Iowa$Kansas$Kentucky$Louisiana$Maine$Maryland$Massachusetts$Michigan$Minnesota$Minor Outlying Islands$Mississippi$Missouri$Montana$Nebraska$Nevada$New Hampshire$New Jersey$New Mexico$New York$North Carolina$North Dakota$Northern Mariana Islands$Ohio$Oklahoma$Oregon$Pennsylvania$Puerto Rico$Rhode Island$South Carolina$South Dakota$Tennessee$Texas$U.S. Virgin Islands$Utah$Vermont$Virginia$Washington$West Virginia$Wisconsin$Wyoming"

Counting the number of $ in x1:

str_count(x1,"\\$")
[1] 56

Example2

x2<-c("Alabama # Alaska # American Samoa # Arizona # Arkansas # California # Colorado # Connecticut # Delaware # District of Columbia # Florida$Georgia$Guam$Hawaii$Idaho$Illinois$Indiana$Iowa$Kansas$Kentucky$Louisiana$Maine$Maryland$Massachusetts$Michigan$Minnesota$Minor Outlying Islands$Mississippi$Missouri$Montana$Nebraska$Nevada$New Hampshire$New Jersey$New Mexico$New York$North Carolina$North Dakota$Northern Mariana Islands$Ohio$Oklahoma$Oregon$Pennsylvania$Puerto Rico$Rhode Island$South Carolina$South Dakota$Tennessee # Texas # U.S. Virgin Islands # Utah# Vermont # Virginia # Washington # West Virginia # Wisconsin # Wyoming")

Counting the number of # in x2:

str_count(x2,"#")
[1] 19

Example3

x3<-c("AK * AL", "AR * AS", "AZ * CA", "CO * * CT", "DC * * * * DE", "FL * * * * * * * * * * * * * * * *GA", "GU * * HI", "IA * * * ID", "IL", "IN", "KS", "KY * * LA", "MA * * MD", "ME * * * * MI", "MN * * * MO", "MP * * * * MS", "MT * * * * * * NC", "ND * * * NE", "NH * * * * * NJ", "NM * NV", "NY * * * * OH", "OK * * * * OR", "PA * * * * * * * PR", "RI * * SC", "SD * TN", "TX * * * * UM", "UT * * * * * * * * * VA", "VI * * * VT", "WA * * * * * * WI", "WV * WY")
x3
[1] "AK * AL"
[2] "AR * AS"
[3] "AZ * CA"
[4] "CO * * CT"
[5] "DC * * * * DE"
[6] "FL * * * * * * * * * * * * * * * *GA"
[7] "GU * * HI"
[8] "IA * * * ID"
[9] "IL"
[10] "IN"
[11] "KS"
[12] "KY * * LA"
[13] "MA * * MD"
[14] "ME * * * * MI"
[15] "MN * * * MO"
[16] "MP * * * * MS"
[17] "MT * * * * * * NC"
[18] "ND * * * NE"
[19] "NH * * * * * NJ"
[20] "NM * NV"
[21] "NY * * * * OH"
[22] "OK * * * * OR"
[23] "PA * * * * * * * PR"
[24] "RI * * SC"
[25] "SD * TN"
[26] "TX * * * * UM"
[27] "UT * * * * * * * * * VA"
[28] "VI * * * VT"
[29] "WA * * * * * * WI"
[30] "WV * WY"
str_count(x3,"\\*")
[1] 1 1 1 2 4 16 2 3 0 0 0 2 2 4 3 4 6 3 5 1 4 4 7 2 1
[26] 4 9 3 6 1
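The output above has one count per element because str_count is vectorized over its input. If a single total for the whole vector is wanted, the per-element counts can be summed. The sketch below uses a short hypothetical vector v in the same style as x3 and assumes stringr is installed.

```r
library(stringr)

# Hypothetical short vector in the same style as x3
v <- c("AK * AL", "CO * * CT", "IL")

# One count per element of the vector
per_element <- str_count(v, "\\*")
per_element
# [1] 1 2 0

# Total number of * across the whole vector
total <- sum(per_element)
total
# [1] 3
```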

Example4

x4<-c("A / / / // / / // / / / / / / / // / / // / /B","C/ / D","E F G / / / / / / / / /H","I // J ///// / / / K / / / / L","M N O P Q R /// S T U V //// / W X // Y /Z")
x4
[1] "A / / / // / / // / / / / / / / // / / // / /B"
[2] "C/ / D"
[3] "E F G / / / / / / / / /H"
[4] "I // J ///// / / / K / / / / L"
[5] "M N O P Q R /// S T U V //// / W X // Y /Z"
str_count(x4,"/")
[1] 24 2 9 14 11
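The examples above count one special character at a time. As a rough sketch of how several characters might be handled together, the code below loops over a set of characters with sapply, and then counts everything that is neither alphanumeric nor whitespace in one pass. The vector y, the chars set, and the choice to define "special" as "not alphanumeric and not whitespace" are assumptions made for illustration; stringr is assumed to be installed.

```r
library(stringr)

# Hypothetical sample vector, used only for illustration
y <- c("a$b#c", "100%!", "plain")

# Count each character of interest separately; fixed() avoids regex escaping.
# The result is a matrix with one row per element of y and one column per character.
chars <- c("$", "#", "%", "!")
counts <- sapply(chars, function(ch) str_count(y, fixed(ch)))
counts

# Count every character that is neither alphanumeric nor whitespace
any_special <- str_count(y, "[^[:alnum:][:space:]]")
any_special
```

The negated class is used here instead of [[:punct:]] because stringr's regex engine classifies some characters, such as $, as symbols rather than punctuation, so [[:punct:]] alone may miss them.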

Updated on: 17-Oct-2020
