How to find the count of a particular character in a string vector in R?


To count how many times a particular character occurs in each element of a string vector, we can combine the nchar and gsub functions. gsub removes every character other than the one of interest, and nchar then returns the length of what remains. For example, if a vector x contains the strings India, Russia and Indonesia, the command nchar(gsub("[^i]","",x)) counts the occurrences of the lowercase letter i in each string. The output is 1 1 1, because the match is case-sensitive and the uppercase I at the start of India and Indonesia is not counted.
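The two steps can be seen separately in a small sketch. The vector x is the one described above; the names only_i and counts are introduced here purely for illustration.

```r
# The vector described in the text above
x <- c("India", "Russia", "Indonesia")

# Step 1: delete every character that is not a lowercase "i"
only_i <- gsub("[^i]", "", x)
only_i
# [1] "i" "i" "i"

# Step 2: the length of what is left is the count for each element
counts <- nchar(only_i)
counts
# [1] 1 1 1
```

The uppercase I in India and Indonesia is removed in step 1 along with every other non-matching character, which is why each count is 1.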

Example 1

x1<-sample(c("India","Russia","Croatia","Indonesia","China"),100,replace=TRUE)
x1

Output

[1] "Russia" "India" "Indonesia" "Russia" "Russia" "Indonesia"
[7] "Indonesia" "Russia" "Russia" "Russia" "Indonesia" "China"
[13] "India" "India" "Indonesia" "Indonesia" "India" "Croatia"
[19] "India" "Indonesia" "China" "India" "China" "Russia"
[25] "China" "China" "China" "Indonesia" "Russia" "India"
[31] "India" "Russia" "India" "Croatia" "India" "China"
[37] "China" "India" "Indonesia" "Russia" "Croatia" "China"
[43] "Russia" "Croatia" "Russia" "Indonesia" "Russia" "Indonesia"
[49] "Russia" "Russia" "Russia" "China" "Indonesia" "Indonesia"
[55] "India" "Russia" "Croatia" "India" "Indonesia" "China"
[61] "Indonesia" "Indonesia" "Croatia" "Russia" "Russia" "Russia"
[67] "Croatia" "Indonesia" "China" "Indonesia" "India" "Indonesia"
[73] "China" "India" "Croatia" "Indonesia" "Russia" "China"
[79] "India" "Russia" "Indonesia" "India" "India" "Croatia"
[85] "Russia" "Croatia" "Croatia" "Croatia" "Russia" "Russia"
[91] "Indonesia" "Indonesia" "Croatia" "India" "Indonesia" "Indonesia"
[97] "China" "China" "China" "Indonesia"

nchar(gsub("[^R]","",x1))

[1] 1 0 0 1 1 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0
[38] 0 0 1 0 0 1 0 1 0 1 0 1 1 1 0 0 0 0 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0
[75] 0 0 1 0 0 1 0 0 0 0 1 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0

nchar(gsub("[^s]","",x1))

[1] 2 0 1 2 2 1 1 2 2 2 1 0 0 0 1 1 0 0 0 1 0 0 0 2 0 0 0 1 2 0 0 2 0 0 0 0 0
[38] 0 1 2 0 0 2 0 2 1 2 1 2 2 2 0 1 1 0 2 0 0 1 0 1 1 0 2 2 2 0 1 0 1 0 1 0 0
[75] 0 1 2 0 0 2 1 0 0 0 2 0 0 0 2 2 1 1 0 0 1 1 0 0 0 1

Example 2

x2<-sample(c("Asia","Oceania","Africa","Europe","America"),100,replace=TRUE)
x2

Output

[1] "Africa" "America" "America" "America" "Europe" "Europe" "Europe"
[8] "Asia" "Asia" "Europe" "Oceania" "Oceania" "Asia" "Europe"
[15] "Africa" "Europe" "Asia" "America" "Oceania" "Oceania" "Europe"
[22] "Asia" "Europe" "Africa" "Asia" "America" "Oceania" "Europe"
[29] "Asia" "Africa" "America" "Asia" "Europe" "Europe" "America"
[36] "Europe" "Oceania" "Oceania" "Asia" "America" "Oceania" "Africa"
[43] "Europe" "America" "Europe" "Asia" "Asia" "Oceania" "Oceania"
[50] "Oceania" "Europe" "Africa" "Asia" "Africa" "Asia" "Asia"
[57] "Oceania" "Africa" "Europe" "Asia" "Oceania" "Asia" "Asia"
[64] "Africa" "Oceania" "Europe" "Asia" "Oceania" "Africa" "Africa"
[71] "Oceania" "Europe" "Europe" "America" "Oceania" "Europe" "Africa"
[78] "Asia" "Europe" "Europe" "Europe" "Europe" "Oceania" "Africa"
[85] "Africa" "Africa" "Europe" "Oceania" "Oceania" "Europe" "Europe"
[92] "America" "Asia" "Asia" "Europe" "Oceania" "Africa" "Africa"
[99] "Oceania" "Africa"

nchar(gsub("[^a]","",x2))

[1] 1 1 1 1 0 0 0 1 1 0 2 2 1 0 1 0 1 1 2 2 0 1 0 1 1 1 2 0 1 1 1 1 0 0 1 0 2
[38] 2 1 1 2 1 0 1 0 1 1 2 2 2 0 1 1 1 1 1 2 1 0 1 2 1 1 1 2 0 1 2 1 1 2 0 0 1
[75] 2 0 1 1 0 0 0 0 2 1 1 1 0 2 2 0 0 1 1 1 0 2 1 1 2 1

nchar(gsub("[^e]","",x2))

[1] 0 1 1 1 1 1 1 0 0 1 1 1 0 1 0 1 0 1 1 1 1 0 1 0 0 1 1 1 0 0 1 0 1 1 1 1 1
[38] 1 0 1 1 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 1 0 1 0 1 0 0 0 1 1 0 1 0 0 1 1 1 1
[75] 1 1 0 0 1 1 1 1 1 0 0 0 1 1 1 1 1 1 0 0 1 1 0 0 1 0

nchar(gsub("[^A]","",x2))

[1] 1 1 1 1 0 0 0 1 1 0 0 0 1 0 1 0 1 1 0 0 0 1 0 1 1 1 0 0 1 1 1 1 0 0 1 0 0
[38] 0 1 1 0 1 0 1 0 1 1 0 0 0 0 1 1 1 1 1 0 1 0 1 0 1 1 1 0 0 1 0 1 1 0 0 0 1
[75] 0 0 1 1 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 1 0 0 1 1 0 1
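Because the match is case-sensitive, the counts for a and A above are reported separately. If both cases should be counted together, one possible approach is to list both letters in the bracket expression, or to convert the strings to lowercase first. The sketch below uses a small fixed vector (named continents here for illustration) instead of a random sample so that the results are reproducible.

```r
# One copy of each value used in x2, so the output is reproducible
continents <- c("Asia", "Oceania", "Africa", "Europe", "America")

# Case-sensitive counts, as in the example above
lower_a <- nchar(gsub("[^a]", "", continents))
lower_a
# [1] 1 2 1 0 1
upper_A <- nchar(gsub("[^A]", "", continents))
upper_A
# [1] 1 0 1 0 1

# Count both cases by keeping "a" and "A"
either <- nchar(gsub("[^aA]", "", continents))
either
# [1] 2 2 2 0 2

# Same result by lowercasing the strings before counting
via_tolower <- nchar(gsub("[^a]", "", tolower(continents)))
via_tolower
# [1] 2 2 2 0 2
```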

Example 3

x3<-sample(c("sunny","cloudy","rain","snow","stormy","fog"),100,replace=TRUE)
x3

Output

[1] "rain" "fog" "sunny" "fog" "fog" "cloudy" "stormy" "sunny"
[9] "snow" "stormy" "sunny" "snow" "snow" "cloudy" "cloudy" "cloudy"
[17] "cloudy" "sunny" "stormy" "rain" "cloudy" "fog" "sunny" "rain"
[25] "sunny" "snow" "rain" "stormy" "sunny" "stormy" "cloudy" "sunny"
[33] "cloudy" "cloudy" "fog" "fog" "sunny" "fog" "stormy" "stormy"
[41] "stormy" "stormy" "fog" "fog" "snow" "stormy" "sunny" "sunny"
[49] "sunny" "fog" "fog" "stormy" "rain" "rain" "cloudy" "cloudy"
[57] "snow" "stormy" "fog" "rain" "fog" "fog" "sunny" "sunny"
[65] "rain" "stormy" "fog" "snow" "sunny" "sunny" "snow" "stormy"
[73] "cloudy" "stormy" "fog" "rain" "rain" "rain" "rain" "fog"
[81] "cloudy" "stormy" "stormy" "cloudy" "sunny" "cloudy" "cloudy" "rain"
[89] "cloudy" "cloudy" "sunny" "rain" "sunny" "stormy" "snow" "fog"
[97] "snow" "fog" "rain" "fog"

nchar(gsub("[^a]","",x3))

[1] 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0
[38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0
[75] 0 1 1 1 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0

nchar(gsub("[^o]","",x3))

[1] 0 1 0 1 1 1 1 0 1 1 0 1 1 1 1 1 1 0 1 0 1 1 0 0 0 1 0 1 0 1 1 0 1 1 1 1 0
[38] 1 1 1 1 1 1 1 1 1 0 0 0 1 1 1 0 0 1 1 1 1 1 0 1 1 0 0 0 1 1 1 0 0 1 1 1 1
[75] 1 0 0 0 0 1 1 1 1 1 0 1 1 0 1 1 0 0 0 1 1 1 1 1 0 1

nchar(gsub("[^n]","",x3))

[1] 1 0 2 0 0 0 0 2 1 0 2 1 1 0 0 0 0 2 0 1 0 0 2 1 2 1 1 0 2 0 0 2 0 0 0 0 2
[38] 0 0 0 0 0 0 0 1 0 2 2 2 0 0 0 1 1 0 0 1 0 0 1 0 0 2 2 1 0 0 1 2 2 1 0 0 0
[75] 0 1 1 1 1 0 0 0 0 0 2 0 0 1 0 0 2 1 2 0 1 0 1 0 1 0
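When the same count is needed for several characters, the pattern can be wrapped in a small helper. The sketch below is an alternative to the nchar and gsub approach, not a replacement for it: it locates literal matches with gregexpr and counts them with lengths (available in R 3.2.0 and later). The function name count_char and the vector weather are hypothetical names chosen for this illustration. Using fixed = TRUE treats the character literally, which avoids surprises with characters such as ^ that have a special meaning inside a bracket expression.

```r
# Hypothetical helper: count literal occurrences of `ch` in each element of `x`
count_char <- function(x, ch) {
  # gregexpr finds every match position; regmatches extracts the matches;
  # lengths returns how many matches each element has (0 when there are none)
  lengths(regmatches(x, gregexpr(ch, x, fixed = TRUE)))
}

# One copy of each value used in x3
weather <- c("sunny", "cloudy", "rain", "snow", "stormy", "fog")

count_char(weather, "n")
# [1] 2 0 1 1 0 0
count_char(weather, "o")
# [1] 0 1 0 1 1 1

# A character that is special in regular expressions is matched literally
count_char(c("a^b^c", "abc"), "^")
# [1] 2 0
```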

Updated on: 06-Mar-2021
