String Manipulation in R with stringr


The stringr package is a popular R package that provides functions and tools for manipulating and processing strings in R. This package provides a consistent and convenient interface for working with strings, and it offers a wide range of functions for tasks such as searching, matching, replacing, and splitting strings.

In this article, we will discuss string manipulation in R with the “stringr” package. The package provides the following families of functions −

  • Character manipulation functions: these functions allow us to work with the individual characters of a string.

  • A family of functions to deal with whitespaces.

  • A family of functions whose operations depend on the locale.

  • A family of pattern-matching functions, which check whether a pattern occurs in a string.

In this tutorial, we will discuss these families of functions in detail.

Dealing with individual characters of a string

In this section, we will see the functions that let us work with the individual characters of a string. These functions are the following −

str_length() function

The stringr package provides the str_length() function, which returns the number of characters in a string. Let us consider the following program illustrating the working of this function −

Example

library(stringr)

myString <- "tutorialspoint"
print(paste("The length of the string is equal to", str_length(myString)))

Output

[1] "The length of the string is equal to 14"

As you can see in the output, the length of myString has been displayed.
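As a small additional sketch, str_length() also accepts a vector and returns one length per element, and a missing value stays missing. The names words and wordLengths below are hypothetical and used only for illustration.

```r
library(stringr)

# Hypothetical input vector, used only for illustration
words <- c("R", "stringr", NA)

# One length per element; NA is propagated rather than counted
wordLengths <- str_length(words)
print(wordLengths)   # [1]  1  7 NA
```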

str_sub() function

  • Extracting a substring from a string −

The str_sub() function extracts a run of consecutive characters from a string, given a start and an end position. Positions are counted from 1, and both ends are inclusive. Let us consider the following program illustrating the working of this function −

Example

library(stringr)

myString <- "tutorialspoint"
# Extract characters 1 through 9
print(paste("The substring is", str_sub(myString, 1, 9)))

Output

[1] "The substring is tutorials"

As you can see in the output, the substring “tutorials” has been extracted.

  • Extracting a substring from a vector of strings −

We can also pass a vector to the str_sub() function, and a substring will be extracted from each individual string. Let us consider the following program illustrating the working of this function −

Example

library(stringr)

myVector <- c("tutorialspoint", "Bhuwanesh Nainwal")
# str_sub() is vectorised: characters 1 through 9 are taken from each element
print(paste("The substring is", str_sub(myVector, 1, 9)))

Output

[1] "The substring is tutorials" "The substring is Bhuwanesh"

The output shows “Bhuwanesh” as the second substring because it occupies positions 1 through 9 of the original string: “Bhuwanesh Nainwal”.

  • Modifying a string or a vector of strings −

The str_sub() function can also be used on the left-hand side of an assignment to modify a string or a vector of strings. For example, consider the following program −

Example

library(stringr)

myString <- "tutorialspoint"
myVector <- c("tutorialspoint", "TutorialSpoint")

# Replace characters 1 through 9 with new text
str_sub(myString, 1, 9) <- "TUTORIALS"
str_sub(myVector, 1, 9) <- "TUTORIALS"

print(myString)
print(myVector)

Output

[1] "TUTORIALSpoint"
[1] "TUTORIALSpoint" "TUTORIALSpoint"

As you can see in the output, the characters at positions 1 through 9 of each string have been replaced by the string: “TUTORIALS”.
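The start and end positions can also be negative, in which case they count backwards from the end of the string. The following is a minimal sketch of that behaviour; the variable names lastFive and fromTen are hypothetical.

```r
library(stringr)

myString <- "tutorialspoint"

# Negative positions count backwards from the end of the string
lastFive <- str_sub(myString, -5, -1)
print(lastFive)   # "point"

# Leaving out the end position extracts up to the last character
fromTen <- str_sub(myString, 10)
print(fromTen)    # "point"
```

Negative positions are convenient when the interesting part of a string is a suffix whose distance from the start varies.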

Dealing with whitespaces of a string

In this section, we will discuss how we can deal with whitespace using the stringr library. Two important functions for this purpose are described below −

str_pad()

This function pads a string to a given width by adding whitespace at either or both ends of the string. Its side argument selects where the whitespace goes: at the left end (the default), at the right end, or at both ends. Let us discuss them one by one −

Adding whitespaces at the left end of the string(s) −

The number of whitespaces added is the maximum of N - str_length(myString) and 0, where N is the target width. For example, the string “Bhuwanesh” has 9 characters, so passing N as 12 adds three whitespaces, while passing N as 8 adds none and returns the original string unchanged.

Example

library(stringr)

myString <- "tutorialspoint"
myVector <- c("tutorialspoint", "Bhuwanesh")

# Width larger than the strings: whitespace is added on the left (the default side)
print(str_pad(myString, 20))
print(str_pad(myVector, 24))

# Width smaller than the strings: nothing is added
print(str_pad(myString, 8))
print(str_pad(myVector, 7))

Output

[1] "      tutorialspoint"
[1] "          tutorialspoint" "               Bhuwanesh"
[1] "tutorialspoint"
[1] "tutorialspoint" "Bhuwanesh"

As you can see in the output, whitespaces have been added at the left end of the strings whenever the requested width exceeds their length.

Adding whitespaces at the right end of the string(s) −

The number of whitespaces added is again the maximum of N - str_length(myString) and 0, where N is the target width. For example, the string “Bhuwanesh” has 9 characters, so passing N as 12 adds three whitespaces, while passing N as 8 adds none and returns the original string unchanged.

Example

library(stringr)

myString <- "tutorialspoint"
myVector <- c("tutorialspoint", "Bhuwanesh")

# Width larger than the strings: whitespace is added on the right
print(str_pad(myString, 20, "right"))
print(str_pad(myVector, 24, "right"))

# Width smaller than the strings: nothing is added
print(str_pad(myString, 8, "right"))
print(str_pad(myVector, 7, "right"))

Output

[1] "tutorialspoint      "
[1] "tutorialspoint          " "Bhuwanesh               "
[1] "tutorialspoint"
[1] "tutorialspoint" "Bhuwanesh"

As you can see in the output, whitespaces have been added at the right end of the strings whenever the requested width exceeds their length.

Adding whitespaces at both ends of the string(s) −

The total number of whitespaces added is the maximum of N - str_length(myString) and 0, where N is the target width, and it is split as evenly as possible between the two ends. For example, the string “Bhuwanesh” has 9 characters, so passing N as 12 adds three whitespaces in total, while passing N as 8 adds none and returns the original string unchanged.

Example

Let us consider the following program illustrating the working of this function −

library(stringr)

myString <- "tutorialspoint"
myVector <- c("tutorialspoint", "Bhuwanesh")

# Width larger than the strings: whitespace is shared between both ends
print(str_pad(myString, 20, "both"))
print(str_pad(myVector, 24, "both"))

# Width smaller than the strings: nothing is added
print(str_pad(myString, 8, "both"))
print(str_pad(myVector, 7, "both"))

Output

[1] "   tutorialspoint   "
[1] "     tutorialspoint     " "       Bhuwanesh        "
[1] "tutorialspoint"
[1] "tutorialspoint" "Bhuwanesh"

As you can see in the output, whitespaces have been added at both ends whenever the requested width exceeds the length of the string.
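Padding is not limited to whitespace. As a brief sketch, the pad argument of str_pad() accepts a single character to use instead of a space, which is handy for zero-padded identifiers. The vector ids and its values are hypothetical.

```r
library(stringr)

# Hypothetical identifiers, used only for illustration
ids <- c("7", "42", "1234")

# Pad on the left with zeros up to a width of 4
padded <- str_pad(ids, 4, side = "left", pad = "0")
print(padded)   # "0007" "0042" "1234"
```

Strings that already reach the requested width, such as "1234" here, are returned unchanged.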

str_trim()

This function is the opposite of the str_pad() function. It removes whitespace from either or both ends of a string. Its side argument selects whether to trim from the left end, from the right end, or from both ends (the default).

Example

Let us consider the following program illustrating the working of this function −

library(stringr)

# Each string has two spaces before and two spaces after the text
myString <- "  tutorialspoint  "
myVector <- c("  tutorialspoint  ", "  Bhuwanesh  ")

print(str_trim(myString, "left"))
print(str_trim(myString, "right"))
print(str_trim(myString, "both"))

print(str_trim(myVector, "left"))
print(str_trim(myVector, "right"))
print(str_trim(myVector, "both"))

Output

[1] "tutorialspoint  "
[1] "  tutorialspoint"
[1] "tutorialspoint"
[1] "tutorialspoint  " "Bhuwanesh  "
[1] "  tutorialspoint" "  Bhuwanesh"
[1] "tutorialspoint" "Bhuwanesh"

As you can see, the string(s) are displayed after whitespace has been removed from the left end, the right end, or both ends, according to the argument “left”, “right” or “both”.
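str_trim() only touches the ends of a string. For repeated whitespace inside a string, stringr also offers str_squish(), shown here as a short sketch; the variables messy and tidy are hypothetical.

```r
library(stringr)

# Hypothetical input with extra spaces at the ends and between words
messy <- "  tutorialspoint   is   a   learning   source  "

# str_squish() trims both ends and collapses inner runs of whitespace to one space
tidy <- str_squish(messy)
print(tidy)   # "tutorialspoint is a learning source"
```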

Dealing with locale-sensitive functions

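The “stringr” library also provides functions that are locale-sensitive. By locale-sensitive we mean that their results can differ from one locale (language and region) to another, because languages have different rules for letter case and alphabetical order. These functions take a locale argument, which defaults to English. Such functions are discussed below in detail −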

str_to_title() and str_to_upper() functions

The str_to_title() function capitalizes the first letter of every word, and the str_to_upper() function converts every letter to upper case.

Example

Let us consider the following examples illustrating the working of these functions −

library(stringr)

myString <- "tutorialspoint is the greatest learning source"
myVector <- c("tutorialspoint is the greatest learning source", "Bhuwanesh is an author")

# Title case: first letter of each word
print(str_to_title(myString))
print(str_to_title(myVector))

# Upper case: every letter
print(str_to_upper(myString))
print(str_to_upper(myVector))

Output

[1] "Tutorialspoint Is The Greatest Learning Source"
[1] "Tutorialspoint Is The Greatest Learning Source"
[2] "Bhuwanesh Is An Author"
[1] "TUTORIALSPOINT IS THE GREATEST LEARNING SOURCE"
[1] "TUTORIALSPOINT IS THE GREATEST LEARNING SOURCE"
[2] "BHUWANESH IS AN AUTHOR"

As you can see in the output, the first character of every word is capitalized when the str_to_title() function is applied, and every character is converted to upper case in the case of the str_to_upper() function.
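To see why these functions are called locale-sensitive, consider the letter "i". The sketch below assumes the Turkish locale code "tr" is available on the system; in Turkish, the upper-case form of "i" is a dotted capital letter. The variable names english and turkish are hypothetical.

```r
library(stringr)

# Default locale is English
english <- str_to_upper("i")

# Turkish distinguishes dotted and dotless i
turkish <- str_to_upper("i", locale = "tr")

print(english)   # "I"
print(turkish)   # "İ"
```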

We can also use str_sort() to sort a vector of strings and str_order() to get the order of a character vector, that is, the integer positions that would put the vector in sorted order. The working of these functions has been illustrated below −

Example

library(stringr)

myVector <- c("Harshit", "Tutorialspoint", "Bhuwanesh", "Nainwal")

cat("The vector after sorting:\n")
print(str_sort(myVector))

cat("The letters after ordering:\n")
# letters is R's built-in vector "a" to "z", which is already sorted
print(str_order(letters))

Output

The vector after sorting:
[1] "Bhuwanesh"      "Harshit"        "Nainwal"        "Tutorialspoint"
The letters after ordering:
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
[26] 26
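The relationship between the two functions can be sketched as follows: indexing a vector by the result of str_order() gives the same result as str_sort(). The sketch also shows the decreasing and numeric arguments of str_sort(). The vector files and its values are hypothetical.

```r
library(stringr)

myVector <- c("Harshit", "Tutorialspoint", "Bhuwanesh", "Nainwal")

# str_order() gives the positions that sort the vector
positions <- str_order(myVector)
print(positions)   # 3 1 4 2

# Indexing with those positions reproduces str_sort()
sortedByOrder <- myVector[positions]
print(sortedByOrder)

# Reverse alphabetical order
descending <- str_sort(myVector, decreasing = TRUE)
print(descending)

# numeric = TRUE compares embedded numbers by value, not digit by digit
files <- c("file10", "file2", "file1")
naturalOrder <- str_sort(files, numeric = TRUE)
print(naturalOrder)   # "file1" "file2" "file10"
```

str_order() is the more useful of the two when a second vector or the rows of a data frame must be rearranged in the same order.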

Pattern Matching

Sometimes, we may need to check whether a string matches a given pattern. The stringr library provides functions for pattern matching, such as the following −

str_detect()

The str_detect() function is used to detect whether the pattern matches with the string or not.

Example

library(stringr)

myString <- "219 733 8965"
myVector <- c("tutorialspoint", "219 733 8965", "529-295-8753", "519 633 1965")

# Three digits (the first in 2-9), a separator, three digits in 2-9,
# a separator, then two digits in 2-9
pattern <- "([2-9][0-9]{2})[- .]([2-9]{3})[- .]([2-9]{2})"

print(str_detect(myString, pattern))
print(str_detect(myVector, pattern))

Output

[1] TRUE
[1] FALSE  TRUE  TRUE FALSE

As you can see in the output, the function returns TRUE where the pattern matches the string and FALSE otherwise. Note that a match anywhere inside the string is enough: the pattern does not have to cover the whole string.
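Detection is only one operation in the pattern-matching family. As a brief sketch, the functions str_extract(), str_replace_all() and str_split() use patterns in the same way to extract, replace and split text. The patterns and variable names below are chosen only for illustration.

```r
library(stringr)

myString <- "219 733 8965"

# str_extract() returns the first piece of text that matches the pattern
firstDigits <- str_extract(myString, "[0-9]+")
print(firstDigits)   # "219"

# str_replace_all() replaces every match of the pattern
dashed <- str_replace_all(myString, " ", "-")
print(dashed)        # "219-733-8965"

# str_split() breaks a string at each match and returns a list
parts <- str_split(myString, " ")
print(parts[[1]])    # "219" "733" "8965"
```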

Conclusion

In this tutorial, we discussed how we can manipulate strings in R using the stringr library. We covered four families of functions that stringr provides: character manipulation, whitespace handling, locale-sensitive operations, and pattern matching.

Updated on: 17-Jan-2023
