String Manipulation in R with stringr
The stringr package is a popular R package that provides functions and tools for manipulating and processing strings in R. This package provides a consistent and convenient interface for working with strings, and it offers a wide range of functions for tasks such as searching, matching, replacing, and splitting strings.
In this article, we discuss string manipulation in R with the "stringr" package. The package provides the following families of functions −
Character manipulation functions, which work with the individual characters of a string.
A family of functions for dealing with whitespace.
A family of functions whose results depend on the locale.
A family of pattern-matching functions.
In this tutorial, we will discuss these families of functions in detail.
Dealing with individual characters of a string
In this section, we look at the functions that work with the individual characters of a string −
str_length() function
The stringr package provides the str_length() function, which returns the number of characters in a string. The following program illustrates how it works −
Example
library(stringr)

myString <- "tutorialspoint"

# str_length() counts the characters in the string
print(paste("The length of the string is equal to", str_length(myString)))
Output
[1] "The length of the string is equal to 14"
As the output shows, str_length() reports that myString contains 14 characters.
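Like most stringr functions, str_length() also accepts a vector and returns one count per element. The following is a minimal sketch with made-up sample values (the variable names are illustrative only); it shows how an empty string and a missing value are handled:

```r
library(stringr)

# Illustrative input: a normal string, an empty string, and a missing value
words <- c("tutorialspoint", "", NA)

# One count per element; a missing value stays missing
charCounts <- str_length(words)
print(charCounts)
```

The empty string has length 0, and NA is returned as NA rather than being counted as the two letters "NA".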
str_sub() function
Extracting a substring from a string −
The str_sub() function extracts the consecutive characters between a start position and an end position. Positions are 1-based, and both the start and the end are included. The following program illustrates how it works −
Example
library(stringr)

myString <- "tutorialspoint"

# Extract the characters at positions 1 through 9
print(paste("The substring is", str_sub(myString, 1, 9)))
Output
[1] "The substring is tutorials"
As you can see in the output, the substring “tutorials” has been extracted.
Extracting a substring from a vector of strings −
We can also pass a vector to the str_sub() function, and the substring is then extracted from every element. The following program illustrates this −
Example
library(stringr)

myVector <- c("tutorialspoint", "Bhuwanesh Nainwal")

# The same positions (1 through 9) are applied to each element
print(paste("The substring is", str_sub(myVector, 1, 9)))
Output
[1] "The substring is tutorials" "The substring is Bhuwanesh"
The output shows “Bhuwanesh” for the second element because these are the first nine characters (positions 1 through 9) of the original string “Bhuwanesh Nainwal”.
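str_sub() also accepts negative positions, which count backwards from the end of the string, and the end position can be omitted to extract up to the last character. A short sketch, reusing the sample string from above (the result variable names are illustrative):

```r
library(stringr)

myString <- "tutorialspoint"

firstNine <- str_sub(myString, 1, 9)    # positions 1 through 9
lastFive  <- str_sub(myString, -5, -1)  # fifth-from-last through last character
fromTen   <- str_sub(myString, 10)      # position 10 through the end

print(c(firstNine, lastFive, fromTen))
```

Here both lastFive and fromTen evaluate to "point", since position 10 is also the fifth character from the end of this 14-character string.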
Modify a string or vector of strings −
The str_sub() function can also be used on the left-hand side of an assignment to modify a string or a vector of strings. For example, consider the following program −
Example
library(stringr)

myString <- "tutorialspoint"
myVector <- c("tutorialspoint", "TutorialSpoint")

# Replace the characters at positions 1 through 9
str_sub(myString, 1, 9) <- "TUTORIALS"
str_sub(myVector, 1, 9) <- "TUTORIALS"

print(myString)
print(myVector)
Output
[1] "TUTORIALSpoint"
[1] "TUTORIALSpoint" "TUTORIALSpoint"
As you can see in the output, the first nine characters of each string (positions 1 through 9) have been replaced by the string “TUTORIALS”.
Dealing with whitespaces of a string
In this section, we discuss how to deal with whitespace using the stringr package. Two important functions for this purpose are described below −
str_pad()
This function pads a string to a given width by adding whitespace at the left end, the right end, or both ends. The end is chosen with the side argument ("left", "right", or "both"), and "left" is the default. Let us discuss them one by one −
Adding whitespaces at the left end of the string(s) −
The number of whitespace characters added is max(N - str_length(myString), 0), where N is the requested width. For example, the string “Bhuwanesh” has 9 characters, so a width of 12 adds three spaces, while a width of 8 adds none and the string is returned unchanged.
Example
library(stringr)

myString <- "tutorialspoint"
myVector <- c("tutorialspoint", "Bhuwanesh")

# side defaults to "left", so spaces are added in front
print(str_pad(myString, 20))
print(str_pad(myVector, 24))

# Widths smaller than the strings leave them unchanged
print(str_pad(myString, 8))
print(str_pad(myVector, 7))
Output
[1] "      tutorialspoint"
[1] "          tutorialspoint" "               Bhuwanesh"
[1] "tutorialspoint"
[1] "tutorialspoint" "Bhuwanesh"
The whitespace has been added at the left end of the strings whenever the requested width exceeds the string length.
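The padding rule described above can be checked directly. The sketch below is illustrative only: the variable names and the target width are assumptions, and it also uses the pad argument of str_pad() to pad with a character other than a space:

```r
library(stringr)

name  <- "Bhuwanesh"  # 9 characters
width <- 12           # illustrative target width

# Number of pad characters implied by the rule max(width - length, 0)
padCount <- max(width - str_length(name), 0)

padded     <- str_pad(name, width)        # pads on the left by default
zeroPadded <- str_pad("7", 3, pad = "0")  # pad with zeros instead of spaces

print(padCount)
print(padded)
print(zeroPadded)
```

With these values, padCount is 3, padded is "Bhuwanesh" preceded by three spaces, and zeroPadded is "007".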
Adding whitespaces at the right end of the string(s) −
The number of whitespace characters added is again max(N - str_length(myString), 0), where N is the requested width. For the 9-character string “Bhuwanesh”, a width of 12 appends three spaces, while a width of 8 adds none and the string is returned unchanged.
Example
library(stringr)

myString <- "tutorialspoint"
myVector <- c("tutorialspoint", "Bhuwanesh")

# side = "right" appends the spaces after the string
print(str_pad(myString, 20, "right"))
print(str_pad(myVector, 24, "right"))

# Widths smaller than the strings leave them unchanged
print(str_pad(myString, 8, "right"))
print(str_pad(myVector, 7, "right"))
Output
[1] "tutorialspoint      "
[1] "tutorialspoint          " "Bhuwanesh               "
[1] "tutorialspoint"
[1] "tutorialspoint" "Bhuwanesh"
As the output shows, whitespace has been added at the right end.
Adding whitespaces at both ends of the string(s) −
The total number of whitespace characters added is max(N - str_length(myString), 0), where N is the requested width, and it is split between the two ends. For the 9-character string “Bhuwanesh”, a width of 12 adds three spaces in total, while a width of 8 adds none and the string is returned unchanged.
Example
Let us consider the following program illustrating the working of this function −
library(stringr)

myString <- "tutorialspoint"
myVector <- c("tutorialspoint", "Bhuwanesh")

# side = "both" splits the spaces between the two ends
print(str_pad(myString, 20, "both"))
print(str_pad(myVector, 24, "both"))

# Widths smaller than the strings leave them unchanged
print(str_pad(myString, 8, "both"))
print(str_pad(myVector, 7, "both"))
Output
[1] "   tutorialspoint   "
[1] "     tutorialspoint     " "       Bhuwanesh        "
[1] "tutorialspoint"
[1] "tutorialspoint" "Bhuwanesh"
As you can see in the output, whitespace has been added at both ends.
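Because spaces are hard to see in printed output, it can help to compare the three sides using a visible pad character. The following sketch is not part of the original examples; the input "R", the width 5, and the pad character "*" are arbitrary choices:

```r
library(stringr)

x <- "R"

# Same width, different side; "*" makes the padding visible
left  <- str_pad(x, 5, side = "left",  pad = "*")
right <- str_pad(x, 5, side = "right", pad = "*")
both  <- str_pad(x, 5, side = "both",  pad = "*")

print(c(left, right, both))
```

This prints "****R", "R****" and "**R**". In every case the result is exactly five characters wide.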
str_trim()
This function is the opposite of str_pad(). It removes whitespace from the start, the end, or both ends of a string. The side argument selects which one: "left", "right", or "both" (the default).
Example
Let us consider the following program illustrating the working of this function −
library(stringr)

# The two spaces on each side are illustrative, so that there is something to trim
myString <- "  tutorialspoint  "
myVector <- c("  tutorialspoint  ", "  Bhuwanesh  ")

print(str_trim(myString, "left"))
print(str_trim(myString, "right"))
print(str_trim(myString, "both"))

print(str_trim(myVector, "left"))
print(str_trim(myVector, "right"))
print(str_trim(myVector, "both"))
Output
[1] "tutorialspoint  "
[1] "  tutorialspoint"
[1] "tutorialspoint"
[1] "tutorialspoint  " "Bhuwanesh  "
[1] "  tutorialspoint" "  Bhuwanesh"
[1] "tutorialspoint" "Bhuwanesh"
As you can see, the strings are displayed after whitespace has been removed according to the “left”, “right”, or “both” argument.
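str_trim() only touches the ends of a string. For whitespace inside a string, stringr also offers str_squish(), which trims both ends and collapses repeated inner whitespace to a single space. A small sketch with a made-up input string:

```r
library(stringr)

messy <- "  tutorials   point  "  # illustrative input with extra spaces

trimmed  <- str_trim(messy)    # ends only: "tutorials   point"
squished <- str_squish(messy)  # ends and inner runs: "tutorials point"

# Padding and then trimming returns the original string
roundTrip <- str_trim(str_pad("Bhuwanesh", 12, "both"))

print(c(trimmed, squished, roundTrip))
```

The round trip at the end illustrates the sense in which str_trim() undoes str_pad() when the pad character is a space.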
Dealing with locale-sensitive functions
The “stringr” package also provides locale-sensitive functions. By locale-sensitive we mean that their results can depend on the language and region rules in effect, which are selected through the locale argument (English, "en", by default). Such functions are discussed below in detail −
str_to_title() and str_to_upper() functions
The str_to_title() function capitalizes the first letter of every word, while the str_to_upper() function converts every character to upper case.
Example
Let us consider the following examples illustrating the working of these functions −
library(stringr)

myString <- "tutorialspoint is the greatest learning source"
myVector <- c("tutorialspoint is the greatest learning source", "Bhuwanesh is an author")

# Title case: first letter of each word is capitalized
print(str_to_title(myString))
print(str_to_title(myVector))

# Upper case: every character is capitalized
print(str_to_upper(myString))
print(str_to_upper(myVector))
Output
[1] "Tutorialspoint Is The Greatest Learning Source"
[1] "Tutorialspoint Is The Greatest Learning Source"
[2] "Bhuwanesh Is An Author"
[1] "TUTORIALSPOINT IS THE GREATEST LEARNING SOURCE"
[1] "TUTORIALSPOINT IS THE GREATEST LEARNING SOURCE"
[2] "BHUWANESH IS AN AUTHOR"
As you can see in the output, the first character of every word is capitalized when the str_to_title() function is applied, and every character is converted to upper case by the str_to_upper() function.
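The effect of the locale is easiest to see with a letter whose upper-case form differs between languages. The sketch below uses the well-known Turkish dotted "i" as an illustration; the variable names are arbitrary:

```r
library(stringr)

english <- str_to_upper("i")                 # default locale "en": plain "I"
turkish <- str_to_upper("i", locale = "tr")  # Turkish rules: dotted capital I (U+0130)

print(c(english, turkish))
```

The same input therefore gives two different results, depending only on the locale argument.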
We can also use str_sort() to sort a character vector and str_order() to obtain the integer positions that would put the vector in sorted order. Both functions are illustrated below −
Example
library(stringr)

myVector <- c("Harshit", "Tutorialspoint", "Bhuwanesh", "Nainwal")

cat("The vector after sorting:\n")
print(str_sort(myVector))

cat("The letters after ordering:\n")
print(str_order(letters))
Output
The vector after sorting:
[1] "Bhuwanesh"      "Harshit"        "Nainwal"        "Tutorialspoint"
The letters after ordering:
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
[26] 26
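The relationship between the two functions, and the optional numeric argument they share, can be sketched as follows. The file names are hypothetical sample data chosen so that alphabetical and numeric ordering differ:

```r
library(stringr)

files <- c("file10", "file2", "file1")  # hypothetical file names

alphabetical <- str_sort(files)                  # "file1" "file10" "file2"
natural      <- str_sort(files, numeric = TRUE)  # "file1" "file2" "file10"

# Indexing with str_order() gives the same result as str_sort()
viaOrder <- files[str_order(files, numeric = TRUE)]

print(alphabetical)
print(natural)
print(viaOrder)
```

With numeric = TRUE, runs of digits are compared as numbers, so "file2" sorts before "file10".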
Pattern Matching
Sometimes, we need to check whether a given string matches a pattern. The stringr package provides the following function for pattern matching −
str_detect()
The str_detect() function returns TRUE for every string that contains a match for the pattern (a regular expression) and FALSE otherwise.
Example
library(stringr)

myString <- "219 733 8965"
myVector <- c(
  "tutorialspoint",
  "219 733 8965",
  "529-295-8753",
  "519 633 1965"
)

# A digit 2-9 plus two digits, a separator, three digits 2-9,
# a separator, then two digits 2-9
pattern <- "([2-9][0-9]{2})[- .]([2-9]{3})[- .]([2-9]{2})"

print(str_detect(myString, pattern))
print(str_detect(myVector, pattern))
Output
[1] TRUE
[1] FALSE TRUE TRUE FALSE
As you can see in the output, the function returns TRUE where the pattern matches the string and FALSE otherwise. The last string, "519 633 1965", does not match because the digit after the second separator is 1, which lies outside the range [2-9].
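stringr has further pattern-matching functions for the extracting, replacing, and splitting tasks mentioned in the introduction. The following sketch applies three of them, str_extract(), str_replace_all(), and str_split(), to one of the sample strings; the variable names are illustrative:

```r
library(stringr)

phone   <- "529-295-8753"
pattern <- "([2-9][0-9]{2})[- .]([2-9]{3})[- .]([2-9]{2})"

matched <- str_extract(phone, pattern)          # the part of the string that matched
digits  <- str_replace_all(phone, "[- .]", "")  # remove every separator
parts   <- str_split(phone, "-")[[1]]           # split on the hyphen

print(matched)
print(digits)
print(parts)
```

Note that str_extract() returns "529-295-87" rather than the whole string, because the pattern only requires two digits after the second separator.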
Conclusion
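In this tutorial, we discussed how to manipulate strings in R using the stringr package. We covered four families of functions: character manipulation (str_length() and str_sub()), whitespace handling (str_pad() and str_trim()), locale-sensitive operations (str_to_title(), str_to_upper(), str_sort() and str_order()), and pattern matching (str_detect()).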