How can I use Python regex to split a string by multiple delimiters?


Data arrives in many shapes and sizes, and it is often not as clean as we would like. You frequently need to divide a string by more than one delimiter to make it easier to work with. The str.split() method accepts only a single separator, so the simplest way to split on several delimiters at once is the built-in regular expression module, re. Regular expressions describe text patterns; for example, a character class matches any one character from a set, so \d matches any decimal digit and [,-] matches a comma or a hyphen.

The re module provides a split() function that works much like the str.split() method, except that the separator is a regular expression pattern rather than a fixed string. That is what lets it split on several delimiters in one call.

Syntax Used

re.split() − Split a string into a list wherever the pattern matches −

re.split(pattern, string, maxsplit=0, flags=0)

pattern − Required. The regular expression that describes the delimiters.

string − Required. The text to split.

maxsplit − Optional. Specifies the maximum number of splits to do. The default value is 0, which means "all occurrences".

flags − Optional. Regular expression flags such as re.IGNORECASE.

For comparison, the string method str.split(separator, maxsplit) takes a single fixed separator (any whitespace by default), and its maxsplit defaults to -1, which also means "all occurrences".

Note − If capturing parentheses are used in the pattern, then the text of all groups in the pattern is also returned as part of the resulting list.

Return Value − re.split() divides the target text at each match of the pattern and returns a list of the substrings between the matches.
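The maxsplit argument and the note about capturing parentheses are easier to see in action. The following is a minimal sketch; the sample string and variable names are chosen here for illustration only.

```python
import re

text = 'one,two-three,four'

# maxsplit=1 stops after the first split; the rest of the string is left intact.
first_split = re.split(r',|-', text, maxsplit=1)
print(first_split)   # ['one', 'two-three,four']

# Wrapping the pattern in capturing parentheses keeps the matched
# delimiters in the resulting list, between the pieces they separated.
with_delims = re.split(r'(,|-)', text)
print(with_delims)   # ['one', ',', 'two', '-', 'three', ',', 'four']
```

Keeping the delimiters can be handy when the string has to be reassembled later, since ''.join(with_delims) gives back the original text.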

Algorithm

To split a string with multiple delimiters −

  • Import the re module.
  • Call the re.split() function, e.g. re.split(r',|-', my_str).
  • re.split() splits the string on every occurrence of any one of the delimiters.

Example 1

import re

# Split a string on 2 delimiters
my_str = 'one,two-three,four'
my_list = re.split(r',|-', my_str)  # split on comma or hyphen
print(my_list)

Output

['one', 'two', 'three', 'four']

Code Explanation

The re.split() function takes a pattern and a string and splits the string at each match of the pattern.

The pipe symbol | means OR: the pattern A|B matches either A or B. Example 1 uses a comma and a hyphen as the delimiters, and Example 2 uses a comma, a hyphen, and a colon. You can chain as many alternatives with | as you need. Alternatively, you can list single-character delimiters inside square brackets [], which match any one of the enclosed characters.
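The square-bracket alternative mentioned above can be sketched as follows. The sample strings are illustrative; the second call adds a + quantifier, which is an extra not used elsewhere in this article.

```python
import re

my_str = 'one,two-three:four'

# [,:-] matches any single comma, colon, or hyphen.
# The hyphen is placed last so it is read literally, not as a range.
parts = re.split(r'[,:-]', my_str)
print(parts)       # ['one', 'two', 'three', 'four']

# Adding + treats a run of adjacent delimiters as one separator,
# which avoids empty strings between repeated delimiters.
parts_run = re.split(r'[,:-]+', 'one,,two--three')
print(parts_run)   # ['one', 'two', 'three']
```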

Example 2

import re

# Split a string on 3 delimiters
my_str_2 = 'one,two-three:four'
my_list_2 = re.split(r',|-|:', my_str_2)  # comma, hyphen or colon
print(my_list_2)

Output

['one', 'two', 'three', 'four']
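When the delimiters include characters that have a special meaning in regular expressions (such as | or .), writing the pattern by hand is error-prone. One possible approach, sketched here with a hypothetical delimiter list, is to escape each delimiter with re.escape() and join the results with |.

```python
import re

# Hypothetical delimiter list; '|' and '.' are special in regex syntax.
delimiters = [',', '-', ':', '|', '.']

# re.escape() makes each delimiter match literally; '|' joins the alternatives.
pattern = '|'.join(re.escape(d) for d in delimiters)

parts = re.split(pattern, 'one,two-three:four|five.six')
print(parts)   # ['one', 'two', 'three', 'four', 'five', 'six']
```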


An alternative approach is to use the str.replace() method

The built-in string method str.replace() returns a copy of the string in which every occurrence of one substring has been replaced with another. The built-in string method str.split() divides a string into a list.

To split a string with multiple delimiters −

  • Use the str.replace() method to replace the first delimiter with the second.
  • Use the str.split() method to split the string by the second delimiter.

Example 1

# Store the string in my_str_2
my_str_2 = 'one_two!three_four'

# Replace every '_' with '!', then split on '!'
my_list = my_str_2.replace('_', '!').split('!')
print(my_list)

Output

['one', 'two', 'three', 'four']

Code Explanation

No import is needed, because replace() and split() are built-in string methods. The string is stored in a variable named my_str_2. We first replace every occurrence of the first delimiter (_) with the second (!), and then split on the second delimiter. The str.replace() method returns a copy of the string with all occurrences of a substring replaced by the provided replacement, so my_str_2 itself is unchanged. Printing my_list shows the desired result.
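The replace-then-split idea extends to any number of delimiters. The helper below is a sketch only: split_on_any is a hypothetical name, not a built-in, and it assumes the delimiters are single characters (multi-character delimiters that overlap could interfere with each other).

```python
def split_on_any(text, delimiters):
    """Split text on any of the given delimiters using only str methods."""
    if not delimiters:
        return [text]
    target = delimiters[0]
    # Rewrite every other delimiter to the first one...
    for delimiter in delimiters[1:]:
        text = text.replace(delimiter, target)
    # ...then a single split on that delimiter is enough.
    return text.split(target)


print(split_on_any('one_two!three_four', ['!', '_']))
# ['one', 'two', 'three', 'four']
```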

Example 2

You can also avoid the re module altogether. If regular expressions feel daunting, the same result can be achieved with string methods alone.

The example below splits a Python string on multiple delimiters by first replacing every delimiter with a single, unified delimiter and then splitting the new string on that delimiter. Let's look at this −

# Store the string in sample_string
sample_string = 'Hey! thanks for visiting, Tutorialspoint!'

# Replace '!' and ';' with ',' so that only one delimiter remains
new_string = sample_string.replace('!', ',').replace(';', ',')
split_string = new_string.split(',')
print(split_string)

Output

['Hey', ' thanks for visiting', ' Tutorialspoint', '']
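The output above keeps the leading spaces and ends with an empty string, because the sample text finishes with a delimiter. If that is not wanted, one possible clean-up step, sketched here under the assumption that empty pieces carry no meaning, is to strip each piece and drop the empty ones.

```python
sample_string = 'Hey! thanks for visiting, Tutorialspoint!'
split_string = sample_string.replace('!', ',').replace(';', ',').split(',')

# strip() removes surrounding whitespace; the "if" filter drops pieces
# that are empty once stripped (such as the one after the final '!').
cleaned = [part.strip() for part in split_string if part.strip()]
print(cleaned)   # ['Hey', 'thanks for visiting', 'Tutorialspoint']
```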

Conclusion

This article showed how to split a Python string on several delimiters. The re.split() function from the built-in re module is the most direct way, since it lets you describe all the delimiters in a single regular expression. For a small number of fixed delimiters, chaining str.replace() and then calling str.split() achieves the same result without regular expressions.

Updated on: 02-Nov-2023
