Python program to split a string by the given list of strings


In this article, we will explore how to split a string in Python using a given list of strings, and build a small program that does so step by step. Splitting on a list of substrings that is only known at run time comes up often in text processing and data parsing, and handling it in a single function keeps the calling code simple.

Approach and Algorithm

To solve the problem of splitting a string by a given list of strings, we can follow a systematic approach that involves iterating over the string and checking for the occurrence of each substring in the list. Here's a high-level overview of the algorithm we'll use −

  • Initialize an empty result list to store the split parts of the string.

  • Iterate over the characters of the string.

  • Check if the current position matches the starting character of any substring in the list.

  • If a match is found, check if the subsequent characters match the corresponding substring.

  • If a complete match is found, add the substring to the result list as the start of a new part and move the current position past it.

  • If no match is found, append the current character to the last part in the result list, or start the first part with it if the list is still empty.

  • Repeat the matching steps above until the entire string has been processed.

  • Return the final result list containing the split parts of the string.

With this approach, the string is cut at the start of every match, and each matched substring stays at the beginning of the part it opens. In the next section, we'll look at a Python implementation of this algorithm.

Implementation in Python

Now that we have a clear understanding of the approach and algorithm, let's implement it in Python. The function below follows the steps above one by one, and each part of the code is explained after the listing. You can write the program in your favorite text editor or Python IDE.

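def split_string_by_list(string, substrings):
    result = []  # parts of the string collected so far
    i = 0  # current position in the string
    while i < len(string):
        match = False
        for substring in substrings:
            # Ignore empty substrings: they match everywhere and never advance i.
            if substring and string[i:i + len(substring)] == substring:
                result.append(substring)  # the match starts a new part
                i += len(substring)
                match = True
                break
        if not match:
            if result:
                result[-1] += string[i]  # extend the current part
            else:
                result.append(string[i])  # first character starts the first part
            i += 1
    return result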

Let's break down the code and explain each step −

  • We define a function called split_string_by_list that takes two arguments: string (the input string to be split) and substrings (the list of substrings to split the string by).

  • We initialize an empty list result to store the split parts of the string.

  • We initialize a variable i to keep track of the current position while iterating over the string.

  • We start a while loop that continues until we have processed the entire string.

  • Within the loop, we initialize a boolean variable match to track if a match is found for the current position.

  • We iterate over each substring in the substrings list.

  • We check whether the slice of string that starts at position i and has the same length as the current substring is equal to that substring.

  • If a match is found, we add the substring to the result list, update the current position (i) by adding the length of the substring, set match to True, and break out of the inner loop.

  • If no match is found, we check if the result list is not empty.

  • If the result list is not empty, we append the current character to the last part in the result list.

  • If the result list is empty, we create a new part in the result list containing only the current character.

  • In both no-match cases, we then increment the current position (i) by 1 to move to the next character in the string.

  • Once the entire string has been processed, we return the final result list containing the split parts of the string.

Now that we have the implementation ready, in the next section, we will showcase some example usages and test cases to demonstrate the functionality of our Python program.

Example Usage and Test Cases

To ensure the correctness and effectiveness of our Python program, let's explore some example usages and test cases. We will provide sample input strings along with the expected output after splitting. This will help us understand how our program handles different scenarios.

Let's consider the following examples −

Example

string = "Hello, world! This is a sample string."
substrings = ["world", "sample"]
output = split_string_by_list(string, substrings)
print(output)

Output

['Hello, ', 'world! This is a ', 'sample string.']

In this example, the program splits the input string "Hello, world! This is a sample string." into three parts: "Hello, ", "world! This is a ", and "sample string.". The string is cut at the start of each occurrence of "world" and "sample", and each matched substring is kept at the beginning of the part that follows the cut.

Example

string = "OpenAI is revolutionizing the field of artificial intelligence."
substrings = ["Open", "revolutionizing", "intelligence"]
output = split_string_by_list(string, substrings)
print(output)

Output

['OpenAI is ', 'revolutionizing the field of artificial ', 'intelligence.']

In this example, the input string "OpenAI is revolutionizing the field of artificial intelligence." begins with a match ("Open"), so there is no text before the first cut. The program returns three parts: "OpenAI is ", "revolutionizing the field of artificial ", and "intelligence.". Each part starts with one of the substrings "Open", "revolutionizing", and "intelligence".
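If the matched substrings themselves should be discarded, so that only the text between them remains, one option is re.split() with a pattern built from the list. The following is a minimal sketch under that assumption; split_dropping_delimiters is a hypothetical helper name, not part of the program above, and dropping empty parts is a design choice made here for illustration.

```python
import re

def split_dropping_delimiters(string, substrings):
    # Hypothetical helper: returns only the text between matches.
    # Longest substrings go first so a longer delimiter wins over its own prefix.
    ordered = sorted((s for s in substrings if s), key=len, reverse=True)
    if not ordered:
        return [string] if string else []
    # re.escape keeps characters such as "." or "*" literal.
    pattern = "|".join(re.escape(s) for s in ordered)
    # re.split can produce empty strings at the edges; filter them out.
    return [part for part in re.split(pattern, string) if part]

print(split_dropping_delimiters(
    "Hello, world! This is a sample string.", ["world", "sample"]))
# ['Hello, ', '! This is a ', ' string.']
```

Note that the "!" after "world" is kept, because only the listed substrings are removed.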

Different Methods to Split a String in Python

Python provides several built-in methods and techniques to split a string based on various delimiters or patterns. While our focus in this article is on splitting a string by a given list of strings, let's briefly explore other methods for string splitting in Python.

Splitting by a Single Delimiter

The most common way to split a string in Python is the built-in split() method. Called without arguments, it splits on runs of whitespace and ignores leading and trailing whitespace. For example −

Example

string = "Hello, world! This is a sample string."
parts = string.split()  # Default split using whitespace
print(parts)

Output

['Hello,', 'world!', 'This', 'is', 'a', 'sample', 'string.']

In addition to the default behavior, you can pass a delimiter string to split on. For example, splitting on a comma followed by a space can be done as follows −

Example

string = "Apple, Banana, Orange"
fruits = string.split(", ")  # Split by comma followed by space
print(fruits)

Output

['Apple', 'Banana', 'Orange']
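Two details of split() are easy to miss: an explicit separator behaves differently from the default on repeated separators, and the optional maxsplit argument limits how many splits are made. A short sketch with made-up sample strings −

```python
text = "red  green,blue"  # note the two spaces after "red"

# With no argument, a run of whitespace counts as one separator.
by_whitespace = text.split()
print(by_whitespace)  # ['red', 'green,blue']

# With an explicit separator, every occurrence splits, so empty strings can appear.
by_space = text.split(" ")
print(by_space)  # ['red', '', 'green,blue']

# maxsplit limits the number of splits, counted from the left.
first_only = "a,b,c".split(",", 1)
print(first_only)  # ['a', 'b,c']
```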

Splitting by Regular Expressions

Python's re module can split a string on a regular expression. The re.split() function takes a regex pattern as the delimiter, which gives you more control over the splitting criteria than a fixed string does. For example −

Example

import re

string = "The quick brown fox jumps over the lazy dog."
words = re.split(r"\W+", string)  # Split by non-word characters
print(words)

Output

['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog', '']

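In this example, the string is split using the regex pattern \W+, which matches one or more non-word characters. Because the string ends with a period, which is itself a non-word character, the result ends with an empty string.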
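re.split() can also return the separators: if the pattern contains a capturing group, the text matched by the group is included in the result. The sketch below uses a made-up sample string to show this, along with one way to drop empty strings from the result −

```python
import re

text = "one, two; three."

# Plain pattern: the separators are discarded.
plain = re.split(r"[,;.]\s*", text)
print(plain)  # ['one', 'two', 'three', '']

# Capturing group: each matched separator is returned as its own item.
with_separators = re.split(r"([,;.])\s*", text)
print(with_separators)  # ['one', ',', 'two', ';', 'three', '.', '']

# Filtering out empty strings removes the trailing ''.
non_empty = [part for part in plain if part]
print(non_empty)  # ['one', 'two', 'three']
```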

Conclusion

In this article, we explored the process of splitting a string by a given list of strings using Python. We started with the problem and where it comes up, outlined an algorithm for it, implemented that algorithm as a Python function, and ran it on sample inputs.

In addition, we briefly explored other methods for string splitting in Python, such as splitting by a single character or using regular expressions. Understanding these different techniques provides you with a wide range of options when it comes to manipulating and extracting information from strings.

Updated on: 10-Aug-2023
