How to find the longest repetitive sequence in a string in Python?


To find the longest repetitive sequence in a string in Python, that is, the longest run of consecutive identical characters, we can use the following approach:

Iterate through each character of the string and compare it with the next character.

If they are the same, we increase a counter variable and continue comparing with the next character.

If they are not the same, we check if the counter is greater than the length of the longest repetitive sequence we have found so far. If it is, we update the longest repetitive sequence.

Reset the counter to 1 and continue with the next character in the string. After the last character, compare the counter one final time, because the longest sequence may sit at the very end of the string.
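
The steps above can be sketched as a small function. The name longest_run and the (character, length) return value are illustrative choices rather than part of the examples that follow; treat it as a minimal sketch, not a definitive implementation.

```python
def longest_run(text):
    """Return (character, length) of the longest run of identical characters.

    Hypothetical helper for illustration; returns ("", 0) for an empty string.
    """
    if not text:
        return "", 0

    best_char, best_len = "", 0
    count = 1  # length of the run currently being scanned

    for i in range(len(text) - 1):
        if text[i] == text[i + 1]:
            count += 1
        else:
            # The run ended at index i; keep it if it is the longest so far.
            if count > best_len:
                best_char, best_len = text[i], count
            count = 1

    # The last run never reaches the else branch, so check it after the loop.
    if count > best_len:
        best_char, best_len = text[-1], count

    return best_char, best_len


char, length = longest_run("abbbbcddeeeeee")
print(char * length)  # eeeeee
```

Because the comparison uses a strict greater-than, the first of several equally long runs is the one reported.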

Here are several code examples with step-by-step explanations:

Using a loop to find the longest repetitive sequence in a string

Example

We start by initializing the longest_sequence, sequence, and prev_char variables to empty strings.

We iterate through each character in the string using a for loop.

If the current character is the same as the previous character, we add it to the sequence string.

If the current character is different from the previous character, we check if the length of the sequence string is greater than the length of the longest_sequence string we have found so far. If it is, we update the longest_sequence string.

We then reset the sequence string to the current character and continue with the next character in the string.

After the loop has completed, we check one final time if the length of the sequence string is greater than the length of the longest_sequence string we have found so far. If it is, we update the longest_sequence string with the sequence string.

Finally, we print out the longest_sequence string.

string = "abbbbcddeeeeee"
longest_sequence = ""
sequence = ""
prev_char = ""

for char in string:
    if char == prev_char:
        sequence += char
    else:
        if len(sequence) > len(longest_sequence):
            longest_sequence = sequence
        sequence = char

    prev_char = char

if len(sequence) > len(longest_sequence):
    longest_sequence = sequence

print("Longest repetitive sequence:", longest_sequence)

Output

Longest repetitive sequence: eeeeee

Using the groupby function from the itertools module

Example

We import the groupby function from the itertools module.

We define the string we want to check for the longest repetitive sequence.

We initialize the longest_sequence variable to an empty string.

We loop through the characters in the string using the groupby function, which groups consecutive characters together.

For each group of consecutive characters, we join them together into a sequence string.

If the length of the sequence string is greater than the length of the longest_sequence string we have found so far, we update the longest_sequence string.

After the loop has completed, we print out the longest_sequence string.

from itertools import groupby

string = "abbbbcddeeeeee"

longest_sequence = ""
for _, group in groupby(string):
    sequence = "".join(group)
    if len(sequence) > len(longest_sequence):
        longest_sequence = sequence

print("Longest repetitive sequence:", longest_sequence)

Output

Longest repetitive sequence: eeeeee
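
The same groupby idea can be written more compactly by handing the runs to max() with key=len. The function name longest_run_groupby is a hypothetical wrapper added here for illustration; the default="" argument is one possible way to handle an empty input string.

```python
from itertools import groupby


def longest_run_groupby(text):
    """Hypothetical helper: longest run of identical characters, or "" if text is empty."""
    # groupby yields (key, group) pairs; each group is one run of equal characters.
    runs = ("".join(group) for _, group in groupby(text))
    return max(runs, key=len, default="")


print(longest_run_groupby("abbbbcddeeeeee"))  # eeeeee
```

max() returns the first maximal item, so when two runs tie in length the earlier one is returned.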

Using nested loops

Example

The string s contains a series of repetitive sequences of increasing length.

We initialize an empty string max_seq to hold the longest repetitive sequence.

We use a nested loop to iterate over each character in s.

For each character, we start a new sequence (seq) containing that character.

We then iterate through the remaining characters in s, checking if each one matches the first character in our current sequence.

If it does, we add it to the sequence; if not, we break out of the loop.

After each sequence is completed, we check if it's longer than the current max_seq, and update max_seq accordingly.

Finally, we print the longest repetitive sequence.

s = "abbcccddddeeeeeffffff"
max_seq = ''
for i in range(len(s)):
    seq = s[i]
    for j in range(i+1, len(s)):
        if s[j] == s[i]:
            seq += s[j]
        else:
            break
    if len(seq) > len(max_seq):
        max_seq = seq
print(max_seq)

Output

ffffff
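
The nested loops above start a fresh scan at every index, so a long run is rescanned many times; a string made of one repeated character costs quadratic time. A possible refinement, sketched here with the hypothetical name longest_run_skip, jumps straight to the end of each run so every character is visited once.

```python
def longest_run_skip(s):
    """Hypothetical helper: longest run of identical characters, found in one pass."""
    max_seq = ""
    i = 0
    while i < len(s):
        # Advance j to the first index whose character differs from s[i].
        j = i + 1
        while j < len(s) and s[j] == s[i]:
            j += 1
        if j - i > len(max_seq):
            max_seq = s[i:j]
        i = j  # continue from the start of the next run
    return max_seq


print(longest_run_skip("abbcccddddeeeeeffffff"))  # ffffff
```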

Comparing adjacent substrings

Example

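This approach is similar to the previous example, but instead of comparing single characters, we compare whole substrings. It looks for a block that is immediately followed by an identical copy of itself and reports the repeating block, not the full run.

We use two nested loops to iterate over every substring s[i:j].

For each substring, we check whether the next j - i characters, s[j:j+(j-i)], are equal to it.

If they are, and the substring is longer than max_seq, we update max_seq.

s = "abcdee"
max_seq = ''
for i in range(len(s)):
    for j in range(i+1, len(s)):
        # Compare s[i:j] with the equally long block that follows it
        if len(s[i:j]) > len(max_seq) and s[j:j+(j-i)] == s[i:j]:
            max_seq = s[i:j]
print(max_seq)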

Output

e
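
To make the substring comparison concrete, here is a sketch under the assumption that the goal is the longest block immediately followed by an identical copy of itself. The function name longest_adjacent_repeat is hypothetical, and the second input string is an extra illustration that shows a multi-character block.

```python
def longest_adjacent_repeat(s):
    """Hypothetical helper: longest block s[i:j] immediately followed by a copy of itself."""
    best = ""
    for i in range(len(s)):
        for j in range(i + 1, len(s)):
            width = j - i
            # Only compare when this block would beat the current best.
            if width > len(best) and s[i:j] == s[j:j + width]:
                best = s[i:j]
    return best


print(longest_adjacent_repeat("abcdee"))    # e
print(longest_adjacent_repeat("xabcabcy"))  # abc
```

Note that this returns the repeating unit rather than the whole run: for "aaaa" the result is "aa", because "aa" is the longest block that is directly followed by itself.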

Using the split() function

Example

This approach uses the split() function to split the string into a list of runs of identical characters.

We first insert a delimiter ('$', assumed not to occur in the string) at every position where two adjacent characters differ.

We then split on the delimiter and find the longest piece in the list using the max() function with key=len.

Finally, we print the longest repetitive sequence.

s = "aabbbcddddeeeefffffff"
# Insert a '$' delimiter wherever two adjacent characters differ
# (assumes '$' does not occur in s).
marked = s[0] + ''.join(c if c == p else '$' + c for p, c in zip(s, s[1:]))
max_seq = max(marked.split('$'), key=len)
print(max_seq)

Output

fffffff

Updated on: 10-Aug-2023
