Python Program to Split joined consecutive similar characters
When working with strings containing consecutive similar characters, we often need to split them into groups. Python's groupby function from the itertools module provides an efficient way to group consecutive identical characters.
Syntax
The groupby() function groups consecutive equal elements from an iterable:
itertools.groupby(iterable, key=None)
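The optional key argument changes what counts as "similar". As a minimal sketch (the helper name split_runs is hypothetical, introduced only for illustration), passing key=str.lower makes the grouping case-insensitive:

```python
from itertools import groupby


def split_runs(text, key=None):
    """Split text into runs of consecutive characters that share the same key.

    split_runs is a hypothetical helper, not part of itertools.
    """
    return ["".join(group) for _, group in groupby(text, key=key)]


# Default: characters must be exactly equal to fall into the same run.
print(split_runs("aAAbBc"))

# key=str.lower: upper- and lower-case forms of a letter share one run.
print(split_runs("aAAbBc", key=str.lower))
```

With key=None, each element is compared to its neighbour directly, which is the behaviour used in the rest of this article.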
Example
Let's split a string with consecutive similar characters into separate groups:
from itertools import groupby
my_string = 'pppyyytthhhhhhhoooooonnn'
print("The string is:")
print(my_string)
my_result = ["".join(grp) for elem, grp in groupby(my_string)]
print("The result is:")
print(my_result)
The string is:
pppyyytthhhhhhhoooooonnn
The result is:
['ppp', 'yyy', 'tt', 'hhhhhhh', 'oooooo', 'nnn']
How It Works
The groupby() function groups consecutive identical characters. For each group, it returns a key (the character) and an iterator over the grouped elements. We use join() to reconstruct each group as a string:
from itertools import groupby
text = 'aaabbcccc'
groups = []
for key, group in groupby(text):
    grouped_chars = "".join(group)
    print(f"Key: {key}, Group: {grouped_chars}")
    groups.append(grouped_chars)
print("Final result:", groups)
Key: a, Group: aaa
Key: b, Group: bb
Key: c, Group: cccc
Final result: ['aaa', 'bb', 'cccc']
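Because each group is an iterator, it can be consumed in ways other than join(). The following sketch (run_lengths is a hypothetical helper name) counts the elements of each group instead of joining them, producing (character, count) pairs:

```python
from itertools import groupby


def run_lengths(text):
    """Return (character, run length) pairs for consecutive identical characters.

    run_lengths is a hypothetical helper used only for illustration.
    """
    # sum(1 for _ in group) consumes the group iterator and counts its items.
    return [(key, sum(1 for _ in group)) for key, group in groupby(text)]


print(run_lengths('aaabbcccc'))
```

Each group should be consumed inside the loop or comprehension that produces it, since groupby() shares a single underlying iterator across all groups.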
Alternative Approach Using Regular Expressions
A regular expression with a backreference can also match runs of consecutive identical characters:
import re
my_string = 'pppyyytthhhhhhhoooooonnn'
print("Original string:", my_string)
# Match each run: one character followed by zero or more repeats of it.
# re.finditer() is used because re.findall() would return only the
# captured character, not the whole run.
pattern = r'(.)\1*'
result = [match.group(0) for match in re.finditer(pattern, my_string)]
print("Split groups:", result)
Original string: pppyyytthhhhhhhoooooonnn
Split groups: ['ppp', 'yyy', 'tt', 'hhhhhhh', 'oooooo', 'nnn']
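One detail to watch with this pattern: when a pattern contains a capturing group, re.findall() returns the contents of the group rather than the full match. The short sketch below (variable names are illustrative) contrasts that with taking group(0) from re.finditer(), which gives the complete run:

```python
import re

text = 'aaabbcccc'
pattern = r'(.)\1*'  # (.) captures one character, \1* matches its repeats

# findall() returns only what the capturing group (.) matched.
captured = re.findall(pattern, text)

# group(0) is the entire match, i.e. the whole run of identical characters.
full_runs = [match.group(0) for match in re.finditer(pattern, text)]

print("findall:", captured)
print("finditer:", full_runs)
```

Note also that . does not match a newline by default, so passing flags=re.DOTALL may be needed if the input can contain line breaks.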
Comparison
| Method | Pros | Cons |
|---|---|---|
| groupby() | Simple, memory efficient | Requires import |
| Regular expressions | Powerful pattern matching | More complex syntax |
Conclusion
Use itertools.groupby() to efficiently split strings with consecutive similar characters into groups. This approach is memory-efficient and works well for most string processing tasks involving consecutive character patterns.
