Positions of Large Groups in Python
A common string problem is to find the positions of groups of consecutive identical characters that are 3 or more characters long. For example, in the string "abbxxxxzyy", the groups are "a", "bb", "xxxx", "z", and "yy", where only "xxxx" qualifies as a large group.
Problem Understanding
Given a string of lowercase letters, we need to identify large groups (3+ consecutive identical characters) and return each one as a [start, end] pair of 0-based, inclusive indices, ordered by starting position.
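To make the grouping concrete, here is a minimal sketch that splits a string into runs of identical characters and reports the length of each run. The helper name run_lengths is hypothetical and used only for illustration.

```python
from itertools import groupby

def run_lengths(s):
    """Return (character, run length) pairs for each run of identical characters."""
    # run_lengths is a hypothetical helper name, introduced only for this sketch.
    # sum(1 for _ in group) counts the items in each group iterator.
    return [(char, sum(1 for _ in group)) for char, group in groupby(s)]

print(run_lengths("abbxxxxzyy"))
# [('a', 1), ('b', 2), ('x', 4), ('z', 1), ('y', 2)]
```

Only the run of four "x" characters reaches the threshold of 3, which matches the example above.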
Approach
We'll use Python's itertools.groupby() to group consecutive identical characters, then check whether each group has 3 or more characters:
from itertools import groupby

def largeGroupPositions(s):
    ans = []
    position = 0  # index where the current group starts
    for char, group in groupby(s):
        group_list = list(group)
        group_size = len(group_list)
        if group_size >= 3:
            start = position
            end = position + group_size - 1
            ans.append([start, end])
        position += group_size
    return ans

# Test with a sample string
result = largeGroupPositions("abcdddeeeeaabbbcd")
print(result)
[[3, 5], [6, 9], [12, 14]]
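The version above stores every group in a list just to measure its length. As a sketch of one possible variation, the group iterator can be counted without storing it. The function name large_group_positions_lazy and the min_size parameter are assumptions made for this example, not part of the original solution.

```python
from itertools import groupby

def large_group_positions_lazy(s, min_size=3):
    """Find [start, end] ranges of runs that are at least min_size long."""
    # Hypothetical name and parameter, used only for illustration.
    ans = []
    position = 0  # index where the current group starts
    for _, group in groupby(s):
        # Count the group's items one by one instead of building a list.
        size = sum(1 for _ in group)
        if size >= min_size:
            ans.append([position, position + size - 1])
        position += size
    return ans

print(large_group_positions_lazy("abcdddeeeeaabbbcd"))
# [[3, 5], [6, 9], [12, 14]]
```

This keeps the groupby() structure while avoiding the temporary list for each group.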
How It Works
Let's trace through the string "abcdddeeeeaabbbcd":
from itertools import groupby

def largeGroupPositions(s):
    ans = []
    position = 0
    print(f"Processing string: '{s}'")
    print("Character | Group Size | Positions | Large Group?")
    print("-" * 50)
    for char, group in groupby(s):
        group_list = list(group)
        group_size = len(group_list)
        start = position
        end = position + group_size - 1
        is_large = group_size >= 3
        print(f" {char} | {group_size} | [{start}, {end}] | {is_large}")
        if is_large:
            ans.append([start, end])
        position += group_size
    return ans

result = largeGroupPositions("abcdddeeeeaabbbcd")
print(f"\nLarge groups: {result}")
Processing string: 'abcdddeeeeaabbbcd'
Character | Group Size | Positions | Large Group?
--------------------------------------------------
a | 1 | [0, 0] | False
b | 1 | [1, 1] | False
c | 1 | [2, 2] | False
d | 3 | [3, 5] | True
e | 4 | [6, 9] | True
a | 2 | [10, 11] | False
b | 3 | [12, 14] | True
c | 1 | [15, 15] | False
d | 1 | [16, 16] | False
Large groups: [[3, 5], [6, 9], [12, 14]]
Alternative Approach Without groupby()
We can also solve this using a simple loop to track consecutive characters:
def largeGroupPositions(s):
    if not s:
        return []
    ans = []
    start = 0
    for i in range(1, len(s) + 1):
        # Check if we've reached end or found a different character
        if i == len(s) or s[i] != s[start]:
            group_size = i - start
            if group_size >= 3:
                ans.append([start, i - 1])
            start = i
    return ans

# Test with different examples
test_cases = ["abcdddeeeeaabbbcd", "abbxxxxzyy", "abc", "aaabbbbccc"]
for test in test_cases:
    result = largeGroupPositions(test)
    print(f"Input: '{test}' -> Output: {result}")
Input: 'abcdddeeeeaabbbcd' -> Output: [[3, 5], [6, 9], [12, 14]]
Input: 'abbxxxxzyy' -> Output: [[3, 6]]
Input: 'abc' -> Output: []
Input: 'aaabbbbccc' -> Output: [[0, 2], [3, 6], [7, 9]]
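Since both approaches should return identical results, one way to gain confidence is to compare them on many random strings. The sketch below is an illustrative check, not part of the original solution. The names with_groupby and with_loop are hypothetical, compact restatements of the two functions above.

```python
import random
from itertools import groupby

def with_groupby(s):
    # Compact restatement of the groupby() approach (hypothetical name).
    ans, position = [], 0
    for _, group in groupby(s):
        size = len(list(group))
        if size >= 3:
            ans.append([position, position + size - 1])
        position += size
    return ans

def with_loop(s):
    # Compact restatement of the loop approach (hypothetical name).
    ans, start = [], 0
    for i in range(1, len(s) + 1):
        if i == len(s) or s[i] != s[start]:
            if i - start >= 3:
                ans.append([start, i - 1])
            start = i
    return ans

random.seed(0)  # fixed seed so the check is repeatable
for _ in range(1000):
    # A two-letter alphabet makes long runs likely in short strings.
    length = random.randint(0, 20)
    s = "".join(random.choice("ab") for _ in range(length))
    assert with_groupby(s) == with_loop(s), s

print("Both approaches agree on all random inputs")
```

A small alphabet is used on purpose: with 26 letters, runs of 3 or more would be rare and the check would mostly compare empty results.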
Comparison
| Method | Time Complexity | Space Complexity | Readability |
|---|---|---|---|
| groupby() | O(n) | O(n) | High |
| Two pointers | O(n) | O(1) | Medium |
Conclusion
Use itertools.groupby() when readability matters most. The two-pointer approach avoids building a temporary list for each group, so it needs only constant extra space beyond the result list, at the cost of slightly more index bookkeeping.
