Positions of Large Groups in Python

Sometimes we need to find the positions of groups of consecutive identical characters in a string, keeping only the groups that contain 3 or more characters. For example, in the string "abbxxxxzyy", the groups are "a", "bb", "xxxx", "z", and "yy", and only "xxxx" (positions 3 to 6) qualifies as a large group.

Problem Understanding

Given a string of lowercase letters, we need to identify the large groups (3 or more consecutive identical characters) and return the position of each one as a [start, end] pair, where both indices are 0-based and inclusive.

Approach

We'll use Python's itertools.groupby() to group consecutive identical characters, then check whether each group has 3 or more characters:

from itertools import groupby

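def largeGroupPositions(s):
    ans = []
    position = 0  # index where the current group starts

    # groupby() yields one (character, iterator) pair per run of equal characters
    for char, group in groupby(s):
        group_list = list(group)
        group_size = len(group_list)

        # Record the inclusive [start, end] range of every large group
        if group_size >= 3:
            start = position
            end = position + group_size - 1
            ans.append([start, end])

        # Move to the start of the next group
        position += group_size

    return ans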

# Test with a sample string
result = largeGroupPositions("abcdddeeeeaabbbcd")
print(result)

Output:

[[3, 5], [6, 9], [12, 14]]
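
To make the grouping step concrete, the following sketch lists every run that groupby() produces for the introductory string "abbxxxxzyy" and labels each one as large or small. The helper name describe_groups is hypothetical and exists only for this illustration.

```python
from itertools import groupby

def describe_groups(s):
    """Return (char, size, start, end) for each run of identical characters."""
    runs = []
    position = 0  # start index of the current run
    for char, group in groupby(s):
        size = len(list(group))
        runs.append((char, size, position, position + size - 1))
        position += size
    return runs

# Print each run with its inclusive index range and whether it is large
for char, size, start, end in describe_groups("abbxxxxzyy"):
    label = "large" if size >= 3 else "small"
    print(repr(char * size), [start, end], label)
```

Only the run "xxxx" is labeled large, which matches the single range [3, 6] expected for this string.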

How It Works

Let's trace through the string "abcdddeeeeaabbbcd":

from itertools import groupby

def largeGroupPositions(s):
    ans = []
    position = 0
    
    print(f"Processing string: '{s}'")
    print("Character | Group Size | Positions | Large Group?")
    print("-" * 50)
    
    for char, group in groupby(s):
        group_list = list(group)
        group_size = len(group_list)
        
        start = position
        end = position + group_size - 1
        is_large = group_size >= 3
        
        print(f"    {char}     |     {group_size}      | [{start}, {end}]    |    {is_large}")
        
        if is_large:
            ans.append([start, end])
        
        position += group_size
    
    return ans

result = largeGroupPositions("abcdddeeeeaabbbcd")
print(f"\nLarge groups: {result}")

Output:

Processing string: 'abcdddeeeeaabbbcd'
Character | Group Size | Positions | Large Group?
--------------------------------------------------
    a     |     1      | [0, 0]    |    False
    b     |     1      | [1, 1]    |    False
    c     |     1      | [2, 2]    |    False
    d     |     3      | [3, 5]    |    True
    e     |     4      | [6, 9]    |    True
    a     |     2      | [10, 11]    |    False
    b     |     3      | [12, 14]    |    True
    c     |     1      | [15, 15]    |    False
    d     |     1      | [16, 16]    |    False

Large groups: [[3, 5], [6, 9], [12, 14]]

Alternative Approach Without groupby()

We can also solve this with a simple loop that tracks where the current run of identical characters started:

def largeGroupPositions(s):
    if not s:
        return []
    
    ans = []
    start = 0
    
    for i in range(1, len(s) + 1):
        # Check if we've reached end or found a different character
        if i == len(s) or s[i] != s[start]:
            group_size = i - start
            if group_size >= 3:
                ans.append([start, i - 1])
            start = i
    
    return ans

# Test with different examples
test_cases = ["abcdddeeeeaabbbcd", "abbxxxxzyy", "abc", "aaabbbbccc"]

for test in test_cases:
    result = largeGroupPositions(test)
    print(f"Input: '{test}' -> Output: {result}")

Output:

Input: 'abcdddeeeeaabbbcd' -> Output: [[3, 5], [6, 9], [12, 14]]
Input: 'abbxxxxzyy' -> Output: [[3, 6]]
Input: 'abc' -> Output: []
Input: 'aaabbbbccc' -> Output: [[0, 2], [3, 6], [7, 9]]
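
Since both approaches should return identical results, one way to gain confidence in them is to compare their output on many random strings. The sketch below is only an illustration: the names positions_groupby, positions_loop, and cross_check are hypothetical, and the two functions are compact restatements of the versions shown in this article (the loop version drops the explicit empty-string guard because the loop body simply never runs for an empty string).

```python
import random
from itertools import groupby

def positions_groupby(s):
    # Compact restatement of the groupby() version
    ans, position = [], 0
    for _, group in groupby(s):
        size = len(list(group))
        if size >= 3:
            ans.append([position, position + size - 1])
        position += size
    return ans

def positions_loop(s):
    # Compact restatement of the loop-based version
    ans, start = [], 0
    for i in range(1, len(s) + 1):
        # A run ends at the end of the string or when the character changes
        if i == len(s) or s[i] != s[start]:
            if i - start >= 3:
                ans.append([start, i - 1])
            start = i
    return ans

def cross_check(trials=1000, seed=0):
    # A small alphabet makes long runs likely, so large groups show up often
    rng = random.Random(seed)
    for _ in range(trials):
        length = rng.randint(0, 20)
        s = "".join(rng.choice("abc") for _ in range(length))
        assert positions_groupby(s) == positions_loop(s), s
    return True

print(cross_check())
```

If the two functions ever disagreed, the assertion would fail and report the offending string.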

Comparison

Method       | Time Complexity | Space Complexity | Readability
groupby()    | O(n)            | O(n)             | High
Two pointers | O(n)            | O(1)             | Medium
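
The extra space used by the groupby() version comes from list(group), which copies each run into a list before measuring it. A possible variation, sketched below under the hypothetical name large_group_positions_lean, counts the items of each run with a generator expression instead, so no intermediate list is built. It is an optional tweak for illustration, not a replacement for the versions above.

```python
from itertools import groupby

def large_group_positions_lean(s):
    ans = []
    position = 0  # start index of the current run
    for _, group in groupby(s):
        # Count the run by consuming the iterator, without storing its items
        size = sum(1 for _ in group)
        if size >= 3:
            ans.append([position, position + size - 1])
        position += size
    return ans

print(large_group_positions_lean("abcdddeeeeaabbbcd"))
# [[3, 5], [6, 9], [12, 14]]
```

This keeps the readability of groupby() while using constant extra space apart from the result list.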

Conclusion

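Use itertools.groupby() when readability matters most, since it expresses the grouping of consecutive characters directly. The two-pointer approach does not copy each group into a list, so it needs only constant extra space beyond the result, at the cost of slightly more index bookkeeping.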

Updated on: 2026-03-25T08:51:41+05:30
