Program to count k-length substrings that occur more than once in a given string in Python

Sometimes we need to count how many k-length substrings appear more than once in a given string. This is useful for pattern analysis and string processing tasks.

So, if the input is like s = "xxxyyy", k = 2, then the output will be 2 because the substrings "xx" and "yy" each occur twice. Occurrences may overlap: "xx" starts at both index 0 and index 1.

Algorithm Steps

To solve this, we will follow these steps −

  • Create a list to store all k-length substrings
  • For i in range 0 to size of s - k, do
    • Extract substring of s [from index i to i + k - 1]
    • Add substring to the list
  • Count occurrences of each unique substring
  • Return count of substrings that appear more than once

Using Collections Counter

A concise approach uses Python's Counter class to count substring occurrences −

from collections import Counter

def count_repeated_substrings(s, k):
    substrings = []
    
    # Extract all k-length substrings
    for i in range(len(s) - k + 1):
        substring = s[i:i + k]
        substrings.append(substring)
    
    # Count occurrences of each substring
    counter = Counter(substrings)
    
    # Count substrings that occur more than once
    return sum(1 for count in counter.values() if count > 1)

# Test the function
result = count_repeated_substrings("xxxyyy", 2)
print(f"Number of repeated substrings: {result}")

Output

Number of repeated substrings: 2
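
The intermediate list is not strictly required, because Counter accepts any iterable. The sketch below is a compact variant of the same idea that feeds the slices in through a generator expression. The function name count_repeated_substrings_compact and the early return for a non-positive k are illustrative assumptions, not part of the original program.

```python
from collections import Counter

def count_repeated_substrings_compact(s, k):
    # Assumption: a non-positive k is treated as "no substrings" and yields 0
    if k <= 0:
        return 0
    # Counter consumes the generator directly, so no intermediate list is built
    counts = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    # Count the distinct substrings seen more than once
    return sum(1 for c in counts.values() if c > 1)

print(count_repeated_substrings_compact("xxxyyy", 2))  # 2
```

When k is larger than the length of s, the range is empty and the function returns 0, which matches the behavior of the version above.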

Step-by-Step Example

Let's trace through the example "xxxyyy" with k = 2 −

from collections import Counter

s = "xxxyyy"
k = 2

# Extract all 2-length substrings
substrings = []
for i in range(len(s) - k + 1):
    substring = s[i:i + k]
    substrings.append(substring)
    print(f"Position {i}: '{substring}'")

print(f"\nAll substrings: {substrings}")

# Count occurrences
counter = Counter(substrings)
print(f"Substring counts: {dict(counter)}")

# Find repeated substrings
repeated_count = sum(1 for count in counter.values() if count > 1)
print(f"Number of repeated substrings: {repeated_count}")

Output

Position 0: 'xx'
Position 1: 'xx'
Position 2: 'xy'
Position 3: 'yy'
Position 4: 'yy'

All substrings: ['xx', 'xx', 'xy', 'yy', 'yy']
Substring counts: {'xx': 2, 'xy': 1, 'yy': 2}
Number of repeated substrings: 2
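
The trace shows that the counts already identify which substrings repeat, not only how many. If the substrings themselves are wanted, a small variation can return them. This is a minimal sketch; the helper name repeated_substrings is hypothetical, and sorting the result is only a choice made here to keep the output order predictable.

```python
from collections import Counter

def repeated_substrings(s, k):
    # Hypothetical helper: returns the repeated substrings, not just their number
    counts = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    # Keep only substrings seen more than once, sorted for a stable order
    return sorted(sub for sub, c in counts.items() if c > 1)

print(repeated_substrings("xxxyyy", 2))  # ['xx', 'yy']
```

The length of the returned list equals the count computed in the trace.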

Alternative Approach Using Dictionary

You can also solve this using a regular dictionary instead of Counter −

def count_repeated_substrings_dict(s, k):
    substring_count = {}
    
    # Extract and count k-length substrings
    for i in range(len(s) - k + 1):
        substring = s[i:i + k]
        substring_count[substring] = substring_count.get(substring, 0) + 1
    
    # Count substrings that occur more than once
    repeated_count = 0
    for count in substring_count.values():
        if count > 1:
            repeated_count += 1
    
    return repeated_count

# Test with different examples
test_cases = [
    ("xxxyyy", 2),
    ("abcabc", 3),
    ("aaaa", 2),
    ("abcd", 2)
]

for string, k in test_cases:
    result = count_repeated_substrings_dict(string, k)
    print(f"String: '{string}', k={k} → Repeated substrings: {result}")

Output

String: 'xxxyyy', k=2 → Repeated substrings: 2
String: 'abcabc', k=3 → Repeated substrings: 1
String: 'aaaa', k=2 → Repeated substrings: 1
String: 'abcd', k=2 → Repeated substrings: 0
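
Since only the "more than once" distinction matters, exact counts are not needed. One possible variation, sketched below under that observation, tracks two sets instead of a count per substring. The function name count_repeated_substrings_sets is an assumption made for illustration.

```python
def count_repeated_substrings_sets(s, k):
    seen = set()      # substrings encountered at least once
    repeated = set()  # substrings encountered at least twice

    for i in range(len(s) - k + 1):
        sub = s[i:i + k]
        if sub in seen:
            repeated.add(sub)
        else:
            seen.add(sub)

    return len(repeated)

for string, k in [("xxxyyy", 2), ("abcabc", 3), ("aaaa", 2), ("abcd", 2)]:
    print(string, k, count_repeated_substrings_sets(string, k))
```

This should give the same results as the dictionary version for the four test cases.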

Conclusion

Python's Counter class gives a concise way to count substrings, and a plain dictionary works equally well. Both approaches extract all k-length substrings (overlapping ones included), count their occurrences, and return how many distinct substrings appear more than once.

Updated on: 2026-03-25T11:06:33+05:30
