Program to count k-length substrings that occur more than once in a given string in Python
Sometimes we need to count how many k-length substrings appear more than once in a given string. This is useful for pattern analysis and string processing tasks.
For example, if the input is s = "xxxyyy" and k = 2, the output is 2, because the substrings "xx" and "yy" each occur more than once. Overlapping occurrences count: in "xxx", "xx" starts at index 0 and again at index 1.
Algorithm Steps
To solve this, we will follow these steps −
- Create a list to store all k-length substrings
- For i in range 0 to size of s - k (inclusive), do
  - Extract the substring of s from index i to i + k - 1
  - Add the substring to the list
- Count the occurrences of each unique substring
- Return the number of unique substrings that appear more than once
Using Collections Counter
A concise approach uses the Counter class from Python's collections module to count substring occurrences −
from collections import Counter

def count_repeated_substrings(s, k):
    substrings = []
    # Extract all k-length substrings
    for i in range(len(s) - k + 1):
        substring = s[i:i + k]
        substrings.append(substring)
    # Count occurrences of each substring
    counter = Counter(substrings)
    # Count substrings that occur more than once
    return sum(1 for count in counter.values() if count > 1)

# Test the function
result = count_repeated_substrings("xxxyyy", 2)
print(f"Number of repeated substrings: {result}")
Number of repeated substrings: 2
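The intermediate list is optional. Counter accepts any iterable, so the slices can be passed to it directly from a generator expression. The following is a minimal sketch of that variant; the name count_repeated_substrings_compact is hypothetical and is not part of the example above.

```python
from collections import Counter

def count_repeated_substrings_compact(s, k):
    # Hypothetical compact variant: stream the k-length slices straight into Counter
    counts = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    # Count the distinct substrings that were seen more than once
    return sum(1 for count in counts.values() if count > 1)

print(count_repeated_substrings_compact("xxxyyy", 2))
```

This behaves the same as the list-based version and only avoids holding a separate list of slices before counting.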
Step-by-Step Example
Let's trace through the example "xxxyyy" with k = 2 −
from collections import Counter

s = "xxxyyy"
k = 2

# Extract all 2-length substrings
substrings = []
for i in range(len(s) - k + 1):
    substring = s[i:i + k]
    substrings.append(substring)
    print(f"Position {i}: '{substring}'")
print(f"\nAll substrings: {substrings}")

# Count occurrences
counter = Counter(substrings)
print(f"Substring counts: {dict(counter)}")

# Find repeated substrings
repeated_count = sum(1 for count in counter.values() if count > 1)
print(f"Number of repeated substrings: {repeated_count}")
Position 0: 'xx'
Position 1: 'xx'
Position 2: 'xy'
Position 3: 'yy'
Position 4: 'yy'
All substrings: ['xx', 'xx', 'xy', 'yy', 'yy']
Substring counts: {'xx': 2, 'xy': 1, 'yy': 2}
Number of repeated substrings: 2
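The trace shows that the counts already identify which substrings repeat, not only how many. As a small illustrative sketch, the same counts can be used to return the repeated substrings themselves; the helper name repeated_substrings is an assumption introduced here for illustration.

```python
from collections import Counter

def repeated_substrings(s, k):
    # Hypothetical helper: count every k-length slice of s
    counts = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    # Keep only the substrings seen more than once, sorted for stable output
    return sorted(sub for sub, count in counts.items() if count > 1)

print(repeated_substrings("xxxyyy", 2))  # ['xx', 'yy']
```

The length of the returned list equals the count computed in the trace above.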
Alternative Approach Using Dictionary
You can also solve this using a regular dictionary instead of Counter −
def count_repeated_substrings_dict(s, k):
    substring_count = {}
    # Extract and count k-length substrings
    for i in range(len(s) - k + 1):
        substring = s[i:i + k]
        substring_count[substring] = substring_count.get(substring, 0) + 1
    # Count substrings that occur more than once
    repeated_count = 0
    for count in substring_count.values():
        if count > 1:
            repeated_count += 1
    return repeated_count
# Test with different examples
test_cases = [
    ("xxxyyy", 2),
    ("abcabc", 3),
    ("aaaa", 2),
    ("abcd", 2)
]
for string, k in test_cases:
    result = count_repeated_substrings_dict(string, k)
    print(f"String: '{string}', k={k} → Repeated substrings: {result}")
String: 'xxxyyy', k=2 → Repeated substrings: 2
String: 'abcabc', k=3 → Repeated substrings: 1
String: 'aaaa', k=2 → Repeated substrings: 1
String: 'abcd', k=2 → Repeated substrings: 0
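The functions above do not validate k. When k is larger than the string, the range is empty and the result is 0, which is reasonable. When k is 0, every slice is the empty string, so a non-empty input would report 1. The sketch below shows one possible way to guard against that; treating a non-positive k as an error is an assumption made here, and the name count_repeated_substrings_checked is hypothetical.

```python
def count_repeated_substrings_checked(s, k):
    # Assumption: a non-positive k is treated as invalid input
    if k <= 0:
        raise ValueError("k must be a positive integer")
    counts = {}
    # range() is empty when k > len(s), so the function returns 0 in that case
    for i in range(len(s) - k + 1):
        sub = s[i:i + k]
        counts[sub] = counts.get(sub, 0) + 1
    # Count the distinct substrings that occur more than once
    return sum(1 for count in counts.values() if count > 1)

print(count_repeated_substrings_checked("xxxyyy", 2))  # 2
print(count_repeated_substrings_checked("abc", 5))     # 0
```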
Conclusion
Python's Counter class gives a concise way to count substrings, and a plain dictionary works equally well. Both versions extract all k-length substrings, count their occurrences, and return how many distinct substrings appear more than once.
