Python - K length consecutive characters
Consecutive characters are those characters that appear one after the other. K-length consecutive characters mean the same character appearing k times consecutively. In this article, we will explore several methods to find such patterns using brute force, regular expressions, sliding windows, and NumPy arrays.
Using the Brute Force Method
The brute force approach checks every substring of length k to see whether all of its characters are the same:
- Initialize an empty list to store the results.
- Iterate over the n-k+1 possible starting positions in the string.
- Extract the k-length substring at each position.
- Build a set from the substring. If the set has exactly one element, all characters match and the substring is added to the results.
Example
The following function finds all k-length consecutive character sequences:
def find_consecutive_characters(string, k):
    n = len(string)
    result = []
    # Try every starting position that leaves room for k characters
    for i in range(n - k + 1):
        substring = string[i:i+k]
        # A set with one element means every character is identical
        if len(set(substring)) == 1:
            result.append(substring)
    return result
test_string = "aaabcedfffghikkk"
k = 3
print(f"Consecutive characters with length {k} are: {find_consecutive_characters(test_string, k)}")
Consecutive characters with length 3 are: ['aaa', 'fff', 'kkk']
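One behavior worth noting is that this function reports overlapping matches: a run longer than k produces one entry per starting position. The sketch below repeats the function so that it runs on its own, and the input strings are made up for illustration.

```python
def find_consecutive_characters(string, k):
    """Return every k-length window whose characters are all identical."""
    result = []
    for i in range(len(string) - k + 1):
        substring = string[i:i + k]
        if len(set(substring)) == 1:
            result.append(substring)
    return result

# A run of five 'a' characters contains three overlapping windows of length 3.
print(find_consecutive_characters("aaaaab", 3))  # ['aaa', 'aaa', 'aaa']

# When k exceeds the string length the range is empty, so nothing is returned.
print(find_consecutive_characters("aa", 3))  # []
```

If only one result per run is wanted, the caller would need to deduplicate or skip ahead after a match.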
Using Regular Expressions
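The standard-library re module supports backreferences, so a single regular expression can describe one character repeated k times:
import re
def find_consecutive_characters(string, k):
    # (.) captures one character as group 2; \2{k-1} requires k-1 more copies of it
    pattern = r"((.)\2{%d})" % (k - 1)
    # findall returns one (group 1, group 2) tuple per match
    result = re.findall(pattern, string)
    # Keep group 1, which holds the full k-character match
    result = [match[0] for match in result]
    return result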
test_string = "abdffghttpplihdf"
k = 2
print(f"Consecutive characters with length {k} are: {find_consecutive_characters(test_string, k)}")
Consecutive characters with length 2 are: ['ff', 'tt', 'pp']
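The pattern ((.)\2{k-1}) captures one character and then requires k-1 further copies of it, so each match is exactly k characters long. Because re.findall returns non-overlapping matches, a run longer than k is reported once for every full block of k characters rather than once per starting position. For example, "aaaa" with k = 2 yields two matches, where the brute force function returns three.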
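If overlapping matches are needed from the regex version, one option is to wrap the pattern in a lookahead so that the scan advances a single position at a time. The following is a minimal sketch of that idea; the function names `find_non_overlapping` and `find_overlapping` and the sample string are hypothetical and not part of the original example.

```python
import re

def find_non_overlapping(string, k):
    # Same pattern as above: each match consumes k characters.
    pattern = r"((.)\2{%d})" % (k - 1)
    return [m[0] for m in re.findall(pattern, string)]

def find_overlapping(string, k):
    # The lookahead (?=...) consumes no characters, so the next search
    # starts one position later and overlapping runs are all reported.
    pattern = r"(?=((.)\2{%d}))" % (k - 1)
    return [m[0] for m in re.findall(pattern, string)]

print(find_non_overlapping("aaaab", 2))  # ['aa', 'aa']
print(find_overlapping("aaaab", 2))      # ['aa', 'aa', 'aa']
```

The lookahead variant returns the same list as the brute force function for this input, at the cost of attempting a match at every position.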
Using Sliding Window Technique
The sliding window approach maintains a fixed-size window that moves through the string one character at a time, dropping the first character and appending the next one at each step:
def find_consecutive_characters(string, k):
    n = len(string)
    result = []
    # No window of length k exists if k is non-positive or longer than the string
    if k <= 0 or k > n:
        return result
    # Initialize the first window
    window = string[:k]
    if len(set(window)) == 1:
        result.append(window)
    # Slide the window through the rest of the string
    for i in range(k, n):
        window = window[1:] + string[i]
        if len(set(window)) == 1:
            result.append(window)
    return result
test_string = "xxxxangduuuu"
k = 4
print(f"Consecutive characters with length {k} are: {find_consecutive_characters(test_string, k)}")
Consecutive characters with length 4 are: ['xxxx', 'uuuu']
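The window above is rebuilt and re-checked at every step, which costs O(k) work per position. A different technique, tracking the length of the current run instead of keeping a window, avoids that re-check. The sketch below is an illustrative variant rather than the method used above, and the name `find_consecutive_characters_linear` is hypothetical.

```python
def find_consecutive_characters_linear(string, k):
    """Report each k-length uniform window by tracking the current run length."""
    result = []
    if k <= 0:
        return result
    run_length = 0
    for i, char in enumerate(string):
        # Extend the run if the character repeats, otherwise start a new run.
        if i > 0 and char == string[i - 1]:
            run_length += 1
        else:
            run_length = 1
        # A run of at least k means the k-length window ending here is uniform.
        if run_length >= k:
            result.append(char * k)
    return result

print(find_consecutive_characters_linear("xxxxangduuuu", 4))  # ['xxxx', 'uuuu']
print(find_consecutive_characters_linear("aaaaab", 3))  # ['aaa', 'aaa', 'aaa']
```

This performs one comparison per character, so the scan itself is O(n). Building each reported substring still costs O(k) per match.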
Using NumPy Library
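NumPy can treat the string as an array of byte values, and its sliding_window_view function (available from NumPy 1.20) exposes every k-length window without copying the data:
import numpy as np
def find_consecutive_characters(string, k):
    # Convert string to numpy array of bytes
    # (windows are taken over bytes, so this is only reliable for ASCII input)
    arr = np.frombuffer(string.encode(), dtype=np.uint8)
    # sliding_window_view raises ValueError if the window is longer than the array
    if k <= 0 or k > arr.size:
        return []
    # Create sliding windows
    windows = np.lib.stride_tricks.sliding_window_view(arr, k)
    # Find windows with only one unique character
    result = [window.tobytes().decode() for window in windows if np.unique(window).size == 1]
    return result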
test_string = "xxxxangduuuu"
k = 4
print(f"Consecutive characters with length {k} are: {find_consecutive_characters(test_string, k)}")
Consecutive characters with length 4 are: ['xxxx', 'uuuu']
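The function above still loops over the windows in Python and calls np.unique on each one. A more vectorized check compares every column of the window view with its first column in a single operation. The following is a sketch under the assumption that the input is ASCII, so that each character maps to exactly one byte; the name `find_consecutive_characters_vectorized` is hypothetical.

```python
import numpy as np

def find_consecutive_characters_vectorized(string, k):
    # No window of length k exists in these cases.
    if k <= 0 or k > len(string):
        return []
    # Assumption: ASCII input; encode("ascii") raises for anything else.
    arr = np.frombuffer(string.encode("ascii"), dtype=np.uint8)
    windows = np.lib.stride_tricks.sliding_window_view(arr, k)
    # A window is uniform when every element equals its first element.
    uniform = (windows == windows[:, :1]).all(axis=1)
    # Indices of uniform windows are the starting positions in the string.
    starts = np.flatnonzero(uniform)
    return [string[s:s + k] for s in starts]

print(find_consecutive_characters_vectorized("xxxxangduuuu", 4))  # ['xxxx', 'uuuu']
```

The comparison still touches n*k elements, but the work happens inside NumPy rather than in a Python loop.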
Performance Comparison
| Method | Time Complexity | Best For |
|---|---|---|
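| Brute Force | O(n*k) | Simple understanding |
| Regular Expressions | O(n) typical, O(n*k) worst case | Concise code, non-overlapping matches |
| Sliding Window | O(n*k) | Memory efficiency |
| NumPy | O(n*k log k) as written (np.unique sorts each window) | Data already held in NumPy arrays |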
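Asymptotic complexity does not show constant factors, so it can be useful to measure the methods directly. The sketch below times compact versions of the brute force and regex approaches with the standard timeit module. The function names, the sample data, and the repeat count are arbitrary choices for illustration, and the timings will vary by machine.

```python
import re
import timeit

def brute_force(string, k):
    return [string[i:i + k] for i in range(len(string) - k + 1)
            if len(set(string[i:i + k])) == 1]

def with_regex(string, k):
    pattern = r"((.)\2{%d})" % (k - 1)
    return [m[0] for m in re.findall(pattern, string)]

# Made-up sample data: runs of exactly three characters, repeated many times,
# so the overlapping and non-overlapping methods return the same list.
sample = "aaabcedfffghikkk" * 1000
k = 3

for func in (brute_force, with_regex):
    seconds = timeit.timeit(lambda: func(sample, k), number=20)
    print(f"{func.__name__}: {seconds:.4f} s for 20 runs")
```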
Conclusion
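Regular expressions give the most compact solution for finding k-length consecutive characters, but they report non-overlapping matches only. The brute force and sliding window versions are the easiest to follow, and both run in O(n*k) time. The NumPy version is most useful when the data is already in array form. As written it still loops over the windows in Python, so it is not guaranteed to be faster on large inputs.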
