Python – N sized substrings with K distinct characters
When working with strings, you might need to find all substrings of a specific length N that contain exactly K distinct characters. This can be done by moving through the string one position at a time and using Python's built-in set() to count the unique characters in each substring.
Syntax
The general approach involves:
for i in range(len(string) - n + 1):
    substring = string[i:i+n]
    if len(set(substring)) == k:
        result.append(substring)  # add to result
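The loop above can be wrapped in a reusable function. The following is a minimal sketch of one way to do that; the function name substrings_with_k_distinct and its parameter names are illustrative and do not come from any library.

```python
def substrings_with_k_distinct(text, n, k):
    """Return every length-n substring of text with exactly k distinct characters."""
    result = []
    # range() is empty when n > len(text), so the loop simply does not run
    for i in range(len(text) - n + 1):
        substring = text[i:i + n]
        if len(set(substring)) == k:
            result.append(substring)
    return result


print(substrings_with_k_distinct("Pythonisfun", 2, 2))
```

Returning a list instead of printing inside the loop makes the result easy to reuse or test.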
Example
Below is a demonstration that finds all 2-character substrings with exactly 2 distinct characters:
my_string = 'Pythonisfun'
print("The string is :")
print(my_string)
my_substring = 2
my_chars = 2
my_result = []
for idx in range(0, len(my_string) - my_substring + 1):
    if (len(set(my_string[idx: idx + my_substring])) == my_chars):
        my_result.append(my_string[idx: idx + my_substring])
print("The resultant string is :")
print(my_result)
Output
The string is :
Pythonisfun
The resultant string is :
['Py', 'yt', 'th', 'ho', 'on', 'ni', 'is', 'sf', 'fu', 'un']
How It Works
The algorithm iterates through each possible starting position in the string
For each position, it extracts a substring of length N
The set() function removes duplicate characters, so len(set(substring)) gives the count of distinct characters
If this count equals K, the substring is added to the result list
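To make these steps concrete, the sketch below prints each window of a short string together with its distinct-character count. The sample string "hello" and the variable names are chosen here purely for illustration.

```python
text = "hello"
n = 2  # substring length
k = 2  # distinct characters required

counts = []
for i in range(len(text) - n + 1):
    window = text[i:i + n]
    distinct = len(set(window))  # duplicates collapse inside the set
    counts.append((window, distinct))
    status = "keep" if distinct == k else "skip"
    print(f"{i}: {window!r} has {distinct} distinct character(s), {status}")
```

Only the window 'll' is skipped, because its two characters collapse to a single set element.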
Another Example
Finding 3-character substrings with exactly 3 distinct characters:
text = "programming"
n = 3 # substring length
k = 3 # distinct characters required
result = []
for i in range(len(text) - n + 1):
    substring = text[i:i+n]
    if len(set(substring)) == k:
        result.append(substring)
print(f"String: {text}")
print(f"3-character substrings with 3 distinct characters: {result}")
Output
String: programming
3-character substrings with 3 distinct characters: ['pro', 'rog', 'ogr', 'gra', 'ram', 'min', 'ing']
The substrings 'amm' and 'mmi' are excluded because each contains only 2 distinct characters.
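Both examples so far set K equal to N. As a further sketch, choosing K smaller than N selects only the windows that contain a repeated character. The list comprehension below is an equivalent, more compact form of the same loop, and the variable names are illustrative.

```python
text = "programming"
n = 3  # substring length
k = 2  # distinct characters required

# Keep each length-n window whose set of characters has exactly k elements
result = [text[i:i + n] for i in range(len(text) - n + 1)
          if len(set(text[i:i + n])) == k]
print(result)  # ['amm', 'mmi']
```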
Key Points
The set() function automatically handles duplicate character removal
The range calculation len(string) - n + 1 ensures we don't go beyond string bounds
Time complexity is O(L * N), where L is the string length and N is the substring length
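For long strings with a large N, rebuilding a set for every window repeats work. One alternative, which the examples above do not use, is a sliding window that keeps character counts in a collections.Counter and updates them as the window moves. The sketch below assumes this approach; the function name is hypothetical, and the slice taken for each matching window still costs time proportional to N.

```python
from collections import Counter


def substrings_with_k_distinct_sliding(text, n, k):
    """Sliding-window variant: update character counts instead of rebuilding a set."""
    if n <= 0 or n > len(text):
        return []
    counts = Counter(text[:n])  # character counts for the first window
    result = []
    if len(counts) == k:
        result.append(text[:n])
    for i in range(1, len(text) - n + 1):
        outgoing = text[i - 1]      # character leaving the window
        incoming = text[i + n - 1]  # character entering the window
        counts[outgoing] -= 1
        if counts[outgoing] == 0:
            del counts[outgoing]    # keep len(counts) equal to the distinct count
        counts[incoming] += 1
        if len(counts) == k:
            result.append(text[i:i + n])
    return result


print(substrings_with_k_distinct_sliding("programming", 3, 3))
```

For short strings the set-based loop is simpler and usually fast enough; the sliding window mainly pays off when N is large.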
Conclusion
This approach finds all N-sized substrings containing exactly K distinct characters by combining string slicing with Python's built-in set(). Building a set from each substring is a simple way to count its unique characters.
