Program to count number of distinct characters of every substring of a string in Python

Given a lowercase string, we need to find the sum, over all of its substrings, of the number of distinct characters in each substring. Here a character counts as distinct in a substring only if it appears exactly once in that substring. Because the sum can be large, the result is returned modulo 10^9 + 7.

Problem Understanding

For string s = "xxy", let's examine all substrings:

  • "x" (index 0): 1 distinct character

  • "x" (index 1): 1 distinct character

  • "y" (index 2): 1 distinct character

  • "xx" (indices 0-1): 0 distinct characters (x appears twice)

  • "xy" (indices 1-2): 2 distinct characters

  • "xxy" (indices 0-2): 1 distinct character (only y is distinct)

Total sum: 1 + 1 + 1 + 0 + 2 + 1 = 6
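
The enumeration above can be reproduced with a brute-force check. The sketch below is only a verification aid, not the solution developed in this article, and the function name brute_force_unique_sum is a hypothetical one chosen for illustration. It visits every substring and counts the characters that occur exactly once, which costs roughly O(n^3) time and is therefore practical only for short strings.

```python
from collections import Counter

MOD = 10 ** 9 + 7

def brute_force_unique_sum(s):
    """Sum, over all substrings of s, the number of characters occurring exactly once."""
    total = 0
    n = len(s)
    for start in range(n):
        for end in range(start + 1, n + 1):
            # Count each character inside the substring s[start:end]
            counts = Counter(s[start:end])
            total += sum(1 for c in counts.values() if c == 1)
    return total % MOD

print(brute_force_unique_sum("xxy"))  # 6
```

A checker like this is handy for comparing against a faster implementation on small inputs.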

Algorithm Approach

The key insight is to track character positions and calculate contributions efficiently. For each character, we maintain a list of positions where it appears, then calculate how many substrings contain exactly one occurrence of that character.
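
To make the counting argument concrete: suppose a character occurs at index p, its previous occurrence is at index l (or -1 if there is none), and its next occurrence is at index r (or the string length if there is none). A substring contains only that occurrence when it starts anywhere in l+1..p and ends anywhere in p..r-1, which gives (p - l) * (r - p) substrings. The following is a minimal sketch of that idea, assuming full position lists are stored per character; the name contribution_sum is hypothetical.

```python
from collections import defaultdict

MOD = 10 ** 9 + 7

def contribution_sum(s):
    """Add up, for every occurrence, the substrings in which it is the only one of its character."""
    positions = defaultdict(list)
    for i, ch in enumerate(s):
        positions[ch].append(i)

    total = 0
    n = len(s)
    for idx in positions.values():
        # Pad with sentinels: -1 before the first occurrence, n after the last
        padded = [-1] + idx + [n]
        for j in range(1, len(padded) - 1):
            left, mid, right = padded[j - 1], padded[j], padded[j + 1]
            total += (mid - left) * (right - mid)
    return total % MOD

print(contribution_sum("xxy"))  # 6
```

For "xxy" the three occurrences contribute 1, 2 and 3 substrings respectively, which matches the total of 6 from the enumeration.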

Implementation

class Solution:
    def solve(self, s):
        m = 10 ** 9 + 7
        prev_seen = {}
        ans = 0
        
        def util(i, symbol):
            nonlocal ans
            # Initialize with -1 if symbol not seen before
            prev = prev_seen.setdefault(symbol, [-1])
            prev.append(i)
            
            # When we have at least 3 positions, calculate contribution
            if len(prev) > 2:
                left = prev.pop(0)  # Remove oldest position
                middle, right = prev
                # Count substrings where symbol appears exactly once
                cnt = (middle - left) * (right - middle)
                ans = (ans + cnt) % m
        
        # Process each character in the string
        for i, symbol in enumerate(s):
            util(i, symbol)
        
        # Final pass with string length as end boundary
        for symbol in prev_seen:
            util(len(s), symbol)
        
        return ans

# Test the solution
solution = Solution()
s = "xxy"
result = solution.solve(s)
print(f"Input: {s}")
print(f"Output: {result}")

The output of the above code is:

Input: xxy
Output: 6

How It Works

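The algorithm counts the contribution of each character occurrence in a single pass:

  1. Position Tracking: For each character, keep its two most recent positions, starting from -1 as a placeholder before the first occurrence

  2. Contribution Calculation: Once three positions (left, middle, right) are known for a character, count the substrings in which the middle occurrence is that character's only instance, which is (middle - left) * (right - middle)

  3. Boundary Handling: Add the string length as a final right boundary for every character so that its last occurrence is also counted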
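
The steps above can be traced on a small input. The sketch below mirrors the bookkeeping of the solution but, instead of summing, yields each (symbol, left, middle, right, count) tuple as it is computed; the generator name trace_contributions is a hypothetical helper added for illustration.

```python
def trace_contributions(s):
    """Yield (symbol, left, middle, right, count) in the order contributions are computed."""
    last_two = {}  # symbol -> its most recent positions (starts as [-1])

    def step(i, symbol):
        prev = last_two.setdefault(symbol, [-1])
        prev.append(i)
        if len(prev) > 2:
            left = prev.pop(0)
            middle, right = prev
            return (symbol, left, middle, right, (middle - left) * (right - middle))
        return None

    # Main pass: a new occurrence closes the window of the previous one
    for i, symbol in enumerate(s):
        row = step(i, symbol)
        if row is not None:
            yield row

    # Final pass: len(s) closes the window of each character's last occurrence
    for symbol in last_two:
        row = step(len(s), symbol)
        if row is not None:
            yield row

for row in trace_contributions("xxy"):
    print(row)
# ('x', -1, 0, 1, 1)
# ('x', 0, 1, 3, 2)
# ('y', -1, 2, 3, 3)
```

The counts 1, 2 and 3 add up to 6, the expected result for "xxy".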

Example with Another String

solution = Solution()
test_cases = ["abc", "aab", "abab"]

for s in test_cases:
    result = solution.solve(s)
    print(f"String: '{s}' → Sum of distinct counts: {result}")

The output of the above code is:

String: 'abc' → Sum of distinct counts: 10
String: 'aab' → Sum of distinct counts: 6
String: 'abab' → Sum of distinct counts: 12

Time and Space Complexity

  • Time Complexity: O(n) where n is the string length

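  • Space Complexity: O(k) where k is the number of distinct characters (at most 26 for lowercase input), since only the two most recent positions of each character are kept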
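
As an illustration of how little state the computation needs, here is a sketch of a variant that replaces the dictionary with two fixed arrays of 26 entries. It assumes the input contains only the lowercase letters a to z, as the problem statement specifies; the function name unique_sum_fixed_arrays is hypothetical, and this is an alternative formulation rather than the implementation shown earlier.

```python
MOD = 10 ** 9 + 7

def unique_sum_fixed_arrays(s):
    """Same contribution idea, keeping only the two latest positions per letter."""
    last = [-1] * 26    # latest index of each letter, -1 if not seen yet
    before = [-1] * 26  # index of the occurrence before that, -1 if none
    total = 0

    for i, ch in enumerate(s):
        c = ord(ch) - ord("a")
        # Close the window of the previous occurrence; this adds 0 on first sight
        total += (last[c] - before[c]) * (i - last[c])
        before[c], last[c] = last[c], i

    n = len(s)
    for c in range(26):
        # Close the window of each letter's last occurrence; adds 0 for unseen letters
        total += (last[c] - before[c]) * (n - last[c])

    return total % MOD

print(unique_sum_fixed_arrays("xxy"))  # 6
```

The extra memory here is two arrays of 26 integers regardless of the string length, while the running time remains linear in n.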

Conclusion

This solution efficiently counts distinct characters across all substrings using position tracking and contribution calculation. The modulo operation handles large results, making it suitable for competitive programming scenarios.

Updated on: 2026-03-25T13:58:35+05:30
