Program to Find Out a Sequence with Equivalent Frequencies in Python
Finding the longest prefix of a list in which all numbers have equal frequencies after removing at most one element is a frequency-tracking problem. As the prefix grows, we track how the frequencies change and check whether a uniform distribution is reachable.
Problem Understanding
Given a list of numbers, we want the length of the longest prefix from which we can remove at most one element so that all remaining numbers appear the same number of times:
```python
# Example: [2, 4, 4, 7, 7, 6, 6]
# 2 appears once; 4, 7 and 6 each appear twice.
# Removing the single 2 leaves equal frequencies, so the whole list qualifies.
numbers = [2, 4, 4, 7, 7, 6, 6]
print(f"Input: {numbers}")
print("Frequencies: 2→1, 4→2, 7→2, 6→2")
print("Remove the 2 and all frequencies equal 2, answer = 7")
```

Output:

```
Input: [2, 4, 4, 7, 7, 6, 6]
Frequencies: 2→1, 4→2, 7→2, 6→2
Remove the 2 and all frequencies equal 2, answer = 7
```
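To make the problem statement concrete, here is a minimal brute-force sketch that checks each prefix by trying every possible single removal. The function names `can_equalize` and `longest_valid_prefix_bruteforce` are hypothetical and not part of the article's solution. It is far too slow for large inputs, but it is handy as a reference when testing a faster approach.

```python
from collections import Counter


def can_equalize(prefix):
    """Return True if removing at most one element leaves equal frequencies."""
    # Try removing nothing (skip=None), then each single position in turn.
    for skip in [None] + list(range(len(prefix))):
        kept = [x for j, x in enumerate(prefix) if j != skip]
        # Equal frequencies means at most one distinct count value.
        if len(set(Counter(kept).values())) <= 1:
            return True
    return False


def longest_valid_prefix_bruteforce(nums):
    """Check every prefix, longest first (hypothetical reference helper)."""
    for length in range(len(nums), 0, -1):
        if can_equalize(nums[:length]):
            return length
    return 0


print(longest_valid_prefix_bruteforce([2, 4, 4, 7, 7, 6, 6]))  # 7
print(longest_valid_prefix_bruteforce([1, 1, 1, 2, 3]))        # 4
```

In the second call the full list fails because no single removal balances frequencies 3, 1 and 1, while the prefix `[1, 1, 1, 2]` works once the 2 is removed.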
Algorithm Steps
The solution tracks three key data structures:
- num_freq: Maps each number to its current frequency
- freq_freq: Maps each frequency to how many numbers have that frequency
- diff_freq: Set of all distinct frequencies currently present
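As an illustration, the snapshot below shows what these three structures would hold for the example list. It is only a sketch: it rebuilds them from scratch with `collections.Counter`, whereas an efficient solution would update them incrementally as each element arrives.

```python
from collections import Counter

nums = [2, 4, 4, 7, 7, 6, 6]

# num_freq: number -> how many times it occurs
num_freq = Counter(nums)
# freq_freq: frequency -> how many distinct numbers occur that often
freq_freq = Counter(num_freq.values())
# diff_freq: the distinct frequencies present
diff_freq = set(freq_freq)

print(dict(num_freq))     # {2: 1, 4: 2, 7: 2, 6: 2}
print(dict(freq_freq))    # {1: 1, 2: 3}
print(sorted(diff_freq))  # [1, 2]
```

Here one number has frequency 1 and three numbers have frequency 2, so only two distinct frequencies exist.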
Complete Solution
```python
from collections import defaultdict


class Solution:
    def solve(self, nums):
        num_freq = defaultdict(int)   # number -> its frequency so far
        freq_freq = defaultdict(int)  # frequency -> how many numbers have it
        diff_freq = set()             # distinct frequencies currently present
        result = 0                    # an empty list has no valid prefix

        for i, num in enumerate(nums):
            # Get current frequency before increment
            cur_freq = num_freq[num]

            # Update frequency maps (a first occurrence has no old frequency)
            num_freq[num] += 1
            if cur_freq > 0:
                freq_freq[cur_freq] -= 1
                # Remove old frequency if no numbers have it anymore
                if freq_freq[cur_freq] == 0:
                    diff_freq.discard(cur_freq)
            freq_freq[cur_freq + 1] += 1

            # Add new frequency to set
            diff_freq.add(cur_freq + 1)

            # Case 1: All numbers have same frequency
            if len(diff_freq) == 1:
                result = i + 1
            # Case 2: Two different frequencies, check if removable
            elif len(diff_freq) == 2:
                lo, hi = sorted(diff_freq)
                # Either exactly one number occurs once (drop it entirely), or
                # exactly one number sits one above the rest (drop one copy)
                if ((lo == 1 and freq_freq[lo] == 1) or
                        (hi == lo + 1 and freq_freq[hi] == 1)):
                    result = i + 1
        return result


# Test with the example
solution = Solution()
numbers = [2, 4, 4, 7, 7, 6, 6]
result = solution.solve(numbers)
print(f"Longest valid sequence length: {result}")
```

Output:

```
Longest valid sequence length: 7
```
Step-by-Step Trace
Let's trace through a simpler example to understand the logic:

```python
from collections import defaultdict


def trace_solution(nums):
    num_freq = defaultdict(int)
    freq_freq = defaultdict(int)
    diff_freq = set()
    print(f"Processing: {nums}")
    print("-" * 40)
    for i, num in enumerate(nums):
        cur_freq = num_freq[num]
        num_freq[num] += 1
        if cur_freq > 0:
            freq_freq[cur_freq] -= 1
            if freq_freq[cur_freq] == 0:
                diff_freq.discard(cur_freq)
        freq_freq[cur_freq + 1] += 1
        diff_freq.add(cur_freq + 1)
        print(f"Step {i+1}: Added {num}")
        print(f"  Frequencies: {dict(num_freq)}")
        print(f"  Distinct freqs: {sorted(diff_freq)}")
        # Necessary but not sufficient: the counts must also be checked
        print(f"  At most two distinct freqs? {len(diff_freq) <= 2}")
        print()


# Test with a simple case
trace_solution([1, 2, 1])
```

Output:

```
Processing: [1, 2, 1]
----------------------------------------
Step 1: Added 1
  Frequencies: {1: 1}
  Distinct freqs: [1]
  At most two distinct freqs? True

Step 2: Added 2
  Frequencies: {1: 1, 2: 1}
  Distinct freqs: [1]
  At most two distinct freqs? True

Step 3: Added 1
  Frequencies: {1: 2, 2: 1}
  Distinct freqs: [1, 2]
  At most two distinct freqs? True

```
Key Conditions for Validity
A prefix is valid if we can achieve equal frequencies by removing at most one element. This happens when:
| Condition | Description | Example |
|---|---|---|
| One distinct frequency | All numbers appear the same number of times, so nothing needs removing | [1,1,2,2] → freq 2 |
| Two frequencies differ by 1, and exactly one number has the higher one | Remove one occurrence of that number | [1,1,1,2,2] → freqs 3,2 |
| Exactly one number appears once, and all others share one frequency | Remove that number entirely | [1,2,2,3,3] → remove 1 |
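The conditions in the table can be sketched as a single predicate over a frequency-of-frequencies map. The name `is_valid_distribution` is a hypothetical helper introduced here for illustration, and the sketch assumes the map's keys are frequencies and its values are counts of numbers at each frequency.

```python
def is_valid_distribution(freq_freq):
    """freq_freq maps a frequency to how many numbers occur that often."""
    # Ignore frequencies that no number currently has.
    freqs = sorted(f for f, count in freq_freq.items() if count > 0)
    if len(freqs) <= 1:
        return True   # already uniform (or empty)
    if len(freqs) > 2:
        return False  # one removal cannot merge three frequency levels
    lo, hi = freqs
    # Exactly one number occurs once: remove it entirely.
    if lo == 1 and freq_freq[lo] == 1:
        return True
    # Exactly one number sits one above the rest: remove one occurrence.
    return hi == lo + 1 and freq_freq[hi] == 1


print(is_valid_distribution({2: 2}))        # True,  e.g. [1,1,2,2]
print(is_valid_distribution({3: 1, 2: 1}))  # True,  e.g. [1,1,1,2,2]
print(is_valid_distribution({1: 1, 2: 2}))  # True,  e.g. [1,2,2,3,3]
print(is_valid_distribution({3: 1, 1: 2}))  # False, e.g. [1,1,1,2,3]
```

The last call shows why checking only the gap between the two frequencies is not enough: the counts at each frequency matter as well.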
Conclusion
This algorithm tracks frequency distributions in a single pass and identifies the longest valid prefix in O(n) time with O(n) extra space. The key insight is maintaining a frequency-of-frequencies map, which lets each step check in constant time whether an equal distribution is achievable by removing at most one element.
