Most Common Word - Problem
You're building a word frequency analyzer for text processing! Given a paragraph of text and a list of banned words, your task is to find the most frequently occurring word that isn't on the banned list.
The challenge involves:
- Cleaning the text by removing punctuation and converting to lowercase
- Counting word frequencies while ignoring banned words
- Returning the word with the highest count
For example, in the paragraph "Bob hit a ball, the hit BALL flew far after it was hit." with banned words ["hit"], the word "ball" appears twice and is the most frequent non-banned word.
Note: It's guaranteed that there's at least one non-banned word, and the answer is unique.
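The three steps above can be sketched in a few lines of Python. This is a minimal sketch, not a reference solution: the function name `most_common_word` is hypothetical, and it assumes that every non-letter character (including the apostrophe) acts as a word separator.

```python
import re
from collections import Counter


def most_common_word(paragraph, banned):
    """Return the most frequent word in paragraph that is not banned."""
    banned_set = set(banned)  # set membership is O(1) on average
    # Assumption: a word is a maximal run of letters, so every other
    # character (punctuation or space) is treated as a separator.
    words = re.findall(r"[a-z]+", paragraph.lower())
    counts = Counter(w for w in words if w not in banned_set)
    # The problem guarantees at least one non-banned word and a unique answer.
    return counts.most_common(1)[0][0]
```

Calling `most_common_word("Bob hit a ball, the hit BALL flew far after it was hit.", ["hit"])` returns `"ball"` under these assumptions.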
Input & Output
example_1.py - Basic Case
Input:
paragraph = "Bob hit a ball, the hit BALL flew far after it was hit.", banned = ["hit"]
Output:
"ball"
Note:
After removing punctuation and converting to lowercase, we have words: ["bob", "hit", "a", "ball", "the", "hit", "ball", "flew", "far", "after", "it", "was", "hit"]. Excluding "hit" (banned), "ball" appears 2 times, which is the maximum frequency.
example_2.py - Multiple Banned Words
Input:
paragraph = "a, a, a, a, b,b,b,c, c", banned = ["a"]
Output:
"b"
Note:
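After lowercasing and splitting on punctuation and spaces (note that "b,b,b,c" has no spaces after its commas, so punctuation must act as a separator): ["a", "a", "a", "a", "b", "b", "b", "c", "c"]. Excluding "a" (banned), "b" appears 3 times and "c" appears 2 times, so "b" is the most frequent.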
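This input is a useful check on the cleaning step, because "b,b,b,c" contains commas with no following space. The sketch below contrasts deleting punctuation with replacing it by a space. The variable names are illustrative only, and the punctuation set is taken from the symbols listed in the constraints.

```python
import re

paragraph = "a, a, a, a, b,b,b,c, c"

# Deleting punctuation outright glues "b,b,b,c" into a single token.
stripped = re.sub(r"[!?',;.]", "", paragraph.lower()).split()

# Replacing punctuation with a space keeps the words apart.
separated = re.sub(r"[!?',;.]", " ", paragraph.lower()).split()

print(stripped)   # ['a', 'a', 'a', 'a', 'bbbc', 'c']
print(separated)  # ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c']
```

With the first approach, "b" is never counted at all, so treating punctuation as a separator appears to be the safer reading of "removing punctuation".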
example_3.py - Case Insensitive
Input:
paragraph = "Bob. hIt, baLl", banned = ["bob", "hit"]
Output:
"ball"
Note:
After converting to lowercase and removing punctuation: ["bob", "hit", "ball"]. Both "bob" and "hit" are banned, leaving only "ball" as the valid answer.
Constraints
- 1 ≤ paragraph.length ≤ 1000
- paragraph consists of English letters, space ' ', or one of the symbols: "!?',;."
- 0 ≤ banned.length ≤ 100
- 1 ≤ banned[i].length ≤ 10
- banned[i] consists of only lowercase English letters
- There is at least one word in paragraph that is not banned
- The answer is unique
Visualization
Understanding the Visualization
1. Clean the Data: Remove punctuation and convert to lowercase, like standardizing ballot formats.
2. Count Valid Votes: Use a hash table to count each valid (non-banned) word, like tallying votes in real time.
3. Track the Leader: Keep track of the word with the highest count as we process, like updating election results live.
4. Declare Winner: Return the most frequent valid word, like announcing the election winner.
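The four steps can be folded into one loop that updates the leader as each word is counted. The following is a sketch under stated assumptions rather than a definitive implementation: the name `most_common_word_single_pass` is hypothetical, and words are assumed to be maximal runs of letters.

```python
import re


def most_common_word_single_pass(paragraph, banned):
    """Count non-banned words and track the current leader in one pass."""
    banned_set = set(banned)
    counts = {}
    leader, leader_count = "", 0
    # Step 1: lowercase, then walk over runs of letters (assumed word shape).
    for match in re.finditer(r"[a-z]+", paragraph.lower()):
        word = match.group()
        if word in banned_set:
            continue  # banned words are never counted
        # Step 2: tally the vote in the hash table.
        counts[word] = counts.get(word, 0) + 1
        # Step 3: update the leader when this word pulls strictly ahead.
        if counts[word] > leader_count:
            leader, leader_count = word, counts[word]
    # Step 4: declare the winner.
    return leader
```

Tracking the leader inside the loop avoids a second scan over the table at the end. Because the answer is guaranteed to be unique, the strict `>` comparison is sufficient.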
Key Takeaway
Key Insight: Use a hash table to count frequencies in a single pass - this transforms an O(n²) nested loop problem into an optimal O(n) solution by leveraging O(1) average-case hash table operations.
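For contrast, the nested-loop baseline that the key insight refers to might look like the sketch below. The function name `most_common_word_quadratic` is hypothetical, and the tokenization (runs of letters) is an assumption. The point is that `words.count(w)` rescans the whole list for every word, which is where the quadratic cost comes from.

```python
import re


def most_common_word_quadratic(paragraph, banned):
    """Brute-force baseline: recount every word by rescanning the list."""
    words = re.findall(r"[a-z]+", paragraph.lower())
    best, best_count = "", 0
    for w in words:                 # outer loop over n words
        if w in banned:             # linear scan of the banned list
            continue
        occurrences = words.count(w)  # inner O(n) rescan of all words
        if occurrences > best_count:
            best, best_count = w, occurrences
    return best
```

A hash table removes the inner rescan: each word is looked up and incremented in O(1) on average, so the whole paragraph is processed in one pass.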