Word Frequency - Problem

Imagine you're a text analyst tasked with understanding the most common words in a document. Your job is to create a bash script that counts how many times each word appears in a text file called words.txt.

The output should display each unique word followed by its frequency count, sorted by frequency in descending order. For words with the same frequency, sort them alphabetically.

Example: If words.txt contains:
the quick brown fox jumps over the lazy dog the fox

Your script should output:
the 3
fox 2
brown 1
dog 1
jumps 1
lazy 1
over 1
quick 1

Constraints:

  • Input contains only lowercase letters and spaces
  • Words are separated by one or more whitespace characters
  • Output format: word frequency (space-separated)
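A standard Unix pipeline solves this in one line. The sketch below assumes GNU coreutils; the `grep -v '^$'` guard drops empty tokens that would appear if a line began with a space:

```shell
# Use the example input from the problem statement
printf 'the quick brown fox jumps over the lazy dog the fox\n' > words.txt

# Split words onto lines, count duplicates, then sort by
# frequency (descending) with the word itself as tiebreaker.
tr -s ' ' '\n' < words.txt | grep -v '^$' | sort | uniq -c \
  | sort -k1,1nr -k2,2 | awk '{print $2, $1}'
```

Here `tr -s` squeezes runs of spaces into single newlines, `uniq -c` prefixes each word with its count, and the second `sort` orders numerically on the count (reversed) and alphabetically on the word for ties.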

Input & Output

example_1.txt - Basic word counting
$ Input: words.txt contains: the quick brown fox jumps over the lazy dog
› Output: the 2 brown 1 dog 1 fox 1 jumps 1 lazy 1 over 1 quick 1
💡 Note: The word 'the' appears twice; all other words appear once. Results are sorted by frequency (descending), then alphabetically.
example_2.txt - Multiple spaces handling
$ Input: words.txt contains: hello world hello universe world
› Output: hello 2 world 2 universe 1
💡 Note: Multiple consecutive spaces are handled correctly. Words with the same frequency (hello, world) are sorted alphabetically.
example_3.txt - Single word file
$ Input: words.txt contains: test
› Output: test 1
💡 Note: Edge case: a file containing only one word should output that word with count 1.

Visualization

๐Ÿ“ Input Text Processing"the quick brown fox jumps over the lazy dog the fox"Hash Tablethe: 3fox: 2quick: 1brown: 1...Sorting Process1. By frequency โ†“2. Then alphabeticallythe(3) > fox(2) > others(1)For freq=1: brown, dog,jumps, lazy, over, quickFinal Outputthe 3fox 2brown 1dog 1...โšก Key Performance InsightSingle pass O(n) + Sort O(k log k) = Optimal Solution!
Understanding the Visualization

1. Initialize Counter: Create an empty associative array to track word frequencies.
2. Single Pass Processing: Read each word and increment its counter in the hash table.
3. Sort Results: Sort by frequency (descending), then alphabetically for ties.
4. Output Report: Display each word with its count in the specified format.
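The four steps above can be sketched as a short awk program (awk splits fields on runs of whitespace, so multiple spaces are handled for free):

```shell
# Use the example input from the problem statement
printf 'the quick brown fox jumps over the lazy dog the fox\n' > words.txt

awk '
  { for (i = 1; i <= NF; i++) count[$i]++ }   # steps 1-2: single-pass counting
  END { for (w in count) print w, count[w] }  # step 4: emit "word count" pairs
' words.txt | sort -k2,2nr -k1,1              # step 3: frequency desc, word asc
```

awk emits the pairs in arbitrary hash order, so the trailing `sort` produces the required deterministic ordering.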
Key Takeaway
🎯 Key Insight: Using associative arrays (hash tables) enables efficient single-pass counting, avoiding the need to repeatedly scan the input file. This turns a naive O(n²) approach into an O(n + k log k) solution.
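The same hash-table idea can be written in pure bash with an associative array (a sketch; requires bash 4+ for `declare -A`, with the final ordering delegated to `sort`):

```shell
#!/usr/bin/env bash
# Use the example input from the problem statement
printf 'the quick brown fox jumps over the lazy dog the fox\n' > words.txt

declare -A count                          # hash table: word -> frequency
while read -r -a line; do                 # read splits each line on whitespace
  for w in "${line[@]}"; do
    (( count[$w]++ ))                     # O(1) increment per word
  done
done < words.txt

for w in "${!count[@]}"; do
  printf '%s %d\n' "$w" "${count[$w]}"
done | sort -k2,2nr -k1,1                 # frequency desc, then alphabetical
```

The loop is the O(n) single pass from the insight above; sorting the k unique words at the end contributes the O(k log k) term.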

Time & Space Complexity

Time Complexity
⏱️ O(n + k log k)

O(n) to read and count the words, plus O(k log k) to sort the k unique words; linearithmic overall, and effectively linear in the input size when k log k is small relative to n.

⚡ Linearithmic
Space Complexity
O(k)

Space for the associative array storing the k unique words and their counts.

✓ Linear Space

Constraints

  • Input file contains only lowercase letters (a-z) and space characters
  • Words are separated by one or more whitespace characters
  • File size can be up to 10^6 characters
  • Output format: Each line should contain 'word count' separated by a single space
  • Words with same frequency should be sorted alphabetically