Short Encoding of Words - Problem

A valid encoding of an array of words is any reference string s and array of indices indices such that:

  • words.length == indices.length
  • The reference string s ends with the '#' character
  • For each index indices[i], the substring of s starting from indices[i] and up to (but not including) the next '#' character is equal to words[i]
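
The three conditions above can be checked mechanically. The following is a minimal sketch, not part of the original problem statement; the helper names `decode` and `is_valid_encoding` are hypothetical and chosen only for illustration.

```python
def decode(s: str, indices: list[int]) -> list[str]:
    """Read each word from its start index up to (not including) the next '#'."""
    decoded = []
    for start in indices:
        end = s.index("#", start)  # first '#' at or after start
        decoded.append(s[start:end])
    return decoded


def is_valid_encoding(words: list[str], s: str, indices: list[int]) -> bool:
    """Check the three conditions of a valid encoding (hypothetical helper)."""
    if len(words) != len(indices):
        return False
    if not s.endswith("#"):
        return False
    # Every index must point inside s, otherwise there is nothing to read.
    if any(i < 0 or i >= len(s) for i in indices):
        return False
    return decode(s, indices) == words


print(decode("time#bell#", [0, 2, 5]))
```

Because `s` is required to end with '#', every in-range index has a '#' at or after it, so `s.index` cannot fail once the earlier checks pass.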

Given an array of words, return the length of the shortest reference string s possible of any valid encoding of words.

Input & Output

Example 1 — Basic Suffix Removal
Input: words = ["time", "me", "bell"]
Output: 10
💡 Note: The word "me" is a suffix of "time", so we can encode it as part of "time". The encoding becomes "time#bell#" with length 10.
Example 2 — No Suffixes
Input: words = ["t"]
Output: 2
💡 Note: Only one word, so the encoding is "t#" with length 2.
Example 3 — Multiple Suffix Relationships
Input: words = ["time", "me", "e"]
Output: 5
💡 Note: Both "me" and "e" are suffixes of "time", so we only need to encode "time" as "time#" with length 5.

Constraints

  • 1 ≤ words.length ≤ 2000
  • 1 ≤ words[i].length ≤ 7
  • words[i] consists of only lowercase English letters

Visualization

Figure: TutorialsPoint - Short Encoding of Words | Hash Set Optimization

Input: words = ["time", "me", "bell"] at positions [0], [1], [2]. Note: "me" is a suffix of "time".

Algorithm Steps

  • Step 1, Add All to HashSet: Set = {time, me, bell}
  • Step 2, Remove Suffixes: for each word, remove all its suffixes from the set
  • Step 3, Process "time": its suffixes are "ime", "me", "e", so "me" is removed from the set
  • Step 4, Calculate Length: sum (word length + 1) over the remaining words. Remaining set: {time, bell}, so (4+1) + (4+1) = 10

Final Result

  • Reference string s: "time#bell#", length = 10 characters
  • "time" at index 0
  • "me" at index 2 ("me" is contained in "time")
  • "bell" at index 5
  • Output: 10

Key Insight: If word A is a suffix of word B, then A can be encoded within B's encoding. Using a HashSet, we remove all suffixes, keeping only words that aren't suffixes of others. Final length = sum of (word length + 1) for each remaining word ("+1" for the '#' delimiter).
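
The hash set steps from the visualization can be sketched in Python as follows. This is an illustrative sketch rather than a reference solution; the function names `minimum_length_encoding` and `build_encoding` are assumptions introduced here. Only proper suffixes (those shorter than the word itself) are discarded, so a word never removes itself from the set.

```python
def minimum_length_encoding(words: list[str]) -> int:
    """Length of the shortest reference string, using the hash set approach."""
    remaining = set(words)  # step 1: duplicates collapse automatically
    for word in words:
        # steps 2-3: drop every proper suffix of this word from the set
        for k in range(1, len(word)):
            remaining.discard(word[k:])
    # step 4: each surviving word costs its length plus one '#'
    return sum(len(w) + 1 for w in remaining)


def build_encoding(words: list[str]) -> tuple[str, list[int]]:
    """Also return one shortest reference string and the matching indices."""
    remaining = set(words)
    for word in words:
        for k in range(1, len(word)):
            remaining.discard(word[k:])
    # Keep surviving words in first-seen order so the result is deterministic.
    kept = [w for w in dict.fromkeys(words) if w in remaining]
    offsets: dict[str, int] = {}
    pos = 0
    for w in kept:
        for k in range(len(w)):
            # Record where each suffix of a kept word starts in s.
            offsets.setdefault(w[k:], pos + k)
        pos += len(w) + 1
    s = "".join(w + "#" for w in kept)
    return s, [offsets[w] for w in words]


print(minimum_length_encoding(["time", "me", "bell"]))  # 10
print(build_encoding(["time", "me", "bell"]))           # ('time#bell#', [0, 2, 5])
```

Under the stated constraints each word has at most 6 proper suffixes, so the loop performs at most 2000 × 6 set operations.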