Reservoir Sampling - Problem

Reservoir sampling is a classic algorithm used to randomly select k items from a stream of data where the total number of items is unknown or very large. The challenge is maintaining a uniform probability for each item to be selected without knowing how many items will come next.

Your task is to implement a reservoir sampling algorithm that processes a stream of integers and returns exactly k randomly selected items with equal probability for each item in the original stream.

The algorithm should work as follows:

  • Initialize a reservoir array of size k with the first k items from the stream
  • For each subsequent item (index i ≥ k), generate a random integer j between 0 and i (inclusive)
  • If j is less than k, replace the reservoir item at index j with the new item; otherwise discard the new item

Note: For testing purposes, use a simple random number generator based on a seed to ensure reproducible results.
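The steps above can be sketched in Python as follows. This is a minimal illustration, not the reference solution: the problem does not specify its seeded generator, so `random.Random(seed)` is an assumption, and the `seed` and `rand_index` parameters are hypothetical names introduced here. Outputs from this sketch will therefore not necessarily match the expected outputs in the examples unless the draws are scripted.

```python
import random
from typing import Callable, Iterable, List, Optional


def reservoir_sample(
    stream: Iterable[int],
    k: int,
    seed: int = 42,
    rand_index: Optional[Callable[[int], int]] = None,
) -> List[int]:
    """Return k items from stream, each kept with equal probability."""
    rng = random.Random(seed)  # assumption: stands in for the unspecified seeded RNG
    if rand_index is None:
        # randint is inclusive on both ends, so this yields a value in [0, i]
        rand_index = lambda i: rng.randint(0, i)

    reservoir: List[int] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill phase: the first k items go straight into the reservoir
            reservoir.append(item)
        else:
            # Replacement phase: item i enters with probability k / (i + 1)
            j = rand_index(i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Scripted draws 2, 4, 2 for items 4, 5, 6 make the run deterministic
    draws = iter([2, 4, 2])
    print(reservoir_sample([1, 2, 3, 4, 5, 6], 3, rand_index=lambda i: next(draws)))
    # prints [1, 2, 6]
```

Passing the index source as a function keeps the sampling logic separate from the generator, so a test can replay a fixed sequence of draws while normal use falls back to the seeded generator.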

Input & Output

Example 1 — Basic Stream
Input: stream = [1,2,3,4,5,6], k = 3
Output: [1,2,6]
💡 Note: The reservoir starts as [1,2,3]. For item 4 (index 3), a random index in [0,3] is drawn; if it is less than 3, item 4 replaces the reservoir entry at that index. Items 5 and 6 are handled the same way with ranges [0,4] and [0,5]. With draws of 2, 4, and 2, the reservoir becomes [1,2,4], stays [1,2,4], and ends as [1,2,6].
Example 2 — Small k
Input: stream = [10,20,30,40,50], k = 2
Output: [10,50]
💡 Note: Selects 2 items from a 5-element stream. The reservoir starts as [10,20], then the remaining items 30, 40, and 50 are processed in order. An item at 0-based index i enters the reservoir with probability k/(i+1), replacing one of the existing entries.
Example 3 — k equals stream length
Input: stream = [7,8,9], k = 3
Output: [7,8,9]
💡 Note: When k equals the stream length, the initial fill consumes the whole stream, so no replacement step runs and every element is returned.
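The replacement probability used in these examples is what makes the final sample uniform. A short derivation, using n for the total stream length and t for the 1-indexed position of an item (both are notation assumed here, not part of the problem statement):

```latex
% Case t > k: the item enters with probability k/t, then must survive each later step s.
% At step s the new item is accepted with probability k/s and hits a given slot with probability 1/k.
\Pr[\text{item } t \text{ in final reservoir}]
  = \frac{k}{t} \prod_{s=t+1}^{n} \left(1 - \frac{k}{s}\cdot\frac{1}{k}\right)
  = \frac{k}{t} \prod_{s=t+1}^{n} \frac{s-1}{s}
  = \frac{k}{t}\cdot\frac{t}{n}
  = \frac{k}{n}

% Case t <= k: the item starts in the reservoir and only needs to survive steps k+1, ..., n.
\Pr[\text{item } t \text{ in final reservoir}]
  = \prod_{s=k+1}^{n} \frac{s-1}{s}
  = \frac{k}{n}
```

For Example 1 this gives 3/6 = 1/2 for every item, regardless of where it appears in the stream.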

Constraints

  • 1 ≤ stream.length ≤ 10^4
  • 1 ≤ k ≤ stream.length
  • -10^6 ≤ stream[i] ≤ 10^6

Visualization

Figure: Reservoir Sampling, Dynamic Probability Selection (TutorialsPoint)

  • Input: stream = [1,2,3,4,5,6], k = 3. The stream length is unknown in advance and 3 random samples are needed.
  • Fill reservoir: [1,2,3]
  • Item 4: rand(0,3) = 2, replace
  • Item 5: rand(0,4) = 4, skip
  • Item 6: rand(0,5) = 2, replace
  • Result: final sample [1,2,6], 3 items selected. Each original item had an equal 3/6 = 50% chance of being selected.

The replacement probability decreases as the stream length increases.

💡 Key Insight: Reservoir sampling maintains uniform selection probability by replacing items with decreasing probability (k/i) as the stream progresses, requiring only O(k) space.
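The claim that every item has the same 3/6 chance of being selected can be checked empirically. The sketch below repeats the sampling many times and reports how often each item ends up in the reservoir. The helper name `selection_frequencies`, the trial count, and the use of Python's `random.Random` are assumptions made for illustration.

```python
import random
from collections import Counter


def reservoir_sample(stream, k, rng):
    """Keep k items from stream, each with equal probability."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            j = rng.randint(0, i)  # inclusive range [0, i]
            if j < k:
                reservoir[j] = item
    return reservoir


def selection_frequencies(stream, k, trials=20_000, seed=0):
    """Fraction of trials in which each item ends up in the reservoir."""
    rng = random.Random(seed)  # one seeded generator shared across all trials
    counts = Counter()
    for _ in range(trials):
        counts.update(reservoir_sample(stream, k, rng))
    return {item: counts[item] / trials for item in stream}


if __name__ == "__main__":
    for item, freq in selection_frequencies([1, 2, 3, 4, 5, 6], k=3).items():
        print(item, round(freq, 3))  # each value should be close to 3/6 = 0.5
```

A simulation like this does not prove uniformity, but a frequency far from k/n for any item is a quick sign of an off-by-one error in the random range.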