Reservoir Sampling - Problem
Reservoir sampling is a classic algorithm for randomly selecting k items from a stream of data whose total length is unknown or very large. The challenge is to give every item the same probability of being selected without knowing how many items are still to come.
Your task is to implement a reservoir sampling algorithm that processes a stream of integers and returns exactly k randomly selected items with equal probability for each item in the original stream.
The algorithm should work as follows:
- Initialize a reservoir array of size k with the first k items from the stream
- For each subsequent item (0-based index i ≥ k), generate a random integer j between 0 and i (inclusive)
- If j is less than k, replace the reservoir item at position j with the new item; otherwise discard the new item
Note: For testing purposes, use a simple random number generator based on a seed to ensure reproducible results.
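The steps above can be sketched in Python as follows. This is a minimal sketch, not a reference solution: the function name `reservoir_sample`, the `seed` parameter and its default value, and the use of Python's `random.Random` as the seeded generator are all assumptions, since the problem does not say which generator to use. Because of that, this sketch is reproducible for a given seed but is not expected to reproduce the sample outputs shown in the examples.

```python
import random
from typing import Iterable, List


def reservoir_sample(stream: Iterable[int], k: int, seed: int = 42) -> List[int]:
    """Return k items chosen uniformly at random from stream."""
    # Assumption: random.Random stands in for the unspecified seeded generator.
    rng = random.Random(seed)
    reservoir: List[int] = []
    for i, item in enumerate(stream):
        if i < k:
            # The first k items fill the reservoir directly.
            reservoir.append(item)
        else:
            # Draw j uniformly from 0..i (both ends inclusive).
            j = rng.randint(0, i)
            if j < k:
                # The new item overwrites slot j; otherwise it is discarded.
                reservoir[j] = item
    return reservoir
```

Calling `reservoir_sample([1, 2, 3, 4, 5, 6], 3, seed=1)` twice returns the same three items both times, which is the reproducibility the note asks for. The function reads the stream once and stores only k items, so it never needs the stream length in advance.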
Input & Output
Example 1 — Basic Stream
Input: stream = [1,2,3,4,5,6], k = 3
Output: [1,2,6]
Note: Reservoir sampling selects 3 items from the stream. The reservoir is first filled with [1,2,3]. For item 4 at index 3, a random integer j in the range [0,3] is generated; if j < 3, item 4 replaces the reservoir item at position j. Items 5 and 6 are processed the same way with ranges [0,4] and [0,5]. The exact output depends on the seeded random number generator; [1,2,6] is one possible result.
Example 2 — Small k
Input: stream = [10,20,30,40,50], k = 2
Output: [10,50]
Note: Selects 2 items from a 5-element stream. The reservoir starts as [10,20], then the remaining items 30, 40 and 50 are processed in order. An item at 0-based index i replaces an existing reservoir item with probability k/(i+1), so 30 (index 2) enters with probability 2/3, 40 with 2/4 and 50 with 2/5.
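The replacement probability above is what makes the final sample uniform: every item, early or late, ends up in the reservoir with probability k/n for a stream of length n. The sketch below checks this with exact fractions rather than random trials. The helper name `survival_probability` is hypothetical and is not part of the required solution.

```python
from fractions import Fraction


def survival_probability(j: int, k: int, n: int) -> Fraction:
    """Exact probability that the item at 0-based index j is in the final reservoir."""
    # One of the first k items enters with certainty; a later one enters with k/(j+1).
    p = Fraction(1) if j < k else Fraction(k, j + 1)
    # Each later item i overwrites one particular slot with probability 1/(i+1),
    # so an item already in the reservoir survives that step with probability i/(i+1).
    for i in range(max(j + 1, k), n):
        p *= Fraction(i, i + 1)
    return p


# Example 2: n = 5, k = 2. Every index has the same probability, 2/5.
probabilities = [survival_probability(j, 2, 5) for j in range(5)]
```

The product telescopes: for an item at index j ≥ k it is k/(j+1) · (j+1)/(j+2) · … · (n−1)/n = k/n, and for one of the first k items it is k/(k+1) · … · (n−1)/n = k/n.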
Example 3 — k equals stream length
Input: stream = [7,8,9], k = 3
Output: [7,8,9]
Note: When k equals the stream length, the reservoir is filled with every element and the replacement step never runs, so the algorithm returns the whole stream in its original order.
Constraints
- 1 ≤ stream.length ≤ 10^4
- 1 ≤ k ≤ stream.length
- -10^6 ≤ stream[i] ≤ 10^6