Find Median from Data Stream - Problem
Imagine you're processing a continuous stream of numbers from a data source, and you need to efficiently find the median at any point in time. The median is the middle value when numbers are arranged in sorted order.
Key Points:
- For odd-sized arrays: median is the middle element
- For even-sized arrays: median is the average of two middle elements
Examples:
- [2,3,4] → median = 3
- [2,3] → median = (2 + 3) / 2 = 2.5
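The two cases above can be checked with a small helper. This is a minimal sketch for a fixed list, not the streaming solution; the name `median_of` is made up for illustration.

```python
def median_of(values):
    """Return the median of a non-empty list by sorting a copy of it."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        # Odd count: the single middle element is the median.
        return float(ordered[mid])
    # Even count: average the two middle elements.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([2, 3, 4]))  # 3.0
print(median_of([2, 3]))     # 2.5
```

Sorting on every query costs O(n log n), which is why the streaming version of the problem calls for a better structure.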
Your Task: Design a MedianFinder class that can:
- addNum(int num) - Add a number to the data stream
- findMedian() - Return the current median of all numbers seen so far
This problem tests your ability to maintain sorted order efficiently while processing streaming data - a common challenge in real-time systems!
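As a point of comparison, one straightforward way to "maintain sorted order" is to keep a sorted list and insert each number in place. The sketch below assumes Python's standard `bisect` module; the class name `SortedListMedianFinder` is hypothetical and is used only to mark this as a baseline rather than the intended solution.

```python
import bisect


class SortedListMedianFinder:
    """Baseline: keep every number in one sorted list."""

    def __init__(self):
        self._data = []

    def addNum(self, num):
        # Binary search finds the slot in O(log n), but shifting the
        # list elements makes the insertion O(n) overall.
        bisect.insort(self._data, num)

    def findMedian(self):
        n = len(self._data)
        mid = n // 2
        if n % 2 == 1:
            return float(self._data[mid])
        return (self._data[mid - 1] + self._data[mid]) / 2


mf = SortedListMedianFinder()
mf.addNum(1)
print(mf.findMedian())  # 1.0
mf.addNum(2)
print(mf.findMedian())  # 1.5
```

The median lookup is constant time here, but each insertion is linear in the number of stored values, which is the cost a heap-based design tries to avoid.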
Input & Output
example_1.py - Basic Usage
Input:
MedianFinder mf = new MedianFinder();
mf.addNum(1);
mf.findMedian(); // return 1.0
mf.addNum(2);
mf.findMedian(); // return 1.5
Output:
[1.0, 1.5]
Note:
After adding 1, the median is 1.0. After adding 2, we have [1,2] so median is (1+2)/2 = 1.5.
example_2.py - Multiple Operations
Input:
mf.addNum(3);
mf.findMedian(); // return 2.0
mf.addNum(4);
mf.findMedian(); // return 2.5
Output:
[2.0, 2.5]
Note:
After adding 3, we have [1,2,3] so median is 2.0. After adding 4, we have [1,2,3,4] so median is (2+3)/2 = 2.5.
example_3.py - Negative Numbers
Input:
mf.addNum(-1);
mf.addNum(-2);
mf.findMedian(); // return -1.5
Output:
[-1.5]
Note:
With numbers [-2, -1], the median is (-2 + -1) / 2 = -1.5.
Visualization
Understanding the Visualization
1. Set Up Two Heaps: Create a max heap for the lower half and a min heap for the upper half.
2. Add & Balance: Insert numbers while keeping the heaps balanced (size difference ≤ 1).
3. Quick Median: The median is always at the heap tops, giving constant-time access.
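The three steps can be sketched as follows. This is one possible implementation, assuming Python's `heapq` module (which only provides a min heap, so the lower half is stored as negated values); the attribute names `_low` and `_high` are illustrative choices.

```python
import heapq


class MedianFinder:
    def __init__(self):
        self._low = []   # max heap of the lower half, stored negated
        self._high = []  # min heap of the upper half

    def addNum(self, num):
        # Route the number through the lower half, then move that
        # half's largest value up, so every value in _low <= every
        # value in _high.
        heapq.heappush(self._low, -num)
        heapq.heappush(self._high, -heapq.heappop(self._low))
        # Rebalance so _low holds the same count as _high or one more.
        if len(self._high) > len(self._low):
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def findMedian(self):
        if len(self._low) > len(self._high):
            # Odd count: the top of the lower half is the median.
            return float(-self._low[0])
        # Even count: average the two heap tops.
        return (-self._low[0] + self._high[0]) / 2


mf = MedianFinder()
for n in (1, 2, 3, 4):
    mf.addNum(n)
print(mf.findMedian())  # 2.5
```

Letting the lower half hold the extra element on odd counts is an arbitrary convention; the mirror-image choice works equally well as long as `findMedian` reads from the larger heap.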
Key Takeaway
Key Insight: Two balanced heaps maintain the median position without full sorting - like having two smart assistants who always know where the middle is!
Time & Space Complexity
Time Complexity
O(log n) per addNum, O(1) per findMedian
Heap operations (insert/extract) take O(log n) time; reading the heap tops takes constant time
Logarithmic
Space Complexity
O(n)
Store all n numbers across two heaps
Linear Space
Constraints
- -10^5 ≤ num ≤ 10^5
- There will be at least one element in the data structure before calling findMedian
- At most 5 × 10^4 calls will be made to addNum and findMedian
- Follow up: If all integer numbers from the stream are in the range [0, 100], how would you optimize it? If 99% of all integer numbers from the stream are in the range [0, 100], how would you optimize it?
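For the first follow-up, one commonly suggested direction is to replace the heaps with a fixed array of 101 counters, one per possible value. The sketch below is an assumption about how that could look, not a solution given by the problem statement; the class name `BucketMedianFinder` and the helper `_kth` are hypothetical.

```python
class BucketMedianFinder:
    """Counting approach, assuming every number lies in [0, 100]."""

    def __init__(self):
        self._counts = [0] * 101  # _counts[v] = how many times v was added
        self._size = 0

    def addNum(self, num):
        if not 0 <= num <= 100:
            raise ValueError("this sketch only handles values in [0, 100]")
        self._counts[num] += 1
        self._size += 1

    def _kth(self, k):
        """Return the k-th smallest value seen so far (0-indexed)."""
        seen = 0
        for value, count in enumerate(self._counts):
            seen += count
            if seen > k:
                return value
        raise IndexError("k is out of range")

    def findMedian(self):
        mid = self._size // 2
        if self._size % 2 == 1:
            return float(self._kth(mid))
        return (self._kth(mid - 1) + self._kth(mid)) / 2


mf = BucketMedianFinder()
for n in (5, 100, 0, 7):
    mf.addNum(n)
print(mf.findMedian())  # 6.0
```

Both operations then take time bounded by the fixed bucket count rather than by n. For the 99% variant, one idea is to add two extra counters for values below 0 and above 100, on the assumption that the median still falls inside [0, 100]; that variant is not shown here.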