Find Median from Data Stream - Problem

Imagine you're processing a continuous stream of numbers from a data source, and you need to efficiently find the median at any point in time. The median is the middle value when numbers are arranged in sorted order.

Key Points:

  • For odd-sized arrays: median is the middle element
  • For even-sized arrays: median is the average of two middle elements

Examples:

  • [2,3,4] → median = 3
  • [2,3] → median = (2 + 3) / 2 = 2.5
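
The two rules above can be sketched as a small helper. This is a minimal illustration that sorts the whole list on every call, not the streaming design the problem asks for; the function name `median` is chosen here for illustration.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        # Odd size: the single middle element
        return float(ordered[mid])
    # Even size: average of the two middle elements
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([2, 3, 4]))  # 3.0
print(median([2, 3]))     # 2.5
```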

Your Task: Design a MedianFinder class that can:

  1. addNum(int num) - Add a number to the data stream
  2. findMedian() - Return the current median of all numbers seen so far

This problem tests your ability to maintain sorted order efficiently while processing streaming data - a common challenge in real-time systems!
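
As a point of comparison, here is a sketch of a straightforward baseline that keeps one sorted list. The class name `SortedListMedianFinder` is hypothetical; the method names follow the interface above. Each insert uses `bisect.insort`, which finds the position in O(log n) but shifts elements in O(n), so this is the cost a more efficient design tries to avoid.

```python
import bisect


class SortedListMedianFinder:
    """Baseline sketch: keep every number in one sorted list."""

    def __init__(self):
        self._data = []

    def addNum(self, num: int) -> None:
        # Binary search for the slot, then an O(n) shift to insert
        bisect.insort(self._data, num)

    def findMedian(self) -> float:
        n = len(self._data)
        mid = n // 2
        if n % 2 == 1:
            return float(self._data[mid])
        return (self._data[mid - 1] + self._data[mid]) / 2
```

Usage mirrors the task description: call `addNum` for each incoming value and `findMedian` whenever the current median is needed.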

Input & Output

example_1.py - Basic Usage
$ Input:
MedianFinder mf = new MedianFinder();
mf.addNum(1);
mf.findMedian(); // return 1.0
mf.addNum(2);
mf.findMedian(); // return 1.5
› Output: [1.0, 1.5]
💡 Note: After adding 1, the median is 1.0. After adding 2, we have [1,2] so median is (1+2)/2 = 1.5.
example_2.py - Multiple Operations
$ Input:
mf.addNum(3);
mf.findMedian(); // return 2.0
mf.addNum(4);
mf.findMedian(); // return 2.5
› Output: [2.0, 2.5]
💡 Note: Continuing with the same mf that already holds [1,2]: after adding 3, we have [1,2,3] so median is 2.0. After adding 4, we have [1,2,3,4] so median is (2+3)/2 = 2.5.
example_3.py - Negative Numbers
$ Input:
mf.addNum(-1);
mf.addNum(-2);
mf.findMedian(); // return -1.5
› Output: [-1.5]
💡 Note: Starting from a new, empty MedianFinder, the numbers are [-2, -1], so the median is (-2 + -1) / 2 = -1.5.

Visualization

Two Heaps: The Balanced Scale Solution

  • Max heap (lower half): 2, 5, 1; top = 5 (max)
  • Min heap (upper half): 9, 7, 12; top = 7 (min)
  • Median calculation: max heap top is 5 and min heap top is 7, so median = (5 + 7) / 2 = 6.0

Understanding the Visualization

  1. Setup Two Heaps - Create a max heap for the lower half and a min heap for the upper half
  2. Add & Balance - Insert numbers while keeping the heaps balanced (size difference ≤ 1)
  3. Quick Median - The median is always available at the heap tops, giving constant-time access
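
The three steps can be sketched in Python as follows. This is one possible implementation, not necessarily the page's reference solution. Python's `heapq` module only provides a min heap, so the max heap is simulated here by storing negated values; the attribute names `_low` and `_high` are illustrative.

```python
import heapq


class MedianFinder:
    """Two-heap sketch: a max heap for the lower half, a min heap for the upper half."""

    def __init__(self):
        self._low = []   # max heap of the lower half (values stored negated)
        self._high = []  # min heap of the upper half

    def addNum(self, num: int) -> None:
        # Push onto the lower half, then move its largest value to the upper half
        heapq.heappush(self._low, -num)
        heapq.heappush(self._high, -heapq.heappop(self._low))
        # Rebalance so the lower half never has fewer elements than the upper half
        if len(self._high) > len(self._low):
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def findMedian(self) -> float:
        if len(self._low) > len(self._high):
            # Odd count: the extra element sits on top of the lower half
            return float(-self._low[0])
        # Even count: average the two heap tops
        return (-self._low[0] + self._high[0]) / 2
```

With this balancing rule the lower half holds either the same number of elements as the upper half or exactly one more, so `findMedian` only ever reads the heap tops.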
Key Takeaway
🎯 Key Insight: Two balanced heaps maintain the median position without full sorting - like having two smart assistants who always know where the middle is!

Time & Space Complexity

Time Complexity
O(log n) per addNum, O(1) per findMedian

Heap operations (insert/extract) take O(log n) time; reading the two heap tops takes constant time

Space Complexity
O(n)

Store all n numbers across two heaps

Constraints

  • -10^5 ≤ num ≤ 10^5
  • There will be at least one element in the data structure before calling findMedian
  • At most 5 × 10^4 calls will be made to addNum and findMedian
  • Follow up: If all integer numbers from the stream are in the range [0, 100], how would you optimize it? If 99% of all integer numbers from the stream are in the range [0, 100], how would you optimize it?
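
For the first follow-up question, one common idea is a counting approach: keep one counter per possible value and walk the counters to locate the middle. The sketch below assumes every number lies in [0, 100] and rejects anything else; the class name `BucketMedianFinder` and the helper `_kth` are hypothetical names used for illustration.

```python
class BucketMedianFinder:
    """Counting sketch, assuming every number is an integer in [0, 100]."""

    def __init__(self):
        self._counts = [0] * 101  # _counts[v] = how many times v was added
        self._total = 0

    def addNum(self, num: int) -> None:
        if not 0 <= num <= 100:
            raise ValueError("this sketch only supports values in [0, 100]")
        self._counts[num] += 1
        self._total += 1

    def _kth(self, k: int) -> int:
        """Return the k-th smallest value seen so far (0-indexed)."""
        seen = 0
        for value, count in enumerate(self._counts):
            seen += count
            if seen > k:
                return value
        raise IndexError("k is out of range")

    def findMedian(self) -> float:
        mid = self._total // 2
        if self._total % 2 == 1:
            return float(self._kth(mid))
        return (self._kth(mid - 1) + self._kth(mid)) / 2
```

Here `addNum` is O(1) and `findMedian` scans at most 101 counters, so both are constant time with respect to the stream length. For the 99% variant, a possible extension (not shown) is to additionally track the values below 0 and above 100, since the median should still fall inside the counted range.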
Asked in
Google 47 Amazon 35 Meta 28 Microsoft 22