Design an Array Statistics Tracker
You need to design a smart data structure that can dynamically track statistical measures of numbers as they are added and removed. Think of it as a real-time analytics system that maintains running statistics!
Your StatisticsTracker class should support:
- Adding numbers: Insert new values into the tracker
- Removing numbers: Remove the earliest added number (FIFO - First In, First Out)
- Computing statistics: Calculate mean, median, and mode on-demand
Statistical Definitions:
- Mean: Sum of all values ÷ count of values (floored to integer)
- Median: Middle value when sorted. If even count, take the larger of the two middle values
- Mode: Most frequently occurring value. If multiple modes exist, return the smallest one
The challenge is to efficiently maintain these statistics as the data changes dynamically through additions and removals.
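The definitions above can be pinned down with a brute-force reference before any optimization. The sketch below is only a baseline for checking results, not an efficient solution: it recomputes each statistic from scratch. The class name `NaiveStatisticsTracker` is made up for illustration, and returning 0 for the mean of an empty tracker is an assumption taken from example 3.

```python
from collections import Counter, deque


class NaiveStatisticsTracker:
    """Brute-force reference: recomputes every statistic from scratch."""

    def __init__(self):
        self.values = deque()  # insertion order, oldest value on the left

    def addNumber(self, number):
        self.values.append(number)

    def removeFirstAddedNumber(self):
        self.values.popleft()  # FIFO: drop the earliest added value

    def getMean(self):
        # Floored integer mean; 0 for an empty tracker (assumption from example 3).
        return sum(self.values) // len(self.values) if self.values else 0

    def getMedian(self):
        ordered = sorted(self.values)
        # Index n // 2 is the middle for odd n and the larger middle for even n.
        return ordered[len(ordered) // 2]

    def getMode(self):
        counts = Counter(self.values)
        best = max(counts.values())
        # Among the most frequent values, return the smallest.
        return min(v for v, c in counts.items() if c == best)
```

Each query here costs O(n) or O(n log n), which is the cost the incremental approach is meant to avoid.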
Input & Output
example_1.py - Basic Operations
Input:
["StatisticsTracker", "addNumber", "addNumber", "addNumber", "getMean", "getMedian", "getMode"]
[[], [1], [3], [1], [], [], []]
Output:
[null, null, null, null, 1, 1, 1]
Note:
Initialize the tracker, then add 1, 3, 1. Mean = floor((1+3+1)/3) = floor(5/3) = 1. Median of [1,1,3] = 1 (the middle value). Mode = 1 (it appears twice).
example_2.py - With Removal
Input:
["StatisticsTracker", "addNumber", "addNumber", "addNumber", "removeFirstAddedNumber", "getMean", "getMedian", "getMode"]
[[], [4], [2], [1], [], [], [], []]
Output:
[null, null, null, null, null, 1, 2, 1]
Note:
Add 4, 2, 1, then remove the first added value (4). Remaining: [2,1]. Mean = floor((2+1)/2) = floor(3/2) = 1. Median of [1,2] = 2 (the larger of the two middle values). Mode = 1 (both values appear once, so the smaller one is returned).
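example_3.py - Edge Case Single Element
Input:
["StatisticsTracker", "addNumber", "getMean", "getMedian", "getMode", "removeFirstAddedNumber", "getMean"]
[[], [5], [], [], [], [], []]
Output:
[null, null, 5, 5, 5, null, 0]
Note:
With a single element, every statistic equals that element. After the removal the tracker is empty and getMean returns 0. This last call falls outside the constraint that statistics methods are called only on a non-empty tracker; the expected output treats the mean of an empty tracker as 0.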
Understanding the Visualization
1. Data Ingestion: New stock prices arrive continuously and must be stored in processing order
2. Data Expiration: Old prices expire (FIFO) to maintain a sliding window of recent data
3. Real-time Analytics: Traders need instant mean, median, and mode calculations for decision making
4. Performance Optimization: Multiple data structures work together to provide sub-second response times
Key Takeaway
Key Insight: Instead of recalculating statistics from scratch each time, maintain multiple specialized data structures that can be updated incrementally. This transforms expensive O(n log n) operations into efficient O(log n) or O(1) operations.
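One way to combine several incrementally updated structures is sketched below. It is a sketch under stated assumptions rather than a reference solution: a deque keeps insertion order, a running sum gives the mean, a sorted list gives the median, and a frequency map plus a lazily cleaned heap gives the mode. The attribute names are illustrative. The sorted list is a plain Python list maintained with `bisect`, so an insert or delete finds its position in O(log n) but still shifts elements in O(n); a balanced tree or a dedicated sorted container would be needed to reach a true O(log n) update.

```python
import heapq
from bisect import bisect_left, insort
from collections import defaultdict, deque


class StatisticsTracker:
    def __init__(self):
        self.queue = deque()          # insertion order, for FIFO removal
        self.total = 0                # running sum, for an O(1) mean
        self.ordered = []             # values kept sorted, for the median
        self.freq = defaultdict(int)  # value -> current count
        self.mode_heap = []           # (-count, value) entries; some may be stale

    def addNumber(self, number):
        self.queue.append(number)
        self.total += number
        insort(self.ordered, number)  # O(log n) search, O(n) shift in a list
        self.freq[number] += 1
        heapq.heappush(self.mode_heap, (-self.freq[number], number))

    def removeFirstAddedNumber(self):
        number = self.queue.popleft()
        self.total -= number
        del self.ordered[bisect_left(self.ordered, number)]
        self.freq[number] -= 1
        if self.freq[number] > 0:
            # Record the new, lower count; the older entry becomes stale.
            heapq.heappush(self.mode_heap, (-self.freq[number], number))

    def getMean(self):
        # Floored mean; 0 when empty (assumption taken from example 3).
        return self.total // len(self.queue) if self.queue else 0

    def getMedian(self):
        # Index n // 2 is the larger middle value when n is even.
        return self.ordered[len(self.ordered) // 2]

    def getMode(self):
        # Discard entries whose recorded count no longer matches the live count.
        while -self.mode_heap[0][0] != self.freq[self.mode_heap[0][1]]:
            heapq.heappop(self.mode_heap)
        # Heap order (-count, value) yields the highest count, then the smallest value.
        return self.mode_heap[0][1]
```

The heap is cleaned only when the mode is queried, so each stale entry is popped at most once and the cleanup cost is amortized across earlier updates.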
Time & Space Complexity
Time Complexity
O(log n)
Add/Remove: O(log n) for sorted structure updates. Mean: O(1), Median: O(log n), Mode: O(1) amortized
Logarithmic
Space Complexity
O(n)
Additional space for frequency map and sorted structure, but still O(n) overall
Linear Space
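For the logarithmic median cost listed above, one possible structure is a Fenwick (binary indexed) tree of counts indexed by value. This is a sketch that assumes the values are positive integers with a known upper bound; the class name `CountingFenwick` and the `max_value` parameter are illustrative. Its cost per operation is O(log V), where V is the upper bound on values, rather than O(log n) in the number of stored elements, and it needs O(V) memory.

```python
class CountingFenwick:
    """Fenwick tree over values 1..max_value storing how often each occurs."""

    def __init__(self, max_value):
        self.size = max_value
        self.tree = [0] * (max_value + 1)
        self.count = 0  # total number of stored values
        # Largest power of two not exceeding size, used by the descent in kth().
        self.top_bit = 1 << (max_value.bit_length() - 1)

    def update(self, value, delta):
        """Add delta (+1 to insert, -1 to remove) to the count of value."""
        self.count += delta
        while value <= self.size:
            self.tree[value] += delta
            value += value & -value

    def kth(self, k):
        """Return the k-th smallest stored value (1-indexed)."""
        pos = 0
        step = self.top_bit
        while step:
            nxt = pos + step
            # Move right only while fewer than k values lie at or below nxt.
            if nxt <= self.size and self.tree[nxt] < k:
                pos = nxt
                k -= self.tree[nxt]
            step >>= 1
        return pos + 1

    def median(self):
        # k = n // 2 + 1 picks the larger middle value when n is even.
        return self.kth(self.count // 2 + 1)
```

In a tracker, `update(number, 1)` would be called on every add and `update(number, -1)` on every FIFO removal, replacing the sorted structure used for the median.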
Constraints
- 1 ≤ number ≤ 10^5
- At most 10^4 calls to addNumber and removeFirstAddedNumber
- At most 10^4 calls to getMean, getMedian, and getMode
- removeFirstAddedNumber is called only when the array is non-empty
- Statistics methods are called only when the array is non-empty