Statistics from a Large Sample - Problem
Imagine you're analyzing a massive dataset of pixel intensities from millions of images, where each pixel value ranges from 0 (black) to 255 (white). The dataset is so large that instead of storing individual values, you have a frequency count array where count[k] represents how many times the value k appears.
Your task is to calculate five key statistical measures from this compressed representation:
- Minimum: The smallest value that appears at least once
- Maximum: The largest value that appears at least once
- Mean: The average of all values (sum of all elements ÷ total count)
- Median: The middle value when all elements are sorted (or average of two middle values for even counts)
- Mode: The most frequently occurring value (guaranteed to be unique)
Return these statistics as an array of floating-point numbers: [minimum, maximum, mean, median, mode]
Note: Answers within 10^-5 of the actual answer will be accepted.
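To make the five definitions concrete, here is a minimal brute-force sketch that expands the frequency array back into a list and uses Python's standard `statistics` module. The function name `sample_stats_bruteforce` is a hypothetical helper, not part of the problem statement, and the approach is only practical when `sum(count)` is small; it is meant as a reference for checking the examples, not as a solution under the full constraints.

```python
from statistics import mean, median, mode


def sample_stats_bruteforce(count):
    """Expand the frequency array and compute the statistics directly.

    Hypothetical reference helper: only feasible when sum(count) is small.
    """
    # Rebuild the sorted dataset: value k is repeated count[k] times.
    data = [value for value, freq in enumerate(count) for _ in range(freq)]
    return [
        float(data[0]),       # minimum (data is already sorted)
        float(data[-1]),      # maximum
        float(mean(data)),
        float(median(data)),  # averages the two middle values for even sizes
        float(mode(data)),
    ]


# Value 1 appears once, value 2 three times, value 3 four times.
count = [0, 1, 3, 4] + [0] * 252
print(sample_stats_bruteforce(count))  # [1.0, 3.0, 2.375, 2.5, 3.0]
```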
Input & Output
example_1.py - Basic Case
Input:
count = [0,1,3,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
Output:
[1.00000, 3.00000, 2.37500, 2.50000, 3.00000]
Note:
Dataset: [1,2,2,2,3,3,3,3]. Min=1, Max=3, Mean=(1+6+12)/8=2.375, Median=(2+3)/2=2.5, Mode=3 (appears 4 times)
example_2.py - Single Element
Input:
count = [0,4,3,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
Output:
[1.00000, 4.00000, 2.18182, 2.00000, 1.00000]
Note:
Dataset: [1,1,1,1,2,2,2,3,3,4,4]. Min=1, Max=4, Mean=(4+6+6+8)/11≈2.18182, Mode=1 (appears 4 times). There are 11 elements in total, so the median is the 6th element, which is 2.
example_3.py - Edge Case
Input:
count = [1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1]
Output:
[0.00000, 255.00000, 127.50000, 127.50000, 0.00000]
Note:
Dataset: [0,255]. Min=0, Max=255, Mean=(0+255)/2=127.5, Median=(0+255)/2=127.5, Mode=0. Both values appear once, so this example reports the smaller value; inputs that satisfy the constraints are guaranteed a unique mode.
Constraints
- count.length == 256
- 0 ≤ count[i] ≤ 10^9
- 1 ≤ sum(count) ≤ 10^9
- It is guaranteed that mode is unique
- Answers within 10^-5 of the actual answer will be accepted
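Two practical consequences of these constraints can be sketched in a few lines. The helper `within_tolerance` below is hypothetical, and it assumes the 10^-5 tolerance is an absolute difference applied to each of the five values; the statement does not say whether the check is absolute or relative. The last lines show that the weighted sum can exceed the 32-bit signed range, which matters in fixed-width languages (Python integers do not overflow).

```python
import math


def within_tolerance(actual, expected, tol=1e-5):
    """Return True if every value differs from its expected value by at most tol.

    Assumption: the tolerance is absolute, not relative.
    """
    return len(actual) == len(expected) and all(
        math.isclose(a, e, rel_tol=0.0, abs_tol=tol)
        for a, e in zip(actual, expected)
    )


# The exact mean 24/11 matches the rounded expected value 2.18182.
print(within_tolerance([1.0, 4.0, 24 / 11, 2.0, 1.0],
                       [1.0, 4.0, 2.18182, 2.0, 1.0]))  # True

# Upper bound on the weighted sum: every one of 10^9 elements equals 255.
max_weighted_sum = 255 * 10**9
print(max_weighted_sum > 2**31 - 1)  # True: needs a 64-bit accumulator elsewhere
```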
Visualization
Understanding the Visualization
1. Frequency Table Analysis: Extract the min, max, and mode, and accumulate the weighted sum, in a single pass.
2. Mean Calculation: Use the formula Σ(value × frequency) ÷ total_count.
3. Median via Cumulative Frequency: Find the middle position(s) using running frequency totals.
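The three steps above can be sketched as follows. This is one possible implementation under the stated constraints, not an official solution; the function name `sample_stats` and the use of 1-based middle positions are choices made for this illustration.

```python
def sample_stats(count):
    """Compute [min, max, mean, median, mode] directly from a frequency array."""
    total = 0            # number of elements in the sample
    weighted_sum = 0     # sum of value * frequency
    minimum = maximum = None
    mode = 0
    # Step 1: single pass for min, max, mode, and the weighted sum.
    for value, freq in enumerate(count):
        if freq == 0:
            continue
        if minimum is None:
            minimum = value      # first value that appears
        maximum = value          # last value that appears
        total += freq
        weighted_sum += value * freq
        if freq > count[mode]:   # strict comparison keeps the first maximum
            mode = value

    # Step 2: mean from the weighted sum.
    mean = weighted_sum / total

    # Step 3: median via cumulative frequency.
    # 1-based positions of the middle element(s); they coincide for odd totals.
    left_pos = (total + 1) // 2
    right_pos = total // 2 + 1
    left = right = None
    running = 0
    for value, freq in enumerate(count):
        running += freq
        if left is None and running >= left_pos:
            left = value
        if running >= right_pos:
            right = value
            break
    median = (left + right) / 2

    return [float(minimum), float(maximum), mean, median, float(mode)]


print(sample_stats([0, 1, 3, 4] + [0] * 252))  # [1.0, 3.0, 2.375, 2.5, 3.0]
```

Both loops visit at most 256 entries, so the running time does not depend on how many elements the sample contains.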
Key Takeaway
Key Insight: Frequency data contains all the information needed for these statistics, so the dataset never has to be reconstructed. Use weighted sums for the mean and cumulative frequencies for the median.