Statistics from a Large Sample - Problem

You are given a large sample of integers in the range [0, 255]. Since the sample is so large, it is represented by an array count where count[k] is the number of times that k appears in the sample.

Calculate the following statistics:

  • minimum: The minimum element in the sample.
  • maximum: The maximum element in the sample.
  • mean: The average of the sample, calculated as the total sum of all elements divided by the total number of elements.
  • median: If the sample has an odd number of elements, then the median is the middle element once the sample is sorted. If the sample has an even number of elements, then the median is the average of the two middle elements once the sample is sorted.
  • mode: The number that appears the most in the sample. It is guaranteed to be unique.

Return the statistics of the sample as an array of floating-point numbers [minimum, maximum, mean, median, mode].
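To make the definitions above concrete, here is a brute-force reference sketch that expands the sample and applies Python's standard `statistics` module. The function name `sample_stats_bruteforce` is an assumption for illustration. Because it materializes every element, it is only suitable for checking small inputs, not for samples near the size limits.

```python
from statistics import mean, median, mode


def sample_stats_bruteforce(count):
    """Expand the sample and compute each statistic from its definition.

    Only practical for small samples: memory grows with sum(count).
    """
    # Rebuild the sorted sample: value k is repeated count[k] times.
    sample = [value for value, freq in enumerate(count) for _ in range(freq)]
    return [
        float(min(sample)),
        float(max(sample)),
        float(mean(sample)),
        float(median(sample)),   # averages the two middle elements when len is even
        float(mode(sample)),     # the problem guarantees a unique mode
    ]


print(sample_stats_bruteforce([0, 1, 3, 4]))
```

A reference like this is mainly useful for cross-checking a faster solution on small, hand-built count arrays.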

Answers within 10^-5 of the actual answer will be accepted.
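A tolerance check of this kind might look like the following minimal sketch. The helper name `within_tolerance` and the use of an absolute (rather than relative) difference are assumptions; the statement does not say exactly how the judge compares values.

```python
def within_tolerance(actual, expected, tol=1e-5):
    """Return True if every statistic differs from the expected one by at most tol.

    Assumes an absolute-difference comparison, which is one reading of the rule.
    """
    if len(actual) != len(expected):
        return False
    return all(abs(a - e) <= tol for a, e in zip(actual, expected))


# A mean rounded to two decimals is outside the tolerance; a close one is inside.
close_enough = within_tolerance([123.3333333], [740 / 6])
too_coarse = within_tolerance([123.33], [740 / 6])
```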

Input & Output

Example 1 — Basic Case
$ Input: count = [0,1,3,4]
Output: [1.0,3.0,2.375,2.5,3.0]
💡 Note: Sample represents: value 1 appears 1 time, value 2 appears 3 times, value 3 appears 4 times. Min=1, Max=3, Mean=(1×1+2×3+3×4)/8=19/8=2.375, Median of [1,2,2,2,3,3,3,3] is (2+3)/2=2.5, Mode=3 (appears 4 times)
Example 2 — Single Value
$ Input: count = [0,4,0,0,0,0]
Output: [1.0,1.0,1.0,1.0,1.0]
💡 Note: Only value 1 appears 4 times. All statistics equal 1.0: min=1, max=1, mean=1, median=1, mode=1
Example 3 — Wider Range
$ Input: count = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
Output: [100.0,140.0,123.33333333333333,125.0,140.0]
💡 Note: Values 100 (freq 1), 110 (freq 2), 140 (freq 3). Sample: [100,110,110,140,140,140]. Min=100, Max=140, Mean=(100+220+420)/6=740/6≈123.33, Median of 6 elements is (110+140)/2=125.0, Mode=140

Constraints

  • count.length == 256
  • 0 ≤ count[i] ≤ 10^9
  • 1 ≤ sum(count) ≤ 10^9
  • The mode is guaranteed to be unique
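These bounds matter for the mean. With up to 10^9 elements, each at most 255, the sum of all elements can reach 255 × 10^9, which does not fit in a 32-bit signed integer. The short check below illustrates the arithmetic; the constant names are chosen for illustration only. Python integers do not overflow, but in languages with fixed-width integers a 64-bit type (or a double) would be needed for the running sum.

```python
MAX_VALUE = 255          # largest value in the range [0, 255]
MAX_TOTAL = 10**9        # upper bound on sum(count) from the constraints

# Worst case: every element of the sample equals 255.
worst_case_sum = MAX_VALUE * MAX_TOTAL

INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1

exceeds_int32 = worst_case_sum > INT32_MAX   # 32-bit accumulator would overflow
fits_int64 = worst_case_sum <= INT64_MAX     # 64-bit accumulator is sufficient
print(worst_case_sum, exceeds_int32, fits_int64)
```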

Visualization

Direct Frequency Processing approach, illustrated on Example 1 (count = [0,1,3,4]). The count array is a compressed representation of the sample: the index is the value and count[index] is its frequency. Expanded, the sample is [1, 2, 2, 2, 3, 3, 3, 3], 8 elements in total.

Algorithm steps:

  1. Find min/max: the first and last indices with a non-zero count (min=1, max=3).
  2. Calculate mean: sum(i × count[i]) / total = (1+6+12)/8 = 2.375.
  3. Find median: locate the middle position(s) using cumulative counts. Index 1 covers position 1, index 2 covers positions 2-4, and index 3 covers positions 5-8, so positions 4 and 5 hold 2 and 3, and the median is (2+3)/2 = 2.5.
  4. Find mode: the index with the largest count (count[3]=4).

Final result: [1.0, 3.0, 2.375, 2.5, 3.0]

Key insight: the statistics are computed directly from the frequencies, without expanding the sample. For the median, track cumulative counts to find positions n/2 and n/2+1. Each statistic needs a single pass over the 256 entries, so the running time is O(256) regardless of sample size.
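The direct frequency processing approach described above can be sketched as follows. This is one possible implementation, not necessarily the page's reference solution. The function name `sample_stats` is an assumption, and the median positions are written as (n+1)//2 and n//2+1 so that the same code also covers an odd number of elements, where both positions coincide.

```python
def sample_stats(count):
    """Compute [minimum, maximum, mean, median, mode] from a frequency array."""
    total = sum(count)  # number of elements in the sample (at least 1)
    weighted_sum = sum(value * freq for value, freq in enumerate(count))

    # Minimum and maximum: first and last values with a non-zero count.
    present = [value for value, freq in enumerate(count) if freq > 0]
    minimum, maximum = present[0], present[-1]

    # Mode: the value with the largest count (guaranteed unique).
    mode = max(range(len(count)), key=lambda value: count[value])

    # Median: 1-based positions of the middle element(s) in sorted order.
    lo_pos = (total + 1) // 2
    hi_pos = total // 2 + 1
    lo_val = hi_val = None
    seen = 0  # cumulative count of elements with value <= current value
    for value, freq in enumerate(count):
        seen += freq
        if lo_val is None and seen >= lo_pos:
            lo_val = value
        if seen >= hi_pos:
            hi_val = value
            break
    median = (lo_val + hi_val) / 2

    return [float(minimum), float(maximum), weighted_sum / total, median, float(mode)]


print(sample_stats([0, 1, 3, 4]))
```

The loop never builds the expanded sample, so the work is bounded by the length of the count array rather than by the number of elements it describes.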