Hopper Company Queries III - Problem
Hopper Company Queries III is an advanced SQL problem that simulates analyzing ride-sharing data across sliding time windows.
You are working as a data analyst for Hopper, a ride-sharing company. The company wants to understand their business performance by analyzing average ride metrics over 3-month sliding windows throughout 2020.
Your Task:
Calculate the
Key Requirements:
• Each 3-month window represents consecutive months (Jan-Mar, Feb-Apr, Mar-May, etc.)
• The averages are calculated by summing the total values from all three months and dividing by 3
• Round results to 2 decimal places
• Only consider accepted rides that actually happened
• Return results ordered by the starting month number (1=January, 2=February, etc.)
This problem tests your ability to work with temporal data aggregation, window functions, and complex multi-table joins in SQL.
You are working as a data analyst for Hopper, a ride-sharing company. The company wants to understand their business performance by analyzing average ride metrics over 3-month sliding windows throughout 2020.
Your Task:
Calculate the
average_ride_distance and average_ride_duration for every 3-month window from January-March 2020 to October-December 2020.Key Requirements:
• Each 3-month window represents consecutive months (Jan-Mar, Feb-Apr, Mar-May, etc.)
• The averages are calculated by summing the total values from all three months and dividing by 3
• Round results to 2 decimal places
• Only consider accepted rides that actually happened
• Return results ordered by the starting month number (1=January, 2=February, etc.)
This problem tests your ability to work with temporal data aggregation, window functions, and complex multi-table joins in SQL.
Input & Output
Basic Example
$
Input:
Drivers: [(1,'2019-08-01'),(2,'2020-06-01'),(3,'2020-12-01')]
Rides: [(1,1,'2020-01-08'),(2,2,'2020-02-07'),(3,3,'2020-03-04'),(4,4,'2020-04-24'),(5,1,'2020-05-21')]
AcceptedRides: [(1,1,100,10),(2,2,200,20),(3,3,150,15),(4,2,300,30),(5,1,120,12)]
›
Output:
[(1,190.00,18.33),(2,216.67,21.67),(3,256.67,25.67)]
💡 Note:
Window 1 (Jan-Mar): (100+200+150)/3 = 150.00 distance, (10+20+15)/3 = 15.00 duration. Window 2 (Feb-Apr): (200+150+300)/3 = 216.67 distance, (20+15+30)/3 = 21.67 duration.
Missing Months
$
Input:
Drivers: [(1,'2019-08-01')]
Rides: [(1,1,'2020-01-01'),(2,2,'2020-03-01')]
AcceptedRides: [(1,1,100,10),(2,1,300,30)]
›
Output:
[(1,133.33,13.33),(2,133.33,13.33),(3,100.00,10.00)]
💡 Note:
February has no rides, so it contributes 0 to the calculations. Windows are calculated with the missing month as 0.
Edge Case - No Accepted Rides
$
Input:
Drivers: [(1,'2020-01-01')]
Rides: [(1,1,'2020-01-01'),(2,1,'2020-02-01')]
AcceptedRides: []
›
Output:
[(1,0.00,0.00),(2,0.00,0.00),...,(10,0.00,0.00)]
💡 Note:
When no rides are accepted, all windows show 0.00 for both distance and duration averages.
Visualization
Tap to expand
Understanding the Visualization
1
Data Collection
Join tables and aggregate monthly ride statistics
2
Window Creation
Use SQL window functions to create overlapping 3-month periods
3
Average Calculation
Sum values across each window and divide by 3
Key Takeaway
🎯 Key Insight: SQL window functions with ROWS BETWEEN enable efficient sliding window calculations, eliminating the need for multiple subqueries while maintaining optimal performance.
Time & Space Complexity
Time Complexity
O(n log n)
Where n is number of rides. Requires sorting for window function, plus aggregation
⚡ Linearithmic
Space Complexity
O(m)
Where m is number of months (12). Stores monthly aggregates
✓ Linear Space
Constraints
- 1 ≤ drivers table rows ≤ 50
- 1 ≤ rides table rows ≤ 104
- 1 ≤ accepted_rides table rows ≤ 104
- All dates are within the year 2020
- ride_distance and ride_duration are positive integers
- Each accepted ride exists in the rides table
- Results must be rounded to exactly 2 decimal places
💡
Explanation
AI Ready
💡 Suggestion
Tab
to accept
Esc
to dismiss
// Output will appear here after running code