Hopper Company Queries I - Problem

๐Ÿš— Hopper Company Queries I - Monthly Business Analytics

You're a data analyst at Hopper, a ride-sharing company that's tracking their business growth throughout 2020. Your task is to generate a comprehensive monthly report that shows two key metrics:

  • Active Drivers: The cumulative number of drivers who have joined the company by the end of each month
  • Accepted Rides: The total number of rides that were successfully matched with drivers each month

You have access to three database tables:

Drivers Table: Contains driver information and their join dates
Rides Table: Contains all ride requests (both accepted and rejected)
AcceptedRides Table: Contains details only for rides that were successfully matched

The challenge is to handle the cumulative nature of driver counts (once a driver joins, they remain active) while calculating monthly ride statistics. You need to generate results for all 12 months of 2020, even if some months have zero activity.

Goal: Return a report ordered by month (1=January, 2=February, etc.) showing month, active_drivers, and accepted_rides for each month in 2020.

Input & Output

example_1.sql โ€” Basic Example
$ Input: Drivers: [(1,'2019-01-01'), (2,'2020-03-21'), (3,'2020-07-31')] Rides: [(1,1,'2020-01-29'), (2,2,'2020-02-29'), (3,3,'2020-07-04')] AcceptedRides: [(1,1,10,30), (3,3,7,15)]
โ€บ Output: month=1, active_drivers=1, accepted_rides=1 month=2, active_drivers=1, accepted_rides=0 month=3, active_drivers=2, accepted_rides=0 ... month=7, active_drivers=3, accepted_rides=1 ...
๐Ÿ’ก Note: Driver 1 joined before 2020, so active from month 1. Driver 2 joined in March, Driver 3 in July. Only rides 1 and 3 were accepted.
example_2.sql โ€” No Activity Month
$ Input: Drivers: [(1,'2020-01-15')] Rides: [(1,1,'2020-03-10')] AcceptedRides: []
โ€บ Output: month=1, active_drivers=1, accepted_rides=0 month=2, active_drivers=1, accepted_rides=0 month=3, active_drivers=1, accepted_rides=0 ...
๐Ÿ’ก Note: One driver joined in January, but no rides were accepted throughout 2020. Driver count remains 1 for all months.
example_3.sql โ€” Edge Case Zero Drivers
$ Input: Drivers: [] Rides: [(1,1,'2020-05-01')] AcceptedRides: [(1,1,5,10)]
โ€บ Output: month=1, active_drivers=0, accepted_rides=0 month=2, active_drivers=0, accepted_rides=0 ... month=5, active_drivers=0, accepted_rides=1 ...
๐Ÿ’ก Note: No drivers joined by 2020, so active_drivers is 0 for all months. However, ride was accepted (perhaps by drivers who joined after 2020).

Visualization

Tap to expand
๐Ÿ“Š Hopper Monthly Analytics Pipeline๐Ÿ“… MONTHSJan Feb MarApr May JunJul Aug Sep Oct Nov Dec๐Ÿ‘ฅ DRIVERSCumulative CountBy Month EndWindow Function๐Ÿš— RIDESMonthly CountAccepted OnlyGROUP BY Month๐Ÿ”— LEFT JOINCombine All CTEsHandle NULL monthsCOALESCE(count, 0)๐Ÿ“ˆ FINAL REPORTMonth | Drivers | Rides1 | 5 | 122 | 8 | 03 | 12 | 7...๐ŸŽฏ Key Insight: Separate cumulative metrics from periodic metrics using different aggregation strategies
Understanding the Visualization
1
Generate Calendar
Create a template with all 12 months to ensure no gaps in reporting
2
Calculate Cumulative Staff
Count how many drivers joined by each month-end (running total)
3
Aggregate Monthly Sales
Count accepted rides for each specific month
4
Merge Reports
Combine all data, filling zeros for inactive months
Key Takeaway
๐ŸŽฏ Key Insight: The secret is recognizing that driver counts are cumulative (running totals) while ride counts are periodic (monthly snapshots), requiring different aggregation strategies combined through efficient CTEs and LEFT JOINs.

Time & Space Complexity

Time Complexity
โฑ๏ธ
O(n + m)

Where n is drivers and m is rides. Each table is scanned once with efficient aggregation.

n
2n
โœ“ Linear Growth
Space Complexity
O(12)

Space for 12 months plus intermediate aggregation results

n
2n
โœ“ Linear Space

Constraints

  • 1 โ‰ค Drivers.driver_id โ‰ค 103
  • 1980-01-01 โ‰ค Drivers.join_date โ‰ค 2020-12-31
  • 1 โ‰ค Rides.ride_id โ‰ค 105
  • 1 โ‰ค Rides.user_id โ‰ค 104
  • 2020-01-01 โ‰ค Rides.requested_at โ‰ค 2020-12-31
  • AcceptedRides.ride_id is guaranteed to exist in Rides table
  • Report must include all 12 months of 2020, even with zero activity
Asked in
Meta 45 Amazon 38 Microsoft 32 Uber 28
43.7K Views
High Frequency
~25 min Avg. Time
1.8K Likes
Ln 1, Col 1
Smart Actions
๐Ÿ’ก Explanation
AI Ready
๐Ÿ’ก Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen