Game Play Analysis IV - Problem

Imagine you're a data analyst at a gaming company trying to measure player retention! ๐ŸŽฎ

You have an Activity table that tracks every time players log in and play games. Your mission is to calculate the fraction of players who came back the very next day after their first login.

Table Structure:

Column NameType
player_idint
device_idint
event_datedate
games_playedint

The (player_id, event_date) combination is unique, meaning each player can only have one record per day.

Your Goal: Find what percentage of players logged in on consecutive days starting from their very first login. Return this as a decimal rounded to 2 places.

Example: If 100 players joined your game, and 33 of them came back the next day, your answer would be 0.33.

Input & Output

example_1.py โ€” Basic Case
$ Input: Activity table: | player_id | device_id | event_date | games_played | |-----------|-----------|------------|-------------| | 1 | 2 | 2016-03-01 | 5 | | 1 | 2 | 2016-03-02 | 6 | | 2 | 3 | 2017-06-25 | 1 | | 3 | 1 | 2016-03-01 | 0 | | 3 | 4 | 2016-07-03 | 5 |
โ€บ Output: 0.33
๐Ÿ’ก Note: Player 1 first logged in on 2016-03-01 and came back on 2016-03-02 (consecutive). Player 2 first logged in on 2017-06-25 and never returned. Player 3 first logged in on 2016-03-01 but next login was 2016-07-03 (not consecutive). Only 1 out of 3 players logged in consecutively, so 1/3 = 0.33.
example_2.py โ€” All Consecutive
$ Input: Activity table: | player_id | device_id | event_date | games_played | |-----------|-----------|------------|-------------| | 1 | 1 | 2020-01-01 | 3 | | 1 | 1 | 2020-01-02 | 4 | | 2 | 2 | 2020-01-05 | 2 | | 2 | 2 | 2020-01-06 | 1 |
โ€บ Output: 1.00
๐Ÿ’ก Note: Player 1 first login: 2020-01-01, second login: 2020-01-02 (consecutive โœ“). Player 2 first login: 2020-01-05, second login: 2020-01-06 (consecutive โœ“). Both players logged in consecutively: 2/2 = 1.00.
example_3.py โ€” No Consecutive Logins
$ Input: Activity table: | player_id | device_id | event_date | games_played | |-----------|-----------|------------|-------------| | 1 | 1 | 2021-01-01 | 1 | | 2 | 2 | 2021-01-02 | 2 | | 3 | 3 | 2021-01-03 | 1 |
โ€บ Output: 0.00
๐Ÿ’ก Note: Each player only has one login record, so none of them logged in on consecutive days after their first login. Result: 0/3 = 0.00.

Visualization

Tap to expand
Game Player Retention AnalysisMeasuring Day-1 Retention Rate๐Ÿ“… Player 1First: Jan 1Second: Jan 2โœ“๐Ÿ“… Player 2First: Jan 5Second: Jan 10โœ—๐Ÿ“… Player 3First: Jan 3Second: Jan 4โœ“Retention CalculationConsecutive Day Players: 2Total Players: 3Result: 2/3 = 0.67 (67%)๐Ÿ’ก Business Insight67% Day-1 retention is excellent!
Understanding the Visualization
1
Identify First Visits
Find each player's earliest login date using MIN() or ROW_NUMBER()
2
Check Next Day Returns
For each first visit, verify if the player logged in exactly one day later
3
Calculate Retention Rate
Divide consecutive-day players by total unique players
4
Format Result
Round to 2 decimal places for presentation
Key Takeaway
๐ŸŽฏ Key Insight: Use window functions to efficiently rank player logins, then self-join to identify consecutive day patterns. This approach scales well and leverages database optimizations for the best performance.

Time & Space Complexity

Time Complexity
โฑ๏ธ
O(n log n)

Single pass with window function and join, optimized by database engine

n
2n
โšก Linearithmic
Space Complexity
O(n)

Temporary space for window function computation

n
2n
โšก Linearithmic Space

Constraints

  • 1 โ‰ค Activity table rows โ‰ค 104
  • 1 โ‰ค player_id, device_id โ‰ค 104
  • event_date is in YYYY-MM-DD format
  • 0 โ‰ค games_played โ‰ค 1000
  • (player_id, event_date) is the primary key of the table
Asked in
Meta 45 Amazon 38 Google 32 Microsoft 28
89.2K Views
High Frequency
~18 min Avg. Time
2.3K Likes
Ln 1, Col 1
Smart Actions
๐Ÿ’ก Explanation
AI Ready
๐Ÿ’ก Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen