Game Play Analysis V - Problem
Game Play Analysis V is an advanced SQL problem that challenges you to calculate player retention rates for a gaming platform.

You're given an Activity table that tracks player login sessions:

Column NameType
player_idint
device_idint
event_datedate
games_playedint

Key Concepts:
โ€ข Install date: The first day a player ever logged in
โ€ข Day one retention: Percentage of players who return the day after installing

Your Mission: For each install date, calculate:
1. Total number of players who installed on that date
2. Day one retention rate (rounded to 2 decimal places)

This metric is crucial for game companies to measure user engagement and optimize their onboarding experience!

Input & Output

example_1.sql โ€” Basic Retention Analysis
$ Input: Activity table: +----------+-----------+------------+--------------+ | player_id| device_id | event_date | games_played | +----------+-----------+------------+--------------+ | 1 | 2 | 2016-03-01 | 5 | | 1 | 2 | 2016-03-02 | 6 | | 2 | 3 | 2017-06-25 | 1 | | 3 | 1 | 2016-03-01 | 0 | | 3 | 4 | 2016-07-03 | 5 | +----------+-----------+------------+--------------+
โ€บ Output: +------------+---------+----------------+ | install_date| installs| day1_retention | +------------+---------+----------------+ | 2016-03-01 | 2 | 0.50 | | 2016-07-03 | 1 | 0.00 | | 2017-06-25 | 1 | 0.00 | +------------+---------+----------------+
๐Ÿ’ก Note: On 2016-03-01: 2 players installed (players 1 and 3). Only player 1 returned on 2016-03-02, so retention = 1/2 = 0.50. Other install dates had 0% retention as no players returned the next day.
example_2.sql โ€” Perfect Retention
$ Input: Activity table: +----------+-----------+------------+--------------+ | player_id| device_id | event_date | games_played | +----------+-----------+------------+--------------+ | 1 | 1 | 2020-01-01 | 3 | | 1 | 1 | 2020-01-02 | 2 | | 2 | 2 | 2020-01-01 | 1 | | 2 | 2 | 2020-01-02 | 4 | +----------+-----------+------------+--------------+
โ€บ Output: +------------+---------+----------------+ | install_date| installs| day1_retention | +------------+---------+----------------+ | 2020-01-01 | 2 | 1.00 | +------------+---------+----------------+
๐Ÿ’ก Note: Both players installed on 2020-01-01 and both returned on 2020-01-02, resulting in perfect 100% day-1 retention.
example_3.sql โ€” Zero Retention Edge Case
$ Input: Activity table: +----------+-----------+------------+--------------+ | player_id| device_id | event_date | games_played | +----------+-----------+------------+--------------+ | 1 | 1 | 2021-01-15 | 2 | | 2 | 2 | 2021-01-15 | 0 | | 3 | 3 | 2021-01-16 | 1 | +----------+-----------+------------+--------------+
โ€บ Output: +------------+---------+----------------+ | install_date| installs| day1_retention | +------------+---------+----------------+ | 2021-01-15 | 2 | 0.00 | | 2021-01-16 | 1 | 0.00 | +------------+---------+----------------+
๐Ÿ’ก Note: Players installed but none returned the next day. All retention rates are 0.00, showing that even players with 0 games_played count as installs.

Constraints

  • 1 โ‰ค player_id, device_id โ‰ค 104
  • event_date is a valid date between 2016-02-28 and 2021-12-31
  • 0 โ‰ค games_played โ‰ค 1000
  • The combination (player_id, event_date) is unique
  • A player's install date is their first recorded event_date

Visualization

Tap to expand
Game Retention Analysis WorkflowSTEP 1Find InstallDatesROW_NUMBER() OVER(PARTITION BY player)STEP 2Self-JoinNext DayLEFT JOIN ActivityON date + 1STEP 3CalculateRetentionCOUNT(retained)/ COUNT(total)STEP 4FormatResultsROUND(rate, 2)ORDER BY dateExample: Restaurant Customer AnalysisDay 1 CustomersAlice, Bob, CarolFirst visit: Jan 1Total installs: 3Day 2 ReturnsAlice returns โœ“Bob, Carol absent โœ—Retained: 1Retention Rate1 retained รท 3 total= 0.333...Rounded: 0.33๐ŸŽฏ Key Insight: Window functions + self-joins = optimal retention analysis
Understanding the Visualization
1
Identify Install Dates
Find each player's first login date using window functions
2
Match Next-Day Activity
Self-join to find players who returned the day after installing
3
Calculate Retention
Group by install date and compute retention percentage
4
Format Results
Round to 2 decimal places and order by install date
Key Takeaway
๐ŸŽฏ Key Insight: Use window functions to identify install dates efficiently, then self-join with date arithmetic to calculate retention rates in a single optimized query.
Asked in
Meta 45 Google 38 Amazon 32 Microsoft 28 Netflix 22
52.0K Views
High Frequency
~25 min Avg. Time
1.6K Likes
Ln 1, Col 1
Smart Actions
๐Ÿ’ก Explanation
AI Ready
๐Ÿ’ก Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen