Active Users - Problem
Active Users Analytics
You're building a user engagement analytics system for a social media platform. Your task is to identify active users - those who demonstrate consistent engagement by logging in for 5 or more consecutive days.
Given two database tables:
Accounts Table: Contains user account information (id and name)
Logins Table: Contains login records (account id and login date)
Your goal is to write an SQL query that finds all users who have logged in for at least 5 consecutive days. Return their
Note: A user may log in multiple times per day, but we only care about the distinct login dates for consecutive day calculations.
You're building a user engagement analytics system for a social media platform. Your task is to identify active users - those who demonstrate consistent engagement by logging in for 5 or more consecutive days.
Given two database tables:
Accounts Table: Contains user account information (id and name)
Logins Table: Contains login records (account id and login date)
Your goal is to write an SQL query that finds all users who have logged in for at least 5 consecutive days. Return their
id and name, ordered by id.Note: A user may log in multiple times per day, but we only care about the distinct login dates for consecutive day calculations.
Input & Output
example_1.sql โ Basic consecutive logins
$
Input:
Accounts: [(1,'Winston'), (7,'Jonathan')]
Logins: [(7,'2020-05-30'), (1,'2020-05-30'), (7,'2020-05-31'), (7,'2020-06-01'), (7,'2020-06-02'), (7,'2020-06-03'), (7,'2020-06-04'), (1,'2020-06-01')]
โบ
Output:
[(7,'Jonathan')]
๐ก Note:
User 7 (Jonathan) logged in for 5 consecutive days from 2020-05-30 to 2020-06-03, making him an active user. User 1 (Winston) only has 2 separate login days, not meeting the 5 consecutive days requirement.
example_2.sql โ Multiple consecutive streaks
$
Input:
Accounts: [(1,'Alice'), (2,'Bob'), (3,'Charlie')]
Logins: [(1,'2023-01-01'), (1,'2023-01-02'), (1,'2023-01-03'), (1,'2023-01-04'), (1,'2023-01-05'), (2,'2023-01-01'), (2,'2023-01-03'), (2,'2023-01-05'), (3,'2023-01-10'), (3,'2023-01-11'), (3,'2023-01-12'), (3,'2023-01-13'), (3,'2023-01-14'), (3,'2023-01-15')]
โบ
Output:
[(1,'Alice'), (3,'Charlie')]
๐ก Note:
Alice has 5 consecutive days (Jan 1-5), Charlie has 6 consecutive days (Jan 10-15). Bob has non-consecutive login days, so he doesn't qualify as an active user.
example_3.sql โ Edge case with duplicates
$
Input:
Accounts: [(1,'John')]
Logins: [(1,'2023-01-01'), (1,'2023-01-01'), (1,'2023-01-02'), (1,'2023-01-03'), (1,'2023-01-04'), (1,'2023-01-05'), (1,'2023-01-05')]
โบ
Output:
[(1,'John')]
๐ก Note:
Even with duplicate login entries for the same day, John still has 5 consecutive unique login days (Jan 1-5), making him an active user. Duplicates are handled by using DISTINCT in the query.
Visualization
Tap to expand
Understanding the Visualization
1
Remove Duplicates
Get unique login dates per user (same person can't visit gym twice in one day)
2
Rank Dates
Apply DENSE_RANK to order login dates chronologically for each user
3
Group Consecutive
Calculate date groups by subtracting rank from date - consecutive dates get same group
4
Count & Filter
Count group sizes and keep users with 5+ consecutive login days
Key Takeaway
๐ฏ Key Insight: DENSE_RANK creates consistent group identifiers for consecutive date sequences, allowing efficient detection of active users in O(n log n) time
Time & Space Complexity
Time Complexity
O(n log n)
Single sort for DENSE_RANK operation, then linear grouping and filtering
โก Linearithmic
Space Complexity
O(n)
Space for intermediate results and window function processing
โก Linearithmic Space
Constraints
- 1 โค Accounts table size โค 500
- 1 โค Logins table size โค 1000
- 1 โค id โค 500
- All login_date values are valid dates
- A user may have multiple login records for the same date
๐ก
Explanation
AI Ready
๐ก Suggestion
Tab
to accept
Esc
to dismiss
// Output will appear here after running code