Users With Two Purchases Within Seven Days - Problem

Imagine you're working as a data analyst for a major e-commerce platform, and the marketing team wants to identify frequent shoppers to target with special loyalty programs. Your task is to find users who show consistent purchasing behavior by making at least two purchases within a 7-day window.

Given a Purchases table with purchase records, you need to identify all users who made any two purchases at most 7 days apart. This could help the business understand customer buying patterns and reward loyal customers.

Table: Purchases

+---------------+------+
| Column Name   | Type |
+---------------+------+
| purchase_id   | int  |
| user_id       | int  |
| purchase_date | date |
+---------------+------+

Where purchase_id contains unique values, and each row represents a purchase made by a user on a specific date.

Goal: Return the user_ids of all users who made any two purchases with at most 7 days between them, ordered by user_id.

Input & Output

example_1.sql โ€” Basic Case
$ Input: Purchases table:\n| purchase_id | user_id | purchase_date |\n|-------------|---------|---------------|\n| 4 | 2 | 2022-03-13 |\n| 1 | 1 | 2022-02-11 |\n| 2 | 1 | 2022-02-17 |\n| 6 | 3 | 2022-03-13 |\n| 5 | 3 | 2022-03-20 |\n| 3 | 1 | 2022-02-24 |
โ€บ Output: | user_id |\n|---------|\n| 1 |\n| 3 |
๐Ÿ’ก Note: User 1 has purchases on 2022-02-11 and 2022-02-17 (6 days apart โ‰ค 7). User 3 has purchases on 2022-03-13 and 2022-03-20 (7 days apart โ‰ค 7). User 2 has only one purchase, so doesn't qualify.
example_2.sql โ€” Edge Case: Exactly 7 Days
$ Input: Purchases table:\n| purchase_id | user_id | purchase_date |\n|-------------|---------|---------------|\n| 1 | 1 | 2023-01-01 |\n| 2 | 1 | 2023-01-08 |\n| 3 | 2 | 2023-01-01 |\n| 4 | 2 | 2023-01-09 |
โ€บ Output: | user_id |\n|---------|\n| 1 |
๐Ÿ’ก Note: User 1 has purchases exactly 7 days apart (2023-01-01 to 2023-01-08), which qualifies. User 2 has purchases 8 days apart, which exceeds the 7-day limit.
example_3.sql โ€” Multiple Valid Pairs
$ Input: Purchases table:\n| purchase_id | user_id | purchase_date |\n|-------------|---------|---------------|\n| 1 | 1 | 2023-01-01 |\n| 2 | 1 | 2023-01-03 |\n| 3 | 1 | 2023-01-05 |\n| 4 | 1 | 2023-01-20 |
โ€บ Output: | user_id |\n|---------|\n| 1 |
๐Ÿ’ก Note: User 1 appears only once in the result despite having multiple valid pairs: (Jan 1, Jan 3) = 2 days, (Jan 3, Jan 5) = 2 days. We return distinct user_ids.

Constraints

  • 1 โ‰ค Number of purchases โ‰ค 105
  • 1 โ‰ค user_id โ‰ค 104
  • purchase_date is a valid date between 2020-01-01 and 2024-12-31
  • All purchase_id values are unique
  • A user can have multiple purchases on the same date

Visualization

Tap to expand
Customer Purchase Timeline AnalysisUser 1 Timeline:Jan 1Jan 5Jan 204 days โœ…15 days โŒUser 2 Timeline:Jan 10Only one purchase - No comparison possibleUser 3 Timeline:Mar 13Mar 207 days โœ…LAG Function MagicFor each purchase, LAG gets theprevious purchase date:โ€ข User 1, Jan 5: LAG = Jan 1 (4 days)โ€ข User 1, Jan 20: LAG = Jan 5 (15 days)โ€ข User 3, Mar 20: LAG = Mar 13 (7 days)Result: Users 1 and 3 qualify!
Understanding the Visualization
1
Organize Timeline
Sort all purchases by user and date to create chronological timelines
2
Look Back
For each purchase, use LAG to see the previous purchase date for that user
3
Measure Gap
Calculate the number of days between consecutive purchases
4
Filter Loyalists
Keep users where any consecutive purchases are โ‰ค7 days apart
Key Takeaway
๐ŸŽฏ Key Insight: Instead of comparing every purchase with every other purchase (O(nยฒ)), we sort chronologically and only check consecutive purchases using LAG window function, achieving O(n log n) efficiency.
Asked in
Amazon 45 Meta 38 Google 32 Microsoft 28
28.4K Views
High Frequency
~15 min Avg. Time
1.2K Likes
Ln 1, Col 1
Smart Actions
๐Ÿ’ก Explanation
AI Ready
๐Ÿ’ก Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen