Find Category Recommendation Pairs - Problem
Amazon's Category Cross-Shopping Analytics Challenge
Amazon wants to identify cross-category shopping patterns to improve their recommendation engine. Given two tables containing customer purchase data and product information, your task is to find pairs of product categories that show strong customer overlap.
Your Mission:
• Find all category pairs where
• Count unique customers who purchased from both categories in each pair
• Only report pairs with at least 3 customers shopping across both categories
Input Tables:
Output: Category pairs ordered by customer count (descending), then alphabetically by category names.
This data helps Amazon understand which product categories naturally complement each other in customer shopping behavior!
Amazon wants to identify cross-category shopping patterns to improve their recommendation engine. Given two tables containing customer purchase data and product information, your task is to find pairs of product categories that show strong customer overlap.
Your Mission:
• Find all category pairs where
category1 < category2 (alphabetically)• Count unique customers who purchased from both categories in each pair
• Only report pairs with at least 3 customers shopping across both categories
Input Tables:
ProductPurchases: Contains user purchases with quantitiesProductInfo: Maps products to categories and pricesOutput: Category pairs ordered by customer count (descending), then alphabetically by category names.
This data helps Amazon understand which product categories naturally complement each other in customer shopping behavior!
Input & Output
example_1.sql — Basic Cross-Category Analysis
$
Input:
ProductPurchases:
| user_id | product_id | quantity |
|---------|------------|----------|
| 1 | 101 | 2 |
| 1 | 201 | 1 |
| 2 | 101 | 1 |
| 2 | 301 | 3 |
| 3 | 201 | 2 |
| 3 | 301 | 1 |
| 4 | 101 | 1 |
| 4 | 201 | 2 |
| 4 | 301 | 1 |
ProductInfo:
| product_id | category | price |
|------------|-------------|-------|
| 101 | Electronics | 299.99|
| 201 | Books | 19.99 |
| 301 | Clothing | 49.99 |
›
Output:
| category1 | category2 | customer_count |
|-------------|-------------|----------------|
| Books | Electronics | 2 |
| Books | Clothing | 2 |
💡 Note:
Users 1 and 4 bought from both Books and Electronics (count=2). Users 2, 3, and 4 represent cross-category shopping, but Electronics-Clothing only has user 4 (count=1 < 3), so it's filtered out.
example_2.sql — Multiple Category Overlaps
$
Input:
ProductPurchases:
| user_id | product_id | quantity |
|---------|------------|----------|
| 1 | 101 | 1 |
| 1 | 201 | 1 |
| 1 | 301 | 1 |
| 2 | 101 | 2 |
| 2 | 201 | 1 |
| 3 | 101 | 1 |
| 3 | 301 | 2 |
| 4 | 201 | 1 |
| 4 | 301 | 1 |
| 5 | 101 | 1 |
| 5 | 201 | 2 |
| 6 | 101 | 1 |
| 6 | 301 | 1 |
ProductInfo:
| product_id | category | price |
|------------|-------------|-------|
| 101 | Electronics | 199.99|
| 201 | Books | 24.99 |
| 301 | Clothing | 79.99 |
›
Output:
| category1 | category2 | customer_count |
|-------------|-------------|----------------|
| Books | Electronics | 3 |
| Clothing | Electronics | 3 |
💡 Note:
Books-Electronics: users 1, 2, 5 (count=3). Clothing-Electronics: users 1, 3, 6 (count=3). Books-Clothing: users 1, 4 (count=2 < 3, filtered out). Results ordered by count DESC, then alphabetically.
example_3.sql — Edge Case: No Valid Pairs
$
Input:
ProductPurchases:
| user_id | product_id | quantity |
|---------|------------|----------|
| 1 | 101 | 1 |
| 2 | 201 | 1 |
| 3 | 301 | 1 |
| 4 | 101 | 2 |
| 5 | 201 | 1 |
ProductInfo:
| product_id | category | price |
|------------|-------------|-------|
| 101 | Electronics | 399.99|
| 201 | Books | 14.99 |
| 301 | Clothing | 89.99 |
›
Output:
| category1 | category2 | customer_count |
|-------------|-------------|----------------|
💡 Note:
No users purchased from multiple categories, so no category pairs have cross-shopping customers. The result is empty since no pairs meet the minimum threshold of 3 customers.
Visualization
Tap to expand
Understanding the Visualization
1
Connect Data
JOIN purchases with product info to map users to categories they've shopped
2
Find Overlaps
Self-JOIN to identify users who purchased from multiple categories
3
Group & Count
GROUP BY category pairs and COUNT distinct cross-shopping customers
4
Filter & Rank
Keep pairs with 3+ customers, sort by popularity then alphabetically
Key Takeaway
🎯 Key Insight: Self-joining user-category data efficiently reveals cross-shopping patterns without nested loops, leveraging database indexing for optimal performance.
Time & Space Complexity
Time Complexity
O(P + U×C²)
P for initial join, then U users times C² category combinations, but optimized with indexing
✓ Linear Growth
Space Complexity
O(U×C)
Space for user-category relationships and intermediate join results
✓ Linear Space
Constraints
- 1 ≤ Number of purchases ≤ 105
- 1 ≤ user_id, product_id ≤ 104
- 1 ≤ quantity ≤ 100
- 1 ≤ Number of products ≤ 104
- 1 ≤ price ≤ 104.00
- Category names contain only lowercase letters and spaces
- Each (user_id, product_id) combination is unique in ProductPurchases
- Each product_id is unique in ProductInfo
💡
Explanation
AI Ready
💡 Suggestion
Tab
to accept
Esc
to dismiss
// Output will appear here after running code