Unique Orders and Customers Per Month - Problem

You're working as a data analyst for an e-commerce platform and need to generate monthly business reports. Your task is to analyze the order patterns and customer engagement for high-value transactions.

Given a table Orders with the following structure:

Column NameType
order_idint
order_datedate
customer_idint
invoiceint

Note: order_id is unique for each row.

Your Mission: Write a SQL solution to find for each month:

  • ๐Ÿ“ฆ The number of unique orders with invoice amount > $20
  • ๐Ÿ‘ฅ The number of unique customers with invoice amount > $20

This helps the business understand both order volume and customer engagement trends for high-value transactions. The result should be grouped by month and can be returned in any order.

Input & Output

basic_example.sql โ€” Basic Monthly Analysis
$ Input: Orders table:\n| order_id | order_date | customer_id | invoice |\n|----------|------------|-------------|---------|\n| 1 | 2023-01-15 | 101 | 25 |\n| 2 | 2023-01-20 | 102 | 30 |\n| 3 | 2023-01-25 | 101 | 15 |\n| 4 | 2023-02-10 | 103 | 40 |\n| 5 | 2023-02-15 | 102 | 35 |
โ€บ Output: | month | unique_orders | unique_customers |\n|---------|---------------|------------------|\n| 2023-01 | 2 | 2 |\n| 2023-02 | 2 | 2 |
๐Ÿ’ก Note: For January 2023: Orders 1 and 2 have invoice > 20 (orders 1,2 from customers 101,102). Order 3 is excluded (invoice = 15). For February 2023: Both orders 4 and 5 qualify, representing customers 103 and 102.
duplicate_customers.sql โ€” Customer Appearing Multiple Times
$ Input: Orders table:\n| order_id | order_date | customer_id | invoice |\n|----------|------------|-------------|---------|\n| 1 | 2023-01-05 | 101 | 25 |\n| 2 | 2023-01-15 | 101 | 30 |\n| 3 | 2023-01-25 | 101 | 22 |\n| 4 | 2023-01-28 | 102 | 45 |
โ€บ Output: | month | unique_orders | unique_customers |\n|---------|---------------|------------------|\n| 2023-01 | 4 | 2 |
๐Ÿ’ก Note: All 4 orders have invoice > 20, so unique_orders = 4. However, customer 101 appears in 3 orders and customer 102 in 1 order, so unique_customers = 2 (only count each customer once per month).
edge_case_filtering.sql โ€” Edge Case with Filtering
$ Input: Orders table:\n| order_id | order_date | customer_id | invoice |\n|----------|------------|-------------|---------|\n| 1 | 2023-03-10 | 201 | 20 |\n| 2 | 2023-03-15 | 202 | 21 |\n| 3 | 2023-04-20 | 203 | 19 |\n| 4 | 2023-04-25 | 204 | 50 |
โ€บ Output: | month | unique_orders | unique_customers |\n|---------|---------------|------------------|\n| 2023-03 | 1 | 1 |\n| 2023-04 | 1 | 1 |
๐Ÿ’ก Note: Order 1 has invoice = 20 (not > 20, so excluded). Order 2 qualifies for March. Order 3 has invoice = 19 (excluded). Only order 4 qualifies for April. Note that months with no qualifying orders don't appear in results.

Constraints

  • 1 โ‰ค number of orders โ‰ค 105
  • order_date is in YYYY-MM-DD format
  • 1 โ‰ค order_id, customer_id โ‰ค 106
  • 0 โ‰ค invoice โ‰ค 104
  • All order_ids are unique
  • Results should include only months with at least one order where invoice > 20

Visualization

Tap to expand
๐Ÿ“Š Monthly High-Value Order Analysis๐Ÿ“‹ Raw OrdersOrder 1: $25 (Jan)Order 2: $15 (Jan)Order 3: $30 (Feb)๐Ÿ” Filter > $20Order 1: $25 (Jan)Order 3: $30 (Feb)$15 order filtered out๐Ÿ“… January 2023Orders: 1Customers: 1๐Ÿ“… February 2023Orders: 1Customers: 1๐ŸŽฏ SQL Query BreakdownSELECT DATE_FORMAT(order_date, '%Y-%m') as month, COUNT(DISTINCT order_id) as unique_orders, COUNT(DISTINCT customer_id) as unique_customersFROM Orders WHERE invoice > 20GROUP BY DATE_FORMAT(order_date, '%Y-%m')
Understanding the Visualization
1
Filter Premium Orders
Just like sorting receipts for orders over $20, we filter invoice > 20
2
Group by Month
Organize filtered orders into monthly piles, like sorting receipts by month
3
Count Unique Orders
For each month pile, count distinct order receipts
4
Count Unique Customers
For each month pile, count distinct customer IDs (customers may have multiple orders)
Key Takeaway
๐ŸŽฏ Key Insight: Using GROUP BY with COUNT DISTINCT in a single SQL query is far more efficient than multiple separate queries, letting the database engine optimize the aggregation and counting operations.
Asked in
Amazon 45 Google 38 Meta 32 Microsoft 28 Apple 22
42.3K Views
High Frequency
~15 min Avg. Time
1.8K Likes
Ln 1, Col 1
Smart Actions
๐Ÿ’ก Explanation
AI Ready
๐Ÿ’ก Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen