Number of Transactions per Visit - Problem

Imagine you're working for a bank's analytics team, and you need to create a transaction frequency histogram to understand customer behavior patterns during visits.

You have access to two tables:

  • Visits: Records every time a user visits the bank (user_id, visit_date)
  • Transactions: Records every transaction made (user_id, transaction_date, amount)

Your task is to analyze how many transactions customers typically make per visit. The bank wants a complete picture showing:

  • How many customers made 0 transactions during their visit
  • How many customers made 1 transaction during their visit
  • How many customers made 2 transactions during their visit
  • And so on, up to the maximum number of transactions made by any customer in a single visit

The result should be a histogram-style table with transactions_count (0 to max) and visits_count (number of customers who made that many transactions), ordered by transaction count.

Key Challenge: You must include all values from 0 to the maximum, even if no customers made that exact number of transactions (show 0 for visits_count in such cases).

Input & Output

example_1.sql โ€” Basic Case
$ Input: Visits: | user_id | visit_date | |---------|------------| | 1 | 2020-01-01 | | 2 | 2020-01-02 | | 12 | 2020-01-01 | | 19 | 2020-01-03 | | 1 | 2020-01-02 | | 2 | 2020-01-03 | | 1 | 2020-01-04 | Transactions: | user_id | transaction_date | amount | |---------|------------------|--------| | 1 | 2020-01-02 | 120 | | 2 | 2020-01-03 | 22 | | 2 | 2020-01-03 | 33 | | 19 | 2020-01-03 | 10 |
โ€บ Output: | transactions_count | visits_count | |--------------------|---------------| | 0 | 4 | | 1 | 2 | | 2 | 1 |
๐Ÿ’ก Note: User 1 has 3 visits: 01-01 (0 trans), 01-02 (1 trans), 01-04 (0 trans). User 2 has 2 visits: 01-02 (0 trans), 01-03 (2 trans). User 12 has 1 visit: 01-01 (0 trans). User 19 has 1 visit: 01-03 (1 trans). So we have 4 visits with 0 transactions, 2 visits with 1 transaction, and 1 visit with 2 transactions.
example_2.sql โ€” All Zero Transactions
$ Input: Visits: | user_id | visit_date | |---------|------------| | 1 | 2020-01-01 | | 2 | 2020-01-02 | | 3 | 2020-01-03 | Transactions: (empty table)
โ€บ Output: | transactions_count | visits_count | |--------------------|---------------| | 0 | 3 |
๐Ÿ’ก Note: All 3 users visited but made no transactions, so all visits have 0 transactions. The result only shows transactions_count = 0 with visits_count = 3.
example_3.sql โ€” High Transaction Count
$ Input: Visits: | user_id | visit_date | |---------|------------| | 1 | 2020-01-01 | | 1 | 2020-01-02 | Transactions: | user_id | transaction_date | amount | |---------|------------------|--------| | 1 | 2020-01-01 | 100 | | 1 | 2020-01-01 | 200 | | 1 | 2020-01-01 | 300 | | 1 | 2020-01-01 | 400 | | 1 | 2020-01-01 | 500 |
โ€บ Output: | transactions_count | visits_count | |--------------------|---------------| | 0 | 1 | | 1 | 0 | | 2 | 0 | | 3 | 0 | | 4 | 0 | | 5 | 1 |
๐Ÿ’ก Note: User 1 made 2 visits: 01-01 with 5 transactions and 01-02 with 0 transactions. The result must show all counts from 0 to 5, even though counts 1-4 have zero visits.

Visualization

Tap to expand
Bank Transaction Frequency AnalysisCustomer Visits๐Ÿ‘ค User 1: Jan 1, Jan 2๐Ÿ‘ค User 2: Jan 1, Jan 3๐Ÿ‘ค User 3: Jan 2Transactions๐Ÿ’ณ User 1, Jan 1: $100๐Ÿ’ณ User 2, Jan 1: $50, $75๐Ÿ’ณ User 2, Jan 3: $200Transactions per VisitUser 1: Jan 1 (1), Jan 2 (0)User 2: Jan 1 (2), Jan 3 (1)User 3: Jan 2 (0)Final Histogram0 trans2 visits1 trans2 visits2 trans1 visitโœจ Complete histogram from 0 to MAX transactionsKey Challenge: Include ALL counts even if zero visits occurred
Understanding the Visualization
1
Collect Visit Data
Every customer visit is recorded with date
2
Track Transactions
Each transaction links to a specific visit
3
Count Per Visit
Group transactions by user and visit date
4
Create Histogram
Generate complete frequency distribution
5
Fill Gaps
Include all counts from 0 to maximum, even if zero visits
Key Takeaway
๐ŸŽฏ Key Insight: Use CTEs with LEFT JOIN and range generation to create a complete histogram without gaps, ensuring every possible transaction count from 0 to maximum is represented in the final result.

Time & Space Complexity

Time Complexity
โฑ๏ธ
O(n)

Single pass through visits and transactions data with efficient aggregation and range generation.

n
2n
โœ“ Linear Growth
Space Complexity
O(m)

Where m is the maximum transaction count - only stores the final histogram results.

n
2n
โœ“ Linear Space

Constraints

  • 1 โ‰ค Number of visits โ‰ค 104
  • 1 โ‰ค Number of transactions โ‰ค 104
  • 1 โ‰ค user_id โ‰ค 100
  • All transaction dates are guaranteed to have corresponding visit records
  • The result must include all values from 0 to MAX(transactions_per_visit)
Asked in
Amazon 42 Google 38 Meta 35 Microsoft 29 Goldman Sachs 25
91.2K Views
High Frequency
~35 min Avg. Time
2.8K Likes
Ln 1, Col 1
Smart Actions
๐Ÿ’ก Explanation
AI Ready
๐Ÿ’ก Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen