Imagine you're working for a bank's analytics team, and you need to create a transaction frequency histogram to understand customer behavior patterns during visits.
You have access to two tables:
- Visits: Records every time a user visits the bank (user_id, visit_date)
- Transactions: Records every transaction made (user_id, transaction_date, amount)
Your task is to analyze how many transactions customers typically make per visit. The bank wants a complete picture showing:
- How many customers made 0 transactions during their visit
- How many customers made 1 transaction during their visit
- How many customers made 2 transactions during their visit
- And so on, up to the maximum number of transactions made by any customer in a single visit
The result should be a histogram-style table with transactions_count (0 to max) and visits_count (number of customers who made that many transactions), ordered by transaction count.
Key Challenge: You must include all values from 0 to the maximum, even if no customers made that exact number of transactions (show 0 for visits_count in such cases).
Input & Output
Visualization
Time & Space Complexity
Single pass through visits and transactions data with efficient aggregation and range generation.
Where m is the maximum transaction count - only stores the final histogram results.
Constraints
- 1 โค Number of visits โค 104
- 1 โค Number of transactions โค 104
- 1 โค user_id โค 100
- All transaction dates are guaranteed to have corresponding visit records
- The result must include all values from 0 to MAX(transactions_per_visit)