Analyze User Website Visit Pattern - Problem

Imagine you're a data analyst at a major tech company, tasked with understanding user behavior patterns on your platform. You have access to user browsing logs and need to identify the most popular 3-website sequences that users follow.

Given three arrays of equal length:

  • username - Array of usernames
  • website - Array of websites visited
  • timestamp - Array of visit timestamps

Each triplet [username[i], website[i], timestamp[i]] represents a single visit event.

Your task is to find the most popular 3-website pattern - a sequence of exactly 3 websites that users visit in chronological order (not necessarily consecutively). The pattern's score is the number of unique users who visited all 3 websites in that exact order.

Goal: Return the pattern with the highest score. If there's a tie, return the lexicographically smallest pattern.

Example: If pattern ["home", "about", "career"] has score 5, it means 5 different users visited "home", then later "about", then later "career" (with possible other visits in between).
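The scoring rule above can be made concrete with a small sketch. It assumes each user's visits are already listed in timestamp order; the helper names `follows_pattern` and `pattern_score` are hypothetical and not part of the problem statement.

```python
from typing import Dict, List, Sequence


def follows_pattern(visits: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `visits` in order, gaps allowed."""
    it = iter(visits)
    # `site in it` consumes the iterator up to the first match, so each
    # later element of the pattern must be found after the previous one.
    return all(site in it for site in pattern)


def pattern_score(visits_by_user: Dict[str, List[str]],
                  pattern: Sequence[str]) -> int:
    """Count the distinct users whose visits contain `pattern` in order."""
    return sum(follows_pattern(v, pattern) for v in visits_by_user.values())


# Hypothetical data: visits per user, already sorted by timestamp.
visits_by_user = {
    "joe": ["home", "about", "career"],
    "james": ["home", "cart", "maps", "home"],
    "mary": ["home", "news", "about", "career"],
}
score = pattern_score(visits_by_user, ["home", "about", "career"])
print(score)  # 2 (joe and mary; mary's extra "news" visit does not matter)
```

Each user counts at most once per pattern, no matter how many times their history contains it.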

Input & Output

example_1.py — Basic Pattern Analysis
$ Input: username = ["joe","joe","joe","james","james","james","james","mary","mary","mary"] timestamp = [1,2,3,4,5,6,7,8,9,10] website = ["home","about","career","home","cart","maps","home","home","about","career"]
Output: ["home","about","career"]
💡 Note: Two users (joe and mary) followed the pattern [home→about→career]. User joe visited: home(1)→about(2)→career(3). User mary visited: home(8)→about(9)→career(10). User james visited: home(4)→cart(5)→maps(6)→home(7), so every pattern he generates has score 1. The pattern [home→about→career] therefore has the highest score, 2.
example_2.py — Lexicographic Ordering
$ Input: username = ["ua","ua","ua","ub","ub","ub"] timestamp = [1,2,3,4,5,6] website = ["a","b","a","a","b","c"]
Output: ["a","b","a"]
💡 Note: User ua visits: a(1)→b(2)→a(3), generating pattern [a,b,a]. User ub visits: a(4)→b(5)→c(6), generating pattern [a,b,c]. Both patterns have count 1, but [a,b,a] is lexicographically smaller than [a,b,c], so we return [a,b,a].
example_3.py — Multiple Valid Patterns Per User
$ Input: username = ["alice","alice","alice","alice"] timestamp = [1,2,3,4] website = ["home","shop","blog","news"]
Output: ["home","blog","news"]
💡 Note: Single user alice generates four patterns: [home,shop,blog], [home,shop,news], [home,blog,news], [shop,blog,news]. Each has score 1, so the tie is broken lexicographically. Three patterns start with "home", and among those "blog" sorts before "shop", so [home,blog,news] is returned.
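As a quick check of example 3, alice's patterns can be enumerated directly. This is a minimal sketch using `itertools.combinations`, which keeps elements in their input order, so every tuple it yields respects the chronological order of the visits.

```python
from itertools import combinations

# alice's visits from example 3, already in timestamp order
visits = ["home", "shop", "blog", "news"]

# Every ordered 3-website pattern, deduplicated and sorted lexicographically.
patterns = sorted(set(combinations(visits, 3)))
for p in patterns:
    print(list(p))
# ['home', 'blog', 'news']
# ['home', 'shop', 'blog']
# ['home', 'shop', 'news']
# ['shop', 'blog', 'news']

smallest = list(patterns[0])
```

Python compares tuples of strings element by element, which matches the lexicographic tie-break in the problem statement.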

Visualization

Website Visit Pattern Analysis Pipeline

  • Raw Data: (joe, 1, home), (joe, 2, about), (mary, 8, home), ...
  • Sort & Group: joe: [home, about, career]; mary: [home, about, career]; james: [home, cart, maps]
  • Generate: joe: {[h,a,c]}; mary: {[h,a,c]}; james: {[h,c,m]}
  • Count: [h,a,c]: 2 users; [h,c,m]: 1 user. Winner: [h,a,c]

Key Insights:

  • Must preserve chronological order within each user's visits
  • Use sets to avoid counting duplicate patterns per user
  • Lexicographic ordering breaks ties between equally frequent patterns
  • Pattern can have repeated websites (e.g., [home, shop, home])

Example Walkthrough

Input: joe visits home(1)→about(2)→career(3), mary visits home(8)→about(9)→career(10)
Process: Both users generate pattern [home,about,career]
Output: [home,about,career] with count=2 (highest frequency)

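Understanding the Visualization

  1. Collect Raw Data: Gather all visit logs as (user, timestamp, website) records.
  2. Sort & Group: Sort by timestamp, then group visits per user so each user's list stays in chronological order.
  3. Generate Patterns: For each user, generate every ordered 3-website sequence (three visits taken in chronological order, not necessarily adjacent) and keep each distinct pattern once per user.
  4. Count & Rank: Count how many users produced each pattern, then select the highest count, breaking ties by the lexicographically smallest pattern.
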
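The four steps above can be sketched in Python as follows. This is one possible implementation, not necessarily the page's reference solution, and the function name `most_visited_pattern` is an assumption.

```python
from collections import defaultdict
from itertools import combinations
from typing import List


def most_visited_pattern(username: List[str],
                         timestamp: List[int],
                         website: List[str]) -> List[str]:
    # Steps 1-2: sort all visits by timestamp, then group them per user,
    # so each user's list is in chronological order.
    visits_by_user = defaultdict(list)
    for _, user, site in sorted(zip(timestamp, username, website)):
        visits_by_user[user].append(site)

    # Step 3: generate each user's ordered 3-website patterns. The set
    # ensures a user contributes at most 1 to any pattern's score.
    score = defaultdict(int)
    for sites in visits_by_user.values():
        for pattern in set(combinations(sites, 3)):
            score[pattern] += 1

    # Step 4: highest score first, lexicographically smallest on ties.
    best = min(score, key=lambda p: (-score[p], p))
    return list(best)


result = most_visited_pattern(
    ["joe", "joe", "joe", "james", "james", "james", "james",
     "mary", "mary", "mary"],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    ["home", "about", "career", "home", "cart", "maps", "home",
     "home", "about", "career"],
)
print(result)  # ['home', 'about', 'career']
```

Sorting once globally before grouping avoids a separate sort per user. The `min` call relies on the guarantee that at least one user has three or more visits, so `score` is never empty.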
Key Takeaway
🎯 Key Insight: Efficient pattern analysis requires careful ordering and deduplication - group by user first, then systematically generate combinations while avoiding double-counting patterns per user.

Time & Space Complexity

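Time Complexity
O(N log N + U × V³)

Here N is the total number of visits, U the number of users, and V the largest number of visits made by a single user. Sorting the visits costs N log N; each of the U users then yields up to V³ ordered 3-website sequences. When a single user makes every visit (V = N), this becomes O(N³).

Space Complexity
O(N + P)

N for storing the grouped visits (each visit is stored once), P for the pattern frequency map.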
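To get a feel for the cubic term, the number of patterns a user can generate is at most the number of ways to choose 3 of their visits. A small sketch, assuming the upper bound of 50 total visits given in the constraints (the constant name `MAX_VISITS` is hypothetical):

```python
from math import comb

MAX_VISITS = 50  # assumed upper bound on username.length

# Worst case: one user makes all 50 visits.
single_user = comb(MAX_VISITS, 3)
print(single_user)  # 19600

# Splitting the same visits evenly across two users generates fewer patterns.
two_users = 2 * comb(MAX_VISITS // 2, 3)
print(two_users)  # 4600
```

At this input size, brute-force enumeration of every pattern is comfortably fast.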

Constraints

  • 3 ≤ username.length ≤ 50
  • 1 ≤ username[i].length ≤ 10
  • timestamp.length == username.length
  • 1 ≤ timestamp[i] ≤ 10⁹
  • website.length == username.length
  • 1 ≤ website[i].length ≤ 10
  • username[i] and website[i] consist of lowercase English letters
  • It is guaranteed that there is at least one user who visited at least 3 websites
  • All the tuples [username[i], timestamp[i], website[i]] are unique