Find Trending Hashtags II - Problem
๐ฅ Find Trending Hashtags II
You're tasked with analyzing Twitter data to discover the most popular hashtags trending in February 2024! Given a table of tweets, you need to extract all hashtags from the tweet content and identify the top 3 trending hashtags based on their frequency.
Table: Tweets
| Column Name | Type |
|---|---|
| user_id | int |
| tweet_id | int |
| tweet_date | date |
| tweet | varchar |
Where tweet_id is the primary key and each row represents a unique tweet posted in February 2024.
Your Mission:
- Extract all hashtags (words starting with #) from tweet content
- Count the frequency of each hashtag
- Return the top 3 most frequent hashtags
- Order results by count (descending), then by hashtag name (descending)
Note: A single tweet can contain multiple hashtags, and each occurrence should be counted separately!
Input & Output
example_1.sql โ Basic Hashtag Counting
$
Input:
Tweets table:
| user_id | tweet_id | tweet_date | tweet |
|---------|----------|------------|---------|
| 1 | 101 | 2024-02-15 | "Loving #AI and #MachineLearning today!" |
| 2 | 102 | 2024-02-16 | "#AI is the future! #TechNews rocks" |
| 3 | 103 | 2024-02-17 | "Great day for #TechNews and #Innovation" |
โบ
Output:
| hashtag | count |
|---------|-------|
| #ai | 2 |
| #technews | 2 |
| #machinelearning | 1 |
๐ก Note:
The hashtag #AI appears in 2 tweets (converted to lowercase), #TechNews appears 2 times, and #MachineLearning appears once. Results are ordered by count DESC, then hashtag name DESC.
example_2.sql โ Multiple Hashtags Per Tweet
$
Input:
Tweets table:
| user_id | tweet_id | tweet_date | tweet |
|---------|----------|------------|---------|
| 1 | 201 | 2024-02-10 | "#SuperBowl #Sports #NFL what a game!" |
| 2 | 202 | 2024-02-11 | "Love #Sports and #Football #SuperBowl" |
| 3 | 203 | 2024-02-12 | "#SuperBowl was amazing! #Sports" |
โบ
Output:
| hashtag | count |
|---------|-------|
| #superbowl | 3 |
| #sports | 3 |
| #nfl | 1 |
๐ก Note:
Both #SuperBowl and #Sports appear 3 times each. Since counts are equal, we order by hashtag name DESC, so #superbowl comes before #sports. #NFL appears once.
example_3.sql โ Edge Case with Punctuation
$
Input:
Tweets table:
| user_id | tweet_id | tweet_date | tweet |
|---------|----------|------------|---------|
| 1 | 301 | 2024-02-20 | "Check out #AI! It's #Amazing, truly #Revolutionary." |
| 2 | 302 | 2024-02-21 | "#AI and #Amazing tech everywhere!" |
| 3 | 303 | 2024-02-22 | "The #Future is #AI powered" |
โบ
Output:
| hashtag | count |
|---------|-------|
| #ai | 3 |
| #amazing | 2 |
| #revolutionary | 1 |
๐ก Note:
Hashtags followed by punctuation (!, ., ,) are correctly extracted. #AI appears 3 times, #Amazing 2 times, and other hashtags once each.
Visualization
Tap to expand
Understanding the Visualization
1
Data Ingestion
Collect all tweets from February 2024
2
Hashtag Extraction
Use regex to find all #hashtags in tweet content
3
Frequency Analysis
Count occurrences of each unique hashtag
4
Trend Ranking
Sort by frequency and select top 3 trending topics
Key Takeaway
๐ฏ Key Insight: The optimal solution leverages database-optimized regular expressions and Common Table Expressions to extract and count hashtags in a single efficient pass, making it perfect for real-time social media analytics at scale.
Time & Space Complexity
Time Complexity
O(n)
Single pass through tweets with efficient regex matching and aggregation
โ Linear Growth
Space Complexity
O(h)
Space proportional to number of unique hashtags found
โ Linear Space
Constraints
- 1 โค number of tweets โค 104
- 1 โค tweet length โค 280 characters
- All tweet_date values are valid dates in February 2024
- Hashtags are case-insensitive and must start with # followed by alphanumeric characters
- Each tweet may contain 0 to 10 hashtags
- Hashtag names can contain letters, numbers, and underscores only
๐ก
Explanation
AI Ready
๐ก Suggestion
Tab
to accept
Esc
to dismiss
// Output will appear here after running code