Find Trending Hashtags II - Problem

๐Ÿ”ฅ Find Trending Hashtags II

You're tasked with analyzing Twitter data to discover the most popular hashtags trending in February 2024! Given a table of tweets, you need to extract all hashtags from the tweet content and identify the top 3 trending hashtags based on their frequency.

Table: Tweets

Column NameType
user_idint
tweet_idint
tweet_datedate
tweetvarchar

Where tweet_id is the primary key and each row represents a unique tweet posted in February 2024.

Your Mission:

  • Extract all hashtags (words starting with #) from tweet content
  • Count the frequency of each hashtag
  • Return the top 3 most frequent hashtags
  • Order results by count (descending), then by hashtag name (descending)

Note: A single tweet can contain multiple hashtags, and each occurrence should be counted separately!

Input & Output

example_1.sql โ€” Basic Hashtag Counting
$ Input: Tweets table: | user_id | tweet_id | tweet_date | tweet | |---------|----------|------------|---------| | 1 | 101 | 2024-02-15 | "Loving #AI and #MachineLearning today!" | | 2 | 102 | 2024-02-16 | "#AI is the future! #TechNews rocks" | | 3 | 103 | 2024-02-17 | "Great day for #TechNews and #Innovation" |
โ€บ Output: | hashtag | count | |---------|-------| | #ai | 2 | | #technews | 2 | | #machinelearning | 1 |
๐Ÿ’ก Note: The hashtag #AI appears in 2 tweets (converted to lowercase), #TechNews appears 2 times, and #MachineLearning appears once. Results are ordered by count DESC, then hashtag name DESC.
example_2.sql โ€” Multiple Hashtags Per Tweet
$ Input: Tweets table: | user_id | tweet_id | tweet_date | tweet | |---------|----------|------------|---------| | 1 | 201 | 2024-02-10 | "#SuperBowl #Sports #NFL what a game!" | | 2 | 202 | 2024-02-11 | "Love #Sports and #Football #SuperBowl" | | 3 | 203 | 2024-02-12 | "#SuperBowl was amazing! #Sports" |
โ€บ Output: | hashtag | count | |---------|-------| | #superbowl | 3 | | #sports | 3 | | #nfl | 1 |
๐Ÿ’ก Note: Both #SuperBowl and #Sports appear 3 times each. Since counts are equal, we order by hashtag name DESC, so #superbowl comes before #sports. #NFL appears once.
example_3.sql โ€” Edge Case with Punctuation
$ Input: Tweets table: | user_id | tweet_id | tweet_date | tweet | |---------|----------|------------|---------| | 1 | 301 | 2024-02-20 | "Check out #AI! It's #Amazing, truly #Revolutionary." | | 2 | 302 | 2024-02-21 | "#AI and #Amazing tech everywhere!" | | 3 | 303 | 2024-02-22 | "The #Future is #AI powered" |
โ€บ Output: | hashtag | count | |---------|-------| | #ai | 3 | | #amazing | 2 | | #revolutionary | 1 |
๐Ÿ’ก Note: Hashtags followed by punctuation (!, ., ,) are correctly extracted. #AI appears 3 times, #Amazing 2 times, and other hashtags once each.

Visualization

Tap to expand
๐Ÿฆ Twitter Trending Hashtags Analysis๐Ÿ“ฅ TWEETS"Love #AI today!""#AI #ML rocks""#TechNews here"๐Ÿ” EXTRACTREGEXP_EXTRACT_ALL#AI, #AI, #ML#TechNews๐Ÿ“Š COUNT#AI: 2#ML: 1#TechNews: 1๐Ÿ† TOP 3 TRENDING1. #AI (2 mentions)2. #TechNews (1 mention)3. #ML (1 mention)๐Ÿ’ก Key Insight: Single SQL query with regex extraction + aggregation
Understanding the Visualization
1
Data Ingestion
Collect all tweets from February 2024
2
Hashtag Extraction
Use regex to find all #hashtags in tweet content
3
Frequency Analysis
Count occurrences of each unique hashtag
4
Trend Ranking
Sort by frequency and select top 3 trending topics
Key Takeaway
๐ŸŽฏ Key Insight: The optimal solution leverages database-optimized regular expressions and Common Table Expressions to extract and count hashtags in a single efficient pass, making it perfect for real-time social media analytics at scale.

Time & Space Complexity

Time Complexity
โฑ๏ธ
O(n)

Single pass through tweets with efficient regex matching and aggregation

n
2n
โœ“ Linear Growth
Space Complexity
O(h)

Space proportional to number of unique hashtags found

n
2n
โœ“ Linear Space

Constraints

  • 1 โ‰ค number of tweets โ‰ค 104
  • 1 โ‰ค tweet length โ‰ค 280 characters
  • All tweet_date values are valid dates in February 2024
  • Hashtags are case-insensitive and must start with # followed by alphanumeric characters
  • Each tweet may contain 0 to 10 hashtags
  • Hashtag names can contain letters, numbers, and underscores only
Asked in
Meta 45 Twitter 38 Google 32 LinkedIn 28
38.5K Views
High Frequency
~25 min Avg. Time
1.5K Likes
Ln 1, Col 1
Smart Actions
๐Ÿ’ก Explanation
AI Ready
๐Ÿ’ก Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen