The Latest Login in 2020 - Problem
You're tasked with analyzing user login data for a busy web application. Given a Logins table that tracks every user login with timestamps, you need to find the most recent login for each user specifically during the year 2020.
The Logins table structure:
| Column Name | Type |
|---|---|
| user_id | int |
| time_stamp | datetime |
Important: Only include users who actually logged in during 2020 - ignore users who didn't log in that year at all.
Goal: Write a SQL query that returns each user's latest login timestamp from 2020. The results can be in any order.
Input & Output
example_1.sql โ Basic Example
$
Input:
Logins table:
| user_id | time_stamp |
|---------|--------------------|
| 6 | 2020-06-30 15:06:07|
| 6 | 2021-04-21 14:03:06|
| 6 | 2019-03-07 00:18:15|
| 8 | 2020-02-01 05:10:53|
| 8 | 2020-12-30 00:46:50|
| 2 | 2020-01-16 02:49:50|
| 2 | 2019-08-25 07:59:08|
โบ
Output:
| user_id | last_login |
|---------|--------------------|
| 6 | 2020-06-30 15:06:07|
| 8 | 2020-12-30 00:46:50|
| 2 | 2020-01-16 02:49:50|
๐ก Note:
User 6 logged in twice in 2020 (June 30) and other years, so we take their latest 2020 login. User 8 had two 2020 logins, we take December 30. User 2 had one 2020 login in January. All other login years are ignored.
example_2.sql โ Single Login Per User
$
Input:
Logins table:
| user_id | time_stamp |
|---------|--------------------|
| 1 | 2020-03-15 10:30:00|
| 2 | 2020-07-22 14:15:30|
| 3 | 2020-11-05 09:45:12|
| 1 | 2019-12-01 08:00:00|
| 2 | 2021-01-10 16:30:00|
โบ
Output:
| user_id | last_login |
|---------|--------------------|
| 1 | 2020-03-15 10:30:00|
| 2 | 2020-07-22 14:15:30|
| 3 | 2020-11-05 09:45:12|
๐ก Note:
Each user has exactly one login in 2020, so those are their latest (and only) 2020 logins. User 1's 2019 login and User 2's 2021 login are excluded.
example_3.sql โ No 2020 Logins Edge Case
$
Input:
Logins table:
| user_id | time_stamp |
|---------|--------------------|
| 1 | 2019-12-31 23:59:59|
| 1 | 2021-01-01 00:00:01|
| 2 | 2018-05-15 12:30:00|
| 2 | 2022-03-10 14:20:15|
โบ
Output:
| user_id | last_login |
|---------|------------|
| | |
๐ก Note:
No users logged in during 2020, so the result set is empty. User 1 logged in on Dec 31, 2019 and Jan 1, 2021, but neither is in 2020. User 2 logged in 2018 and 2022, also outside our target year.
Constraints
- 1 โค Number of login records โค 104
- Each (user_id, time_stamp) combination is unique
- time_stamp is a valid datetime value
- user_id is a positive integer
Visualization
Tap to expand
Understanding the Visualization
1
Filter by Year
Remove all login records that aren't from 2020, like filtering mail by postmark year
2
Group by User
Organize remaining records by user_id, like sorting mail into piles by recipient
3
Find Latest in Each Group
For each user's pile, pick the record with the maximum timestamp
4
Return Results
Output each user with their latest 2020 login timestamp
Key Takeaway
๐ฏ Key Insight: GROUP BY with MAX() leverages database optimization to process each record exactly once, eliminating the need for nested queries or multiple passes through the data.
๐ก
Explanation
AI Ready
๐ก Suggestion
Tab
to accept
Esc
to dismiss
// Output will appear here after running code