Find Invalid IP Addresses - Problem

You're working as a security analyst reviewing server logs to identify potentially malicious traffic. One crucial step is validating IP addresses in the access logs to ensure data quality and security.

Given a table logs containing server access information, you need to identify all invalid IPv4 addresses and count how many times each appears.

An IPv4 address is considered invalid if it meets ANY of these conditions:

  • Contains numbers greater than 255 in any octet (e.g., 192.168.1.300)
  • Has leading zeros in any octet (e.g., 192.168.01.1)
  • Has less or more than 4 octets (e.g., 192.168.1 or 192.168.1.1.1)

Your task is to return all invalid IP addresses along with their occurrence count, ordered by invalid_count and ip in descending order.

Input & Output

example_1.sql โ€” Basic Invalid IPs
$ Input: logs table: +--------+---------------+-------------+ | log_id | ip | status_code | +--------+---------------+-------------+ | 1 | 192.168.1.300 | 200 | | 2 | 192.168.01.1 | 404 | | 3 | 192.168.1.1 | 200 | | 4 | 192.168.1.300 | 500 | +--------+---------------+-------------+
โ€บ Output: +---------------+---------------+ | ip | invalid_count | +---------------+---------------+ | 192.168.1.300 | 2 | | 192.168.01.1 | 1 | +---------------+---------------+
๐Ÿ’ก Note: 192.168.1.300 appears twice and is invalid (300 > 255). 192.168.01.1 appears once and is invalid (leading zero in '01'). 192.168.1.1 is valid and not included.
example_2.sql โ€” Various Invalid Patterns
$ Input: logs table: +--------+-----------------+-------------+ | log_id | ip | status_code | +--------+-----------------+-------------+ | 1 | 192.168.1 | 200 | | 2 | 192.168.1.1.1 | 404 | | 3 | 192.168..1 | 403 | | 4 | 192.168.1 | 500 | +--------+-----------------+-------------+
โ€บ Output: +---------------+---------------+ | ip | invalid_count | +---------------+---------------+ | 192.168.1 | 2 | | 192.168.1.1.1 | 1 | | 192.168..1 | 1 | +---------------+---------------+
๐Ÿ’ก Note: 192.168.1 has only 3 octets (appears 2 times). 192.168.1.1.1 has 5 octets. 192.168..1 has empty octet between dots.
example_3.sql โ€” Edge Cases
$ Input: logs table: +--------+---------------+-------------+ | log_id | ip | status_code | +--------+---------------+-------------+ | 1 | 000.000.000.000| 200 | | 2 | 255.255.255.255| 404 | | 3 | 256.1.1.1 | 403 | +--------+---------------+-------------+
โ€บ Output: +---------------+---------------+ | ip | invalid_count | +---------------+---------------+ | 256.1.1.1 | 1 | | 000.000.000.000| 1 | +---------------+---------------+
๐Ÿ’ก Note: 000.000.000.000 is invalid due to leading zeros. 255.255.255.255 is valid (boundary case). 256.1.1.1 is invalid (256 > 255).

Constraints

  • 1 โ‰ค number of log entries โ‰ค 105
  • 1 โ‰ค log_id โ‰ค 109
  • 1 โ‰ค ip.length โ‰ค 15
  • 100 โ‰ค status_code โ‰ค 599
  • IP addresses may contain any characters, not just valid IPv4 format

Visualization

Tap to expand
IP Address Validation PipelineLog EntriesOctet Countโ‰  4?Leading ZeroCheckRange Check> 255?Counter& Sort192.168.1.300INVALID192.168.01.1INVALID192.168.1.1VALIDFinal Result (Sorted by count DESC, IP DESC):192.168.1.300 (count: 2), 192.168.01.1 (count: 1)
Understanding the Visualization
1
Input Stream
Log entries flow into the validation pipeline
2
Pattern Gates
Each IP passes through validation checkpoints
3
Classification
IPs are classified as valid or invalid
4
Counting
Invalid IPs are counted and sorted
Key Takeaway
๐ŸŽฏ Key Insight: SQL pattern matching allows parallel validation of all rules in O(n) time, making it far more efficient than sequential character-by-character parsing.
Asked in
Google 23 Amazon 31 Meta 18 Microsoft 15
43.6K Views
High Frequency
~25 min Avg. Time
1.8K Likes
Ln 1, Col 1
Smart Actions
๐Ÿ’ก Explanation
AI Ready
๐Ÿ’ก Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen