Fix Product Name Format - Problem

๐Ÿ›๏ธ Sales Data Cleanup Challenge

You're working as a data analyst for an e-commerce company, and you've discovered that the sales database from the year 2000 is a complete mess! ๐Ÿ˜ฑ

The Sales table contains product sales data, but since it was filled manually, the product names are inconsistent:

  • Some have extra spaces at the beginning or end
  • Some are in UPPERCASE, others in lowercase, and some are MiXeD cAsE
  • You need to clean this data and generate a monthly sales report

Your mission: Transform messy product names into a clean, standardized format and count how many times each product was sold per month.

Column NameType
sale_idint
product_namevarchar
sale_datedate

Output Requirements:

  • Product names should be lowercase and trimmed of whitespace
  • Dates should be formatted as 'YYYY-MM'
  • Include the total count of sales for each product per month
  • Results ordered by product_name ASC, then sale_date ASC

Input & Output

example_1.sql โ€” Basic Product Name Cleanup
$ Input: Sales table: +----------+----------------+------------+ | sale_id | product_name | sale_date | +----------+----------------+------------+ | 1 | ' iPhone ' | 2000-01-01 | | 2 | 'IPHONE' | 2000-01-15 | | 3 | ' Samsung ' | 2000-02-01 | | 4 | 'iphone' | 2000-01-20 | | 5 | 'SAMSUNG' | 2000-02-10 | +----------+----------------+------------+
โ€บ Output: +----------------+------------+-------+ | product_name | sale_date | total | +----------------+------------+-------+ | 'iphone' | '2000-01' | 3 | | 'samsung' | '2000-02' | 2 | +----------------+------------+-------+
๐Ÿ’ก Note: Product names are cleaned (lowercased and trimmed), dates are formatted to YYYY-MM, and sales are counted per product per month. 'iPhone' variants (3 sales) are grouped together, and 'Samsung' variants (2 sales) are grouped together.
example_2.sql โ€” Multiple Months Same Product
$ Input: Sales table: +----------+----------------+------------+ | sale_id | product_name | sale_date | +----------+----------------+------------+ | 1 | ' MacBook ' | 2000-01-05 | | 2 | 'MACBOOK' | 2000-01-15 | | 3 | ' macbook' | 2000-02-01 | | 4 | 'MacBook Pro' | 2000-01-20 | +----------+----------------+------------+
โ€บ Output: +----------------+------------+-------+ | product_name | sale_date | total | +----------------+------------+-------+ | 'macbook' | '2000-01' | 2 | | 'macbook' | '2000-02' | 1 | | 'macbook pro' | '2000-01' | 1 | +----------------+------------+-------+
๐Ÿ’ก Note: Same product 'macbook' appears in different months, so it gets separate entries. 'MacBook Pro' is treated as a different product. Results are ordered by product name first, then by date.
example_3.sql โ€” Edge Case with Extreme Whitespace
$ Input: Sales table: +----------+-------------------------+------------+ | sale_id | product_name | sale_date | +----------+-------------------------+------------+ | 1 | '\t\t XBOX \n ' | 2000-03-01 | | 2 | ' Xbox ' | 2000-03-15 | | 3 | ' ' | 2000-03-20 | | 4 | 'PlayStation' | 2000-03-01 | +----------+-------------------------+------------+
โ€บ Output: +----------------+------------+-------+ | product_name | sale_date | total | +----------------+------------+-------+ | '' | '2000-03' | 1 | | 'playstation' | '2000-03' | 1 | | 'xbox' | '2000-03' | 2 | +----------------+------------+-------+
๐Ÿ’ก Note: Extreme whitespace (tabs, newlines, multiple spaces) is handled by TRIM function. Empty string after trimming is treated as a valid product name. Xbox variants are properly grouped together.

Visualization

Tap to expand
๐Ÿญ Data Cleaning Assembly LineRaw Input' iPhone ''SAMSUNG'LOWER()TRIM()'iphone'DATE_FORMAT2000-01-15โ†“'2000-01'GROUP BY+ COUNT(*)ORDER BYname, date๐Ÿ“Š Final Clean Reportiphone | 2000-01 | 3iphone | 2000-02 | 1samsung | 2000-01 | 2โœจ Clean, standardized, and properly counted! โœจ๐Ÿš€ Single SQL Query = Maximum Efficiency!Database engine handles all optimizations automatically
Understanding the Visualization
1
Input Raw Data
Messy sales records with inconsistent product names and full dates
2
Clean Product Names
Apply LOWER() and TRIM() to standardize all product names
3
Extract Month
Format dates to YYYY-MM for monthly grouping
4
Group & Count
Use GROUP BY to aggregate sales counts per product per month
5
Sort Results
ORDER BY product name, then date for consistent output
Key Takeaway
๐ŸŽฏ Key Insight: SQL's GROUP BY with aggregate functions transforms messy data into clean reports in a single optimized query, leveraging database engine optimizations for maximum performance.

Time & Space Complexity

Time Complexity
โฑ๏ธ
O(n log n)

Database engine optimizes grouping and sorting, typically O(n log n) for sorting

n
2n
โšก Linearithmic
Space Complexity
O(k)

Where k is the number of unique product-month combinations (much smaller than n)

n
2n
โœ“ Linear Space

Constraints

  • 1 โ‰ค number of sales records โ‰ค 105
  • 1 โ‰ค product_name.length โ‰ค 100
  • product_name may contain leading/trailing whitespace and mixed case
  • sale_date is in valid YYYY-MM-DD format
  • All sale_date values are from the year 2000
Asked in
Amazon 45 Microsoft 38 Google 32 Meta 28
42.3K Views
High Frequency
~15 min Avg. Time
1.8K Likes
Ln 1, Col 1
Smart Actions
๐Ÿ’ก Explanation
AI Ready
๐Ÿ’ก Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen