Fill Missing Data - Problem

You are working with a DataFrame containing product information from an e-commerce inventory system. The DataFrame has three columns: name (product name), quantity (stock quantity), and price (product price).

Due to data collection issues, some entries in the quantity column are missing (represented as NaN or None values). Your task is to clean the data by filling all missing values in the quantity column with 0.

This is a common data preprocessing step in machine learning and data analysis workflows.

Column Name    Type
name           object
quantity       int
price          int

Goal: Return the DataFrame with all missing quantity values replaced with 0.
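A minimal sketch of one way to meet this goal, assuming pandas is available. The function name `fill_missing_values` is a placeholder chosen for illustration, not something the problem statement prescribes.

```python
import pandas as pd


def fill_missing_values(products: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `products` with missing quantities replaced by 0."""
    result = products.copy()
    # fillna(0) replaces both None and NaN; existing values are untouched.
    result["quantity"] = result["quantity"].fillna(0)
    return result


products = pd.DataFrame({
    "name": ["Apple", "Banana", "Orange"],
    "quantity": [10, None, 5],
    "price": [1, 2, 3],
})
cleaned = fill_missing_values(products)
print(cleaned)
```

Working on a copy keeps the caller's DataFrame intact; assigning back to `products["quantity"]` directly would also satisfy the goal.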

Input & Output

example_1.py - Basic Missing Values
Input:
products = pd.DataFrame({
    'name': ['Apple', 'Banana', 'Orange'],
    'quantity': [10, None, 5],
    'price': [1, 2, 3]
})
Output:
     name  quantity  price
0   Apple        10      1
1  Banana         0      2
2  Orange         5      3
💡 Note: The missing value (None) in the second row's quantity column is replaced with 0, while other values remain unchanged.
example_2.py - Multiple Missing Values
Input:
products = pd.DataFrame({
    'name': ['Apple', 'Banana', 'Orange', 'Mango'],
    'quantity': [10, np.nan, 5, None],
    'price': [1, 2, 3, 4]
})
Output:
     name  quantity  price
0   Apple        10      1
1  Banana         0      2
2  Orange         5      3
3   Mango         0      4
💡 Note: Both np.nan and None values in the quantity column are replaced with 0. The function handles different representations of missing values.
example_3.py - No Missing Values
Input:
products = pd.DataFrame({
    'name': ['Apple', 'Banana'],
    'quantity': [10, 15],
    'price': [1, 2]
})
Output:
     name  quantity  price
0   Apple        10      1
1  Banana        15      2
💡 Note: When there are no missing values in the quantity column, the DataFrame remains unchanged. This shows that the method also works correctly on clean data.
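The examples above mix `None` and `np.nan`, and also cover a column with nothing missing. The short sketch below, which works on a standalone Series rather than the full DataFrame, illustrates that pandas treats both markers as missing and that clean data passes through unchanged.

```python
import numpy as np
import pandas as pd

quantity = pd.Series([10, np.nan, 5, None], name="quantity")

# Both None and np.nan are reported as missing by isna().
missing_mask = quantity.isna()
print(missing_mask.tolist())  # [False, True, False, True]

# A single fillna(0) call covers both representations.
filled = quantity.fillna(0)
print(filled.tolist())

# A column with no missing values comes back with the same contents.
clean = pd.Series([10, 15], name="quantity")
assert clean.fillna(0).equals(clean)
```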

Visualization

[Figure: Data Cleaning: Fill Missing Values Pipeline. Raw data (missing values) -> fillna(0) processing (vectorized operation) -> clean data (complete dataset). Before: Apple 10, Banana NaN, Orange 5, Mango None. After: Apple 10, Banana 0, Orange 5, Mango 0. Operation: products['quantity'].fillna(0)]
Understanding the Visualization
1. Identify Missing Data: Scan the quantity column to locate NaN/None values.
2. Apply fillna() Method: Use the pandas vectorized operation to replace all missing values.
3. Verify Data Types: Ensure the quantity column has an integer type after filling. A column that held NaN is stored as float64, so a cast such as astype(int) may be needed.
4. Return Clean DataFrame: Output the DataFrame with consistent, complete data.
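The four steps above can be sketched as follows. This is an illustrative walk-through on the data from the second example, not a required implementation; the variable name `missing_count` is introduced here only for demonstration.

```python
import numpy as np
import pandas as pd

products = pd.DataFrame({
    "name": ["Apple", "Banana", "Orange", "Mango"],
    "quantity": [10, np.nan, 5, None],
    "price": [1, 2, 3, 4],
})

# Step 1: count the missing entries in the quantity column.
missing_count = int(products["quantity"].isna().sum())

# Step 2: replace them with 0 in one vectorized call.
products["quantity"] = products["quantity"].fillna(0)

# Step 3: the column was stored as float64 because it held NaN,
# so cast it back to an integer type.
products["quantity"] = products["quantity"].astype(int)

# Step 4: the cleaned DataFrame is ready to return.
print(missing_count)
print(products)
```

The cast in step 3 is safe only after every missing value has been filled; calling `astype(int)` on a column that still contains NaN raises an error.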
Key Takeaway
🎯 Key Insight: Using pandas' `fillna()` method is an efficient and idiomatic approach because it runs as a single vectorized operation in compiled code, which is typically much faster than iterating over rows in Python for data cleaning tasks.
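To make the comparison concrete, the sketch below fills the same column twice: once with a manual Python loop and once with `fillna()`. It only checks that the results agree; actual speed differences depend on data size and environment and could be measured with the standard `timeit` module.

```python
import numpy as np
import pandas as pd

quantity = pd.Series([10, np.nan, 5, None])

# Manual iteration: inspect each value in Python and build a new list.
looped = pd.Series([0.0 if pd.isna(q) else q for q in quantity])

# Vectorized: one call that pandas applies to the whole column.
vectorized = quantity.fillna(0)

# Both approaches produce the same values.
assert looped.tolist() == vectorized.tolist()
```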

Time & Space Complexity

Time Complexity
O(n)

Linear scan through the column, executed in optimized compiled code. Running time grows linearly with the number of rows.
Space Complexity
O(1) in place, O(n) with a copy

fillna() returns a new column by default, which takes O(n) additional space. Filling in place avoids keeping a second copy of the column.
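The difference between returning a copy and filling in place can be seen in a small sketch. It assumes the DataFrame-level `fillna()` with a per-column dictionary and `inplace=True`; whether pandas avoids every internal copy in that mode depends on the pandas version and the column dtype, so treat the space figures as a guide rather than a guarantee.

```python
import pandas as pd

products = pd.DataFrame({
    "name": ["Apple", "Banana"],
    "quantity": [10, None],
    "price": [1, 2],
})

# Default behavior: fillna() returns a new Series and leaves
# the original column untouched.
filled_copy = products["quantity"].fillna(0)
assert int(products["quantity"].isna().sum()) == 1

# In-place variant: fill only the quantity column of `products` itself.
products.fillna({"quantity": 0}, inplace=True)
assert int(products["quantity"].isna().sum()) == 0
```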

Constraints

  • The DataFrame will always have exactly 3 columns: name, quantity, and price
  • 1 ≤ number of rows ≤ 10^5
  • Missing values in quantity column can be None, NaN, or other pandas null representations
  • Only the quantity column should be modified - other columns remain unchanged
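The constraints above can be checked with a small sketch. It uses a nullable `Int64` column as one example of another pandas null representation (`pd.NA`); the `before` snapshot and the `untouched` list are names introduced here for illustration.

```python
import pandas as pd

products = pd.DataFrame({
    "name": ["Apple", "Banana", "Orange"],
    # Nullable integer column: the missing entry is stored as pd.NA.
    "quantity": pd.array([10, None, 5], dtype="Int64"),
    "price": [1, 2, 3],
})

before = products.copy()
products["quantity"] = products["quantity"].fillna(0)

# The column layout is unchanged.
assert list(products.columns) == ["name", "quantity", "price"]

# Only the quantity column may differ from the original.
untouched = ["name", "price"]
pd.testing.assert_frame_equal(products[untouched], before[untouched])
assert products["quantity"].tolist() == [10, 0, 5]
```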