Python - How to drop the null rows from a Pandas DataFrame
To drop null (NaN) rows from a Pandas DataFrame, use the dropna() method. This method removes any row containing at least one null value by default.
Creating a DataFrame with Null Values
Let's create a sample DataFrame with some null values to demonstrate:
import pandas as pd
import numpy as np
# Create sample data with NaN values
data = {
    'Car': ['Audi', 'Porsche', 'RollsRoyce', 'BMW', 'Mercedes', 'Lamborghini', 'Audi', 'Mercedes', 'Lamborghini'],
    'Place': ['Bangalore', 'Mumbai', 'Pune', 'Delhi', 'Hyderabad', 'Chandigarh', 'Mumbai', 'Pune', 'Delhi'],
    'UnitsSold': [80.0, 110.0, np.nan, 200.0, 80.0, np.nan, np.nan, 120.0, 100.0]
}
dataFrame = pd.DataFrame(data)
print("Original DataFrame:")
print(dataFrame)
print(f"\nShape: {dataFrame.shape}")
Original DataFrame:
Car Place UnitsSold
0 Audi Bangalore 80.0
1 Porsche Mumbai 110.0
2 RollsRoyce Pune NaN
3 BMW Delhi 200.0
4 Mercedes Hyderabad 80.0
5 Lamborghini Chandigarh NaN
6 Audi Mumbai NaN
7 Mercedes Pune 120.0
8 Lamborghini Delhi 100.0
Shape: (9, 3)
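Before dropping anything, it can help to see how many nulls each column holds. The following is a minimal sketch on a smaller, made-up frame (the name sales and its values are assumptions for illustration, not part of the example above). It uses isna() to count nulls per column and to count the rows that dropna() would remove by default.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row frame, used only for this illustration
sales = pd.DataFrame({
    "Car": ["Audi", "Porsche", "RollsRoyce"],
    "UnitsSold": [80.0, np.nan, np.nan],
})

# Number of null values in each column
null_counts = sales.isna().sum()

# Number of rows that contain at least one null value
rows_with_null = sales.isna().any(axis=1).sum()

print(null_counts)
print(f"Rows with at least one null: {rows_with_null}")
```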
Using dropna() to Remove Null Rows
Calling dropna() with no arguments returns a new DataFrame without the rows that contain any null value; the original DataFrame is left unchanged:
import pandas as pd
import numpy as np
# Create sample data with NaN values
data = {
    'Car': ['Audi', 'Porsche', 'RollsRoyce', 'BMW', 'Mercedes', 'Lamborghini', 'Audi', 'Mercedes', 'Lamborghini'],
    'Place': ['Bangalore', 'Mumbai', 'Pune', 'Delhi', 'Hyderabad', 'Chandigarh', 'Mumbai', 'Pune', 'Delhi'],
    'UnitsSold': [80.0, 110.0, np.nan, 200.0, 80.0, np.nan, np.nan, 120.0, 100.0]
}
dataFrame = pd.DataFrame(data)
# Remove rows with null values
clean_dataFrame = dataFrame.dropna()
print("DataFrame after removing null values:")
print(clean_dataFrame)
print(f"\nUpdated Shape: {clean_dataFrame.shape}")
DataFrame after removing null values:
Car Place UnitsSold
0 Audi Bangalore 80.0
1 Porsche Mumbai 110.0
3 BMW Delhi 200.0
4 Mercedes Hyderabad 80.0
7 Mercedes Pune 120.0
8 Lamborghini Delhi 100.0
Updated Shape: (6, 3)
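Note that the surviving rows keep their original index labels (0, 1, 3, 4, 7, 8), so the index has gaps. If consecutive labels are wanted, one option is to chain reset_index(drop=True). The sketch below shows this on a shortened, made-up version of the data (the names df, clean and renumbered are illustrative).

```python
import numpy as np
import pandas as pd

# Shortened, hypothetical version of the car data
df = pd.DataFrame({
    "Car": ["Audi", "Porsche", "RollsRoyce", "BMW"],
    "UnitsSold": [80.0, 110.0, np.nan, 200.0],
})

# dropna() keeps the original index labels of the surviving rows
clean = df.dropna()
print(clean.index.tolist())

# reset_index(drop=True) renumbers the rows from 0 and discards the old labels
renumbered = clean.reset_index(drop=True)
print(renumbered.index.tolist())
```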
dropna() Parameters
The dropna() method accepts parameters such as how and subset that control which rows are dropped:
import pandas as pd
import numpy as np
# Create DataFrame with multiple null columns
data = {
    'A': [1, 2, np.nan, 4],
    'B': [np.nan, 2, 3, 4],
    'C': [1, np.nan, np.nan, 4]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
# Drop only rows where ALL values are null; no row here is entirely null, so nothing is dropped
print("\nUsing how='all':")
print(df.dropna(how='all'))
# Drop rows that have a null in column A or B; nulls in column C are ignored
print("\nUsing subset=['A', 'B']:")
print(df.dropna(subset=['A', 'B']))
Original DataFrame:
A B C
0 1.0 NaN 1.0
1 2.0 2.0 NaN
2 NaN 3.0 NaN
3 4.0 4.0 4.0
Using how='all':
A B C
0 1.0 NaN 1.0
1 2.0 2.0 NaN
2 NaN 3.0 NaN
3 4.0 4.0 4.0
Using subset=['A', 'B']:
A B C
1 2.0 2.0 NaN
2 NaN 3.0 NaN
3 4.0 4.0 4.0
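In the output above, how='all' leaves the DataFrame unchanged because no row is entirely null. To see the parameter take effect, here is a small sketch with a made-up frame (df_all is a hypothetical name) in which one row has a null in every column.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 1 is null in every column, row 0 only in column B
df_all = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [np.nan, np.nan, 6.0],
})

# how='all' drops row 1 only; row 0 is kept because column A has a value
kept = df_all.dropna(how="all")
print(kept)

# For comparison, the default how='any' drops rows 0 and 1
strict = df_all.dropna()
print(strict)
```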
Key Parameters
| Parameter | Description | Default |
|---|---|---|
| how | 'any' removes rows with any null, 'all' removes rows where all values are null | 'any' |
| subset | List of columns to consider for null values | All columns |
| inplace | Modify original DataFrame if True | False |
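The inplace parameter is easy to misuse, because with inplace=True the call returns None instead of a DataFrame. A minimal sketch, using a made-up frame named df_inplace:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one null in column A
df_inplace = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [4.0, 5.0, 6.0],
})

# With inplace=True, dropna() modifies df_inplace itself and returns None
result = df_inplace.dropna(inplace=True)

print(result)
print(df_inplace.shape)
```

Because of this, writing df_inplace = df_inplace.dropna(inplace=True) would replace the DataFrame with None. Either use inplace=True without assignment, or assign the result of a plain dropna() call.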
Conclusion
Use dropna() to remove rows with null values from a DataFrame. Use parameters like how='all' or subset to control which rows are dropped based on your specific needs.
