Python Pandas - Drop the value when any level is NaN in a Multi-index
To drop entries of a Multi-index when any level contains NaN, use the dropna() method with the parameter how='any', which is also the default. This removes every tuple that has a NaN value at one or more levels.
Creating a Multi-index with NaN Values
First, let's create a multi-index containing some NaN values:
import pandas as pd
import numpy as np
# Create a multi-index with some NaN values
# The names parameter sets the names for the levels in the index
multiIndex = pd.MultiIndex.from_arrays(
    [[5, 10], [np.nan, 20], [25, np.nan], [35, 40]],
    names=['a', 'b', 'c', 'd']
)
print("Original Multi-index:")
print(multiIndex)
Original Multi-index:
MultiIndex([( 5,  nan, 25.0, 35),
            (10, 20.0,  nan, 40)],
           names=['a', 'b', 'c', 'd'])
Dropping Values with Any NaN Level
Use dropna(how='any') to remove every tuple containing at least one NaN value. In the index created above, both tuples have a NaN at some level, so the result is empty:
import pandas as pd
import numpy as np
multiIndex = pd.MultiIndex.from_arrays(
    [[5, 10], [np.nan, 20], [25, np.nan], [35, 40]],
    names=['a', 'b', 'c', 'd']
)
# Drop rows where any level has NaN
cleaned_index = multiIndex.dropna(how='any')
print("After dropping rows with any NaN:")
print(cleaned_index)
print(f"\nOriginal length: {len(multiIndex)}")
print(f"After dropna: {len(cleaned_index)}")
After dropping rows with any NaN:
MultiIndex([], names=['a', 'b', 'c', 'd'])

Original length: 2
After dropna: 0
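Note that dropna() on a Multi-index returns only a new index. If the index belongs to a DataFrame and the goal is to drop the matching rows as well, one possible approach is to build a boolean mask from the index levels. The following is a minimal sketch under that assumption; the frame df, its value column, and the three-level index are hypothetical and not part of the example above.

```python
import numpy as np
import pandas as pd

# Hypothetical three-level index; only the last tuple has no NaN
idx = pd.MultiIndex.from_arrays(
    [[5, 10, 15], [np.nan, 20, 30], [25, np.nan, 35]],
    names=['a', 'b', 'c'],
)

# Hypothetical data attached to that index
df = pd.DataFrame({'value': [1, 2, 3]}, index=idx)

# Expand the index levels into columns, then keep positions
# where every level is non-NaN (the complement of how='any')
keep = df.index.to_frame(index=False).notna().all(axis=1).to_numpy()

cleaned = df.loc[keep]
print(cleaned)
```

The mask is converted to a NumPy array so that the selection is positional and does not rely on label alignment against an index that contains NaN.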
Alternative: Dropping Only When All Levels are NaN
You can also use how='all' to drop a tuple only when all of its levels are NaN:
import pandas as pd
import numpy as np
# Create a multi-index with a row where all levels are NaN
multiIndex = pd.MultiIndex.from_arrays(
    [[5, np.nan, 10], [np.nan, np.nan, 20], [25, np.nan, 30], [35, np.nan, 40]],
    names=['a', 'b', 'c', 'd']
)
print("Original Multi-index:")
print(multiIndex)
print("\nDropping with how='all':")
print(multiIndex.dropna(how='all'))
print("\nDropping with how='any':")
print(multiIndex.dropna(how='any'))
Original Multi-index:
MultiIndex([( 5.0,  nan, 25.0, 35.0),
            ( nan,  nan,  nan,  nan),
            (10.0, 20.0, 30.0, 40.0)],
           names=['a', 'b', 'c', 'd'])
Dropping with how='all':
MultiIndex([( 5.0,  nan, 25.0, 35.0),
            (10.0, 20.0, 30.0, 40.0)],
           names=['a', 'b', 'c', 'd'])
Dropping with how='any':
MultiIndex([(10.0, 20.0, 30.0, 40.0)],
           names=['a', 'b', 'c', 'd'])
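To make the difference between the two options concrete, the rule behind each can be reproduced by hand with a per-level NaN mask. This is only an illustrative sketch, not a description of how pandas implements dropna() internally; the names is_nan, any_nan, all_nan, kept_any and kept_all are introduced here for illustration.

```python
import numpy as np
import pandas as pd

multiIndex = pd.MultiIndex.from_arrays(
    [[5, np.nan, 10], [np.nan, np.nan, 20], [25, np.nan, 30], [35, np.nan, 40]],
    names=['a', 'b', 'c', 'd'],
)

# One boolean column per level: True where that level is NaN
is_nan = multiIndex.to_frame(index=False).isna()

# how='any' drops a tuple if at least one level is NaN
any_nan = is_nan.any(axis=1).to_numpy()
# how='all' drops a tuple only if every level is NaN
all_nan = is_nan.all(axis=1).to_numpy()

kept_any = multiIndex[~any_nan]
kept_all = multiIndex[~all_nan]

print("Kept under the 'any' rule:", len(kept_any))
print("Kept under the 'all' rule:", len(kept_all))
```

Because every tuple that is entirely NaN also has at least one NaN, the 'any' rule can never keep more tuples than the 'all' rule.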
Conclusion
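Use dropna(how='any') to remove multi-index tuples that contain a NaN at any level, or how='all' to remove only tuples in which every level is NaN. The 'any' option is the stricter of the two: it removes every tuple that 'all' removes, plus any tuple in which only some levels are NaN.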
