Python - Drop specific rows from multiindex Pandas Dataframe
To drop specific rows from a multiindex DataFrame, use the drop() method. This method allows you to remove rows by specifying the index values as tuples for multiindex structures.
Creating a MultiIndex DataFrame
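First, create a MultiIndex DataFrame with two hierarchical index levels. The values come from np.random.randn(), so the numbers you see will differ from the sample output shown here: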
import numpy as np
import pandas as pd
# Create multiindex array
arr = [np.array(['car', 'car', 'car', 'bike', 'bike', 'bike', 'truck', 'truck', 'truck']),
       np.array(['valueA', 'valueB', 'valueC', 'valueA', 'valueB', 'valueC', 'valueA', 'valueB', 'valueC'])]
# Create multiindex dataframe
dataFrame = pd.DataFrame(
    np.random.randn(9, 3), index=arr, columns=['Col 1', 'Col 2', 'Col 3'])
dataFrame.index.names = ['level 0', 'level 1']
print("Original DataFrame:")
print(dataFrame)
Original DataFrame:
                    Col 1     Col 2     Col 3
level 0 level 1
car     valueA  -0.123456  0.789012 -0.345678
        valueB   0.567890 -0.234567  0.891234
        valueC  -0.678901  0.456789 -0.012345
bike    valueA   0.234567 -0.567890  0.789012
        valueB  -0.345678  0.123456 -0.890123
        valueC   0.456789 -0.789012  0.234567
truck   valueA  -0.567890  0.345678 -0.123456
        valueB   0.789012 -0.456789  0.567890
        valueC  -0.012345  0.678901 -0.789012
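Before dropping anything, it can help to look at the index itself, because each row label in a MultiIndex is a tuple with one value per level. The sketch below is illustrative only: it reuses the same index arrays but swaps the random values for np.arange() so the result is reproducible, and the names frame and first_keys are assumptions introduced for this example.

```python
import numpy as np
import pandas as pd

# Same index layout as above; np.arange replaces the random data (an assumption
# made here so that repeated runs give the same frame).
arr = [np.array(['car', 'car', 'car', 'bike', 'bike', 'bike', 'truck', 'truck', 'truck']),
       np.array(['valueA', 'valueB', 'valueC'] * 3)]
frame = pd.DataFrame(np.arange(27).reshape(9, 3), index=arr,
                     columns=['Col 1', 'Col 2', 'Col 3'])
frame.index.names = ['level 0', 'level 1']

print(frame.index.nlevels)       # number of index levels
print(list(frame.index.names))   # names assigned to the levels

# Each row label is a (level 0, level 1) tuple; these are the keys drop() expects
first_keys = list(frame.index[:3])
print(first_keys)
```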
Dropping Specific Rows
To drop a specific row from a multiindex DataFrame, pass the index values as a tuple to the drop() method:
import numpy as np
import pandas as pd
# Create multiindex array
arr = [np.array(['car', 'car', 'car', 'bike', 'bike', 'bike', 'truck', 'truck', 'truck']),
       np.array(['valueA', 'valueB', 'valueC', 'valueA', 'valueB', 'valueC', 'valueA', 'valueB', 'valueC'])]
# Create multiindex dataframe
dataFrame = pd.DataFrame(
    np.random.randn(9, 3), index=arr, columns=['Col 1', 'Col 2', 'Col 3'])
dataFrame.index.names = ['level 0', 'level 1']
print("Before dropping:")
print(dataFrame)
# Drop specific row using tuple (level0_value, level1_value)
dataFrame = dataFrame.drop(('car', 'valueA'), axis=0)
print("\nAfter dropping ('car', 'valueA'):")
print(dataFrame)
Before dropping:
                    Col 1     Col 2     Col 3
level 0 level 1
car     valueA  -0.123456  0.789012 -0.345678
        valueB   0.567890 -0.234567  0.891234
        valueC  -0.678901  0.456789 -0.012345
bike    valueA   0.234567 -0.567890  0.789012
        valueB  -0.345678  0.123456 -0.890123
        valueC   0.456789 -0.789012  0.234567
truck   valueA  -0.567890  0.345678 -0.123456
        valueB   0.789012 -0.456789  0.567890
        valueC  -0.012345  0.678901 -0.789012

After dropping ('car', 'valueA'):
                    Col 1     Col 2     Col 3
level 0 level 1
car     valueB   0.567890 -0.234567  0.891234
        valueC  -0.678901  0.456789 -0.012345
bike    valueA   0.234567 -0.567890  0.789012
        valueB  -0.345678  0.123456 -0.890123
        valueC   0.456789 -0.789012  0.234567
truck   valueA  -0.567890  0.345678 -0.123456
        valueB   0.789012 -0.456789  0.567890
        valueC  -0.012345  0.678901 -0.789012
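Since the sample values above are random, a quick programmatic check is a more reliable way to confirm that the row is gone. The following is a minimal sketch under stated assumptions: it uses np.arange() data instead of random values, and the names frame, target, trimmed and second_drop_failed are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Same index layout as the example above, with deterministic data (an assumption)
arr = [np.array(['car', 'car', 'car', 'bike', 'bike', 'bike', 'truck', 'truck', 'truck']),
       np.array(['valueA', 'valueB', 'valueC'] * 3)]
frame = pd.DataFrame(np.arange(27).reshape(9, 3), index=arr,
                     columns=['Col 1', 'Col 2', 'Col 3'])
frame.index.names = ['level 0', 'level 1']

target = ('car', 'valueA')
trimmed = frame.drop(target)

print(target in frame.index)     # the key exists in the original frame
print(target in trimmed.index)   # and is absent from the result
print(trimmed.shape)             # one row fewer than the original

# Dropping a key that is not in the index raises KeyError by default
try:
    trimmed.drop(target)
    second_drop_failed = False
except KeyError:
    second_drop_failed = True
print(second_drop_failed)
```

Note that drop() returned a new DataFrame here, so frame still has all nine rows.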
Dropping Multiple Rows
You can also drop multiple rows at once by passing a list of tuples:
import numpy as np
import pandas as pd
# Create multiindex array
arr = [np.array(['car', 'car', 'car', 'bike', 'bike', 'bike', 'truck', 'truck', 'truck']),
       np.array(['valueA', 'valueB', 'valueC', 'valueA', 'valueB', 'valueC', 'valueA', 'valueB', 'valueC'])]
# Create multiindex dataframe
dataFrame = pd.DataFrame(
    np.random.randn(9, 3), index=arr, columns=['Col 1', 'Col 2', 'Col 3'])
dataFrame.index.names = ['level 0', 'level 1']
# Drop multiple rows
rows_to_drop = [('car', 'valueA'), ('bike', 'valueB')]
dataFrame = dataFrame.drop(rows_to_drop, axis=0)
print("After dropping multiple rows:")
print(dataFrame)
After dropping multiple rows:
                    Col 1     Col 2     Col 3
level 0 level 1
car     valueB   0.567890 -0.234567  0.891234
        valueC  -0.678901  0.456789 -0.012345
bike    valueA   0.234567 -0.567890  0.789012
        valueC   0.456789 -0.789012  0.234567
truck   valueA  -0.567890  0.345678 -0.123456
        valueB   0.789012 -0.456789  0.567890
        valueC  -0.012345  0.678901 -0.789012
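The list of tuples does not have to be typed by hand. As a hedged sketch, the example below builds it by filtering the index keys, here every row whose second level is 'valueB', and then compares the result with the level argument of drop(), which targets a single level directly. The data is deterministic np.arange() output rather than random values, and the names frame, trimmed and by_level are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Same index layout as the example above, with deterministic data (an assumption)
arr = [np.array(['car', 'car', 'car', 'bike', 'bike', 'bike', 'truck', 'truck', 'truck']),
       np.array(['valueA', 'valueB', 'valueC'] * 3)]
frame = pd.DataFrame(np.arange(27).reshape(9, 3), index=arr,
                     columns=['Col 1', 'Col 2', 'Col 3'])
frame.index.names = ['level 0', 'level 1']

# Iterating a MultiIndex yields one tuple per row, so the keys can be filtered
rows_to_drop = [key for key in frame.index if key[1] == 'valueB']
trimmed = frame.drop(rows_to_drop)

# Alternative: name the level and pass only that level's label
by_level = frame.drop('valueB', level='level 1')

print(rows_to_drop)
print(trimmed.shape)
print(list(trimmed.index) == list(by_level.index))
```

The list comprehension is the more flexible of the two, since the condition can involve both levels. The level argument is shorter when the rule only concerns one level.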
Key Points
- Use tuples to specify multiindex row coordinates: (level0_value, level1_value)
- Set axis=0 to drop rows (default behavior)
- Use inplace=True to modify the original DataFrame, or assign the result to a new variable
- Pass a list of tuples to drop multiple rows simultaneously
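The difference between assigning the result and using inplace=True can be seen in a short sketch. It assumes deterministic np.arange() data in place of the random values, and the names frame, trimmed and result are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Same index layout as the examples above, with deterministic data (an assumption)
arr = [np.array(['car', 'car', 'car', 'bike', 'bike', 'bike', 'truck', 'truck', 'truck']),
       np.array(['valueA', 'valueB', 'valueC'] * 3)]
frame = pd.DataFrame(np.arange(27).reshape(9, 3), index=arr,
                     columns=['Col 1', 'Col 2', 'Col 3'])
frame.index.names = ['level 0', 'level 1']

# Assignment: drop() returns a new DataFrame and leaves frame unchanged.
# axis=0 is the default, so it is omitted here.
trimmed = frame.drop(('car', 'valueA'))
print(len(frame), len(trimmed))

# inplace=True: frame itself is modified and the call returns None
result = frame.drop(('bike', 'valueB'), inplace=True)
print(result)
print(len(frame))
```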
Conclusion
The drop() method removes specific rows from a MultiIndex DataFrame when you pass the full index key as a tuple. To delete several rows in one call, pass a list of tuples.
