Python – Drop multiple levels from a multi-level column index in Pandas dataframe
When a Pandas DataFrame has a multi-level column index, you can drop several levels by calling columns.droplevel() repeatedly and assigning the result back to the columns attribute each time. This is useful when you need to simplify a complex hierarchical column structure.
Creating a Multi-Level Column Index
First, let's create a DataFrame with a three-level column index using MultiIndex.from_tuples():
import numpy as np
import pandas as pd
# Create multi-level column index
items = pd.MultiIndex.from_tuples([("Col 1", "Col 1", "Col 1"),
("Col 2", "Col 2", "Col 2"),
("Col 3", "Col 3", "Col 3")])
# Create multi-level row index
arr = [np.array(['car', 'car', 'car', 'bike', 'bike', 'bike', 'truck', 'truck', 'truck']),
np.array(['valueA', 'valueB', 'valueC', 'valueA', 'valueB', 'valueC', 'valueA', 'valueB', 'valueC'])]
# Create DataFrame with multi-level indexes (values are random, so your numbers will differ)
dataFrame = pd.DataFrame(np.random.randn(9, 3), index=arr, columns=items)
dataFrame.index.names = ['vehicle', 'type']
print("Original DataFrame:")
print(dataFrame)
Original DataFrame:
                   Col 1     Col 2     Col 3
                   Col 1     Col 2     Col 3
                   Col 1     Col 2     Col 3
vehicle type
car     valueA  0.425077  0.020606  1.148156
        valueB -1.720355  0.502863  1.184753
        valueC  0.373106  1.300935 -0.128404
bike    valueA -0.648708  0.944725  0.593327
        valueB -0.613921 -0.238730 -0.218448
        valueC  0.313042 -0.628065  0.910935
truck   valueA  0.286377  0.478067 -1.000645
        valueB  1.151793 -0.171433 -0.612346
        valueC -1.358061  0.735075  0.092700
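Before dropping anything, it can help to check how many levels the column index has and what each level contains. The following is a minimal sketch, assuming a small integer-valued DataFrame (the variable names cols, df, n_levels and top_labels are illustrative and not part of the example above):

```python
import pandas as pd

# Illustrative three-level column index, mirroring the structure used above
cols = pd.MultiIndex.from_tuples([("Col 1", "Col 1", "Col 1"),
                                  ("Col 2", "Col 2", "Col 2"),
                                  ("Col 3", "Col 3", "Col 3")])
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=cols)

# Number of levels in the column index
n_levels = df.columns.nlevels

# Labels stored at the topmost level (position 0)
top_labels = list(df.columns.get_level_values(0))

print(n_levels)
print(top_labels)
```

Knowing the level count up front makes it easier to decide how many droplevel() calls are needed.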
Dropping the First Level
Use droplevel(0) to remove the topmost level from the column index:
import numpy as np
import pandas as pd
# Recreate the DataFrame from the previous example (random values differ between runs)
items = pd.MultiIndex.from_tuples([("Col 1", "Col 1", "Col 1"),
("Col 2", "Col 2", "Col 2"),
("Col 3", "Col 3", "Col 3")])
arr = [np.array(['car', 'car', 'car', 'bike', 'bike', 'bike', 'truck', 'truck', 'truck']),
np.array(['valueA', 'valueB', 'valueC', 'valueA', 'valueB', 'valueC', 'valueA', 'valueB', 'valueC'])]
dataFrame = pd.DataFrame(np.random.randn(9, 3), index=arr, columns=items)
dataFrame.index.names = ['vehicle', 'type']
# Drop the first level
dataFrame.columns = dataFrame.columns.droplevel(0)
print("After dropping first level:")
print(dataFrame)
After dropping first level:
                   Col 1     Col 2     Col 3
                   Col 1     Col 2     Col 3
vehicle type
car     valueA  0.425077  0.020606  1.148156
        valueB -1.720355  0.502863  1.184753
        valueC  0.373106  1.300935 -0.128404
bike    valueA -0.648708  0.944725  0.593327
        valueB -0.613921 -0.238730 -0.218448
        valueC  0.313042 -0.628065  0.910935
truck   valueA  0.286377  0.478067 -1.000645
        valueB  1.151793 -0.171433 -0.612346
        valueC -1.358061  0.735075  0.092700
Dropping Multiple Levels
After dropping one level, the remaining levels are renumbered from 0, so the former second level is now at position 0. Drop level 0 again to get a single-level column index:
import numpy as np
import pandas as pd
# Recreate the same DataFrame structure (random values differ between runs)
items = pd.MultiIndex.from_tuples([("Col 1", "Col 1", "Col 1"),
("Col 2", "Col 2", "Col 2"),
("Col 3", "Col 3", "Col 3")])
arr = [np.array(['car', 'car', 'car', 'bike', 'bike', 'bike', 'truck', 'truck', 'truck']),
np.array(['valueA', 'valueB', 'valueC', 'valueA', 'valueB', 'valueC', 'valueA', 'valueB', 'valueC'])]
dataFrame = pd.DataFrame(np.random.randn(9, 3), index=arr, columns=items)
dataFrame.index.names = ['vehicle', 'type']
# Drop first level
dataFrame.columns = dataFrame.columns.droplevel(0)
# Drop second level (now at index 0)
dataFrame.columns = dataFrame.columns.droplevel(0)
print("After dropping two levels:")
print(dataFrame)
After dropping two levels:
                   Col 1     Col 2     Col 3
vehicle type
car     valueA  0.425077  0.020606  1.148156
        valueB -1.720355  0.502863  1.184753
        valueC  0.373106  1.300935 -0.128404
bike    valueA -0.648708  0.944725  0.593327
        valueB -0.613921 -0.238730 -0.218448
        valueC  0.313042 -0.628065  0.910935
truck   valueA  0.286377  0.478067 -1.000645
        valueB  1.151793 -0.171433 -0.612346
        valueC -1.358061  0.735075  0.092700
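As a variation on the repeated calls above, droplevel() also accepts a list of levels, so both levels can be removed in a single call. This is a minimal sketch, assuming a small DataFrame with made-up labels ("A", "X", "Y") that are not part of the original example:

```python
import pandas as pd

# Illustrative three-level column index; the labels are invented for this sketch
cols = pd.MultiIndex.from_tuples([("A", "X", "Col 1"),
                                  ("A", "X", "Col 2"),
                                  ("A", "Y", "Col 3")])
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=cols)

# Drop levels 0 and 1 in one call; positions refer to the original index
df.columns = df.columns.droplevel([0, 1])

flat_columns = list(df.columns)
print(flat_columns)
```

Because the positions in the list refer to the index as it was before the call, no renumbering has to be tracked between drops.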
Alternative: Drop Specific Level by Position
You can also drop a single level at any position, such as a middle level, by passing its index to droplevel():
import numpy as np
import pandas as pd
# Create DataFrame with 3-level columns
items = pd.MultiIndex.from_tuples([("Level0", "Level1", "Col1"),
("Level0", "Level1", "Col2"),
("Level0", "Level1", "Col3")])
data = [[1, 2, 3], [4, 5, 6]]
df = pd.DataFrame(data, columns=items)
print("Original:")
print(df)
# Drop level 1 (middle level)
df.columns = df.columns.droplevel(1)
print("\nAfter dropping level 1:")
print(df)
Original:
  Level0
  Level1
    Col1 Col2 Col3
0      1    2    3
1      4    5    6

After dropping level 1:
  Level0
    Col1 Col2 Col3
0      1    2    3
1      4    5    6
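If the column levels are named, a level can be dropped by name instead of position. The sketch below also uses DataFrame.droplevel() with axis=1, which returns a new DataFrame rather than modifying the columns attribute in place. The level names group, subgroup and field are hypothetical and chosen only for illustration:

```python
import pandas as pd

# Hypothetical level names; the original example leaves the levels unnamed
cols = pd.MultiIndex.from_tuples([("Level0", "Level1", "Col1"),
                                  ("Level0", "Level1", "Col2"),
                                  ("Level0", "Level1", "Col3")],
                                 names=["group", "subgroup", "field"])
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=cols)

# Drop the middle level by name; df itself is left unchanged
trimmed = df.droplevel("subgroup", axis=1)

remaining_names = list(trimmed.columns.names)
print(remaining_names)
```

Dropping by name avoids having to recompute positions when the number of levels changes.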
Conclusion
Use columns.droplevel() repeatedly to remove multiple levels from a multi-level column index, assigning the result back to the columns attribute after each call. Remember that the remaining levels are renumbered after each drop, so when removing levels from the top down, keep dropping level 0.
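The renumbering is easy to get wrong, so here is a small sketch of the pitfall. It assumes a two-column index with made-up labels ("top", "mid") and compares dropping positions 0 and then 1 sequentially against dropping both original positions in one call:

```python
import pandas as pd

# Illustrative three-level index; the labels are invented for this sketch
cols = pd.MultiIndex.from_tuples([("top", "mid", "Col1"),
                                  ("top", "mid", "Col2")])

# Sequential drops: after dropping level 0, position 1 is the ORIGINAL level 2,
# so this removes the column names and keeps the middle level
sequential = list(cols.droplevel(0).droplevel(1))

# Single call with a list: positions refer to the original index,
# so this removes the top and middle levels and keeps the column names
at_once = list(cols.droplevel([0, 1]))

print(sequential)
print(at_once)
```

The two results differ, which is why sequential removal from the top should use level 0 each time.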
