Python – Stacking a single-level column with Pandas stack()
The pandas stack() method reshapes a DataFrame by moving a column level into the row index, which produces a hierarchical (multi-level) index. For a DataFrame with single-level columns, the result is a Series in long format, with one value per (row label, column label) pair.
Syntax
DataFrame.stack(level=-1, dropna=True)
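When the columns have only one level, there is only one level to move, so the default level=-1 (the innermost column level) and level=0 refer to the same level. A minimal sketch to illustrate this, using a small hypothetical DataFrame named df that is not part of the examples in this article:

```python
import pandas as pd

# Hypothetical example data with single-level columns
df = pd.DataFrame([[10, 15], [20, 25]], index=['w', 'x'], columns=['a', 'b'])

default_stacked = df.stack()          # level=-1: innermost column level
explicit_stacked = df.stack(level=0)  # level=0: the only column level here

# Both calls should produce the same Series
same_result = default_stacked.equals(explicit_stacked)
print(same_result)
```

The level argument only starts to matter once the columns themselves form a MultiIndex.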
Creating a DataFrame with Single-Level Columns
First, let's create a simple DataFrame with single-level columns:
import pandas as pd
# Create DataFrame with single-level columns
dataFrame = pd.DataFrame([[10, 15], [20, 25], [30, 35], [40, 45]],
                         index=['w', 'x', 'y', 'z'],
                         columns=['a', 'b'])
print("Original DataFrame:")
print(dataFrame)
Original DataFrame:
    a   b
w  10  15
x  20  25
y  30  35
z  40  45
Stacking the DataFrame
The stack() method converts columns to rows, creating a Series with a multi-level index:
import pandas as pd
# Create DataFrame
dataFrame = pd.DataFrame([[10, 15], [20, 25], [30, 35], [40, 45]],
                         index=['w', 'x', 'y', 'z'],
                         columns=['a', 'b'])
# Stack the DataFrame
stacked = dataFrame.stack()
print("Stacked DataFrame:")
print(stacked)
print(f"\nType: {type(stacked)}")
Stacked DataFrame:
w  a    10
   b    15
x  a    20
   b    25
y  a    30
   b    35
z  a    40
   b    45
dtype: int64

Type: <class 'pandas.core.series.Series'>
Understanding the Result
The stacked result has a two-level index where the first level is the original row index and the second level is the original column names:
import pandas as pd
dataFrame = pd.DataFrame([[10, 15], [20, 25], [30, 35], [40, 45]],
                         index=['w', 'x', 'y', 'z'],
                         columns=['a', 'b'])
stacked = dataFrame.stack()
# Access the multi-level index
print("Index levels:")
print(f"Level 0 (rows): {stacked.index.get_level_values(0).tolist()}")
print(f"Level 1 (cols): {stacked.index.get_level_values(1).tolist()}")
# Access specific values
print(f"\nValue at (w, a): {stacked['w', 'a']}")
print(f"Value at (x, b): {stacked['x', 'b']}")
Index levels:
Level 0 (rows): ['w', 'w', 'x', 'x', 'y', 'y', 'z', 'z']
Level 1 (cols): ['a', 'b', 'a', 'b', 'a', 'b', 'a', 'b']

Value at (w, a): 10
Value at (x, b): 25
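Beyond looking up single values, the two-level index also allows selecting by one level at a time, and it can be turned into ordinary columns to get a flat long-format table. The sketch below is one possible way to do this; the names row, column, and value are illustrative choices, not anything pandas requires:

```python
import pandas as pd

dataFrame = pd.DataFrame([[10, 15], [20, 25], [30, 35], [40, 45]],
                         index=['w', 'x', 'y', 'z'],
                         columns=['a', 'b'])
stacked = dataFrame.stack()

# Partial selection: all values for row label 'w' (indexed by column name)
row_w = stacked.loc['w']

# Cross-section: all values that came from column 'a' (indexed by row label)
col_a = stacked.xs('a', level=1)

# Name the two index levels (illustrative names), then move them into columns
long_df = stacked.rename_axis(['row', 'column']).reset_index(name='value')
print(long_df.head(3))
```

The resulting long_df has one row per original cell, which is the shape that many plotting and grouping workflows expect.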
Key Points
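- For a DataFrame with single-level columns, the stack() method returns a Series with a hierarchical index
- Original column names become the second level of the new index
- The operation is the inverse of unstack()
- By default, dropna=True removes missing values from the result; newer pandas releases (2.1 and later) offer a future_stack=True implementation that keeps them and does not accept dropna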
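The last two points can be checked with a short sketch. It assumes the same example DataFrame used above plus a small hypothetical DataFrame named with_gap that contains one missing value. Because the default handling of missing values differs between the older and newer stack() implementations, the sketch calls dropna() on the result explicitly, which should give the same outcome either way:

```python
import numpy as np
import pandas as pd

dataFrame = pd.DataFrame([[10, 15], [20, 25], [30, 35], [40, 45]],
                         index=['w', 'x', 'y', 'z'],
                         columns=['a', 'b'])

# Round trip: unstack() moves the inner index level back into columns
restored = dataFrame.stack().unstack()
round_trip_ok = restored.equals(dataFrame)
print(round_trip_ok)

# Hypothetical data with one missing value
with_gap = pd.DataFrame([[1.0, np.nan], [3.0, 4.0]],
                        index=['w', 'x'], columns=['a', 'b'])

# Explicit dropna() on the result removes missing entries
# whether or not stack() has already dropped them
clean = with_gap.stack().dropna()
print(clean)
```

Dropping missing values explicitly keeps the behavior independent of the installed pandas version, at the cost of one extra method call.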
Conclusion
The stack() method is essential for reshaping data from wide to long format. It creates a hierarchical index by moving column labels to row levels, making it useful for data analysis and visualization tasks that require long-format data.
