Python Pandas – Remove leading and trailing whitespace from more than one column
To remove leading and trailing whitespace from multiple columns in a Pandas DataFrame, apply the str.strip() method to each string column. This is useful for cleaning data such as imported text values that carry unwanted spaces before or after the content.
Creating a DataFrame with Whitespace
First, let's create a DataFrame with whitespace in the string columns:
import pandas as pd
# Create a DataFrame with whitespace in string columns
dataFrame = pd.DataFrame({
'Product Category': [' Computer', ' Mobile Phone', 'Electronics ', 'Appliances', ' Furniture', 'Stationery'],
'Product Name': ['Keyboard', 'Charger', ' SmartTV', 'Refrigerators', ' Chairs', 'Diaries'],
'Quantity': [10, 50, 10, 20, 25, 50]
})
print("Original DataFrame:")
print(dataFrame)
Original DataFrame:
   Product Category   Product Name  Quantity
0          Computer       Keyboard        10
1      Mobile Phone        Charger        50
2      Electronics         SmartTV        10
3        Appliances  Refrigerators        20
4         Furniture         Chairs        25
5        Stationery        Diaries        50
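Because pandas right-aligns columns when printing, leading spaces are hard to spot in the output above. The following is a minimal sketch of one way to make the padding visible. The shortened `sample` DataFrame and the name `needs_strip` are illustrative and not part of the original example.

```python
import pandas as pd

# Small illustrative sample (shortened from the DataFrame above)
sample = pd.DataFrame({
    'Product Category': [' Computer', 'Electronics ', 'Appliances'],
    'Product Name': ['Keyboard', ' SmartTV', 'Refrigerators']
})

# tolist() returns the raw Python strings, so the quotes reveal any padding
print(sample['Product Category'].tolist())

# Flag the values that would change if they were stripped
needs_strip = sample['Product Category'] != sample['Product Category'].str.strip()
print(needs_strip.tolist())
```

The second print shows a boolean per row, which can also be used to count or filter the affected rows before cleaning.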
Method 1: Using str.strip() on Individual Columns
Apply str.strip() to each column separately and assign the result back to the DataFrame:
import pandas as pd
dataFrame = pd.DataFrame({
'Product Category': [' Computer', ' Mobile Phone', 'Electronics ', 'Appliances', ' Furniture', 'Stationery'],
'Product Name': ['Keyboard', 'Charger', ' SmartTV', 'Refrigerators', ' Chairs', 'Diaries'],
'Quantity': [10, 50, 10, 20, 25, 50]
})
# Remove whitespace from specific columns
dataFrame['Product Category'] = dataFrame['Product Category'].str.strip()
dataFrame['Product Name'] = dataFrame['Product Name'].str.strip()
print("DataFrame after removing whitespaces:")
print(dataFrame)
DataFrame after removing whitespaces:
   Product Category   Product Name  Quantity
0          Computer       Keyboard        10
1      Mobile Phone        Charger        50
2       Electronics        SmartTV        10
3        Appliances  Refrigerators        20
4         Furniture         Chairs        25
5        Stationery        Diaries        50
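One caveat with the .str accessor is worth a quick check before applying it to real data. The sketch below assumes an object-dtype column that mixes strings with a missing value and a number. The names `mixed` and `stripped` are illustrative only.

```python
import pandas as pd

# Hypothetical column mixing a padded string, a missing value, and an integer
mixed = pd.Series([' Keyboard ', None, 42], dtype=object)

# .str.strip() cleans the string; entries that are missing or not strings
# come back as missing values rather than raising an error
stripped = mixed.str.strip()

print(stripped.tolist())
print(stripped.isna().tolist())
```

If non-string values must be preserved, a cell-wise function with an isinstance check, as used in Method 3, is the safer choice.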
Method 2: Using apply() for Multiple Columns
Use apply() to strip whitespace from multiple string columns at once:
import pandas as pd
dataFrame = pd.DataFrame({
'Product Category': [' Computer', ' Mobile Phone', 'Electronics ', 'Appliances', ' Furniture', 'Stationery'],
'Product Name': ['Keyboard', 'Charger', ' SmartTV', 'Refrigerators', ' Chairs', 'Diaries'],
'Quantity': [10, 50, 10, 20, 25, 50]
})
# Strip whitespace from multiple columns using apply
string_columns = ['Product Category', 'Product Name']
dataFrame[string_columns] = dataFrame[string_columns].apply(lambda x: x.str.strip())
print("DataFrame after removing whitespaces:")
print(dataFrame)
DataFrame after removing whitespaces:
   Product Category   Product Name  Quantity
0          Computer       Keyboard        10
1      Mobile Phone        Charger        50
2       Electronics        SmartTV        10
3        Appliances  Refrigerators        20
4         Furniture         Chairs        25
5        Stationery        Diaries        50
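When the list of string columns is not known in advance, it can be derived from the column dtypes instead of being typed by hand. This is a minimal sketch, assuming the text columns are stored with the object or string dtype and contain only strings. The names `df` and `text_columns` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Product Category': [' Computer', 'Electronics ', 'Appliances'],
    'Product Name': ['Keyboard', ' SmartTV', 'Refrigerators '],
    'Quantity': [10, 50, 20]
})

# Select text columns by dtype rather than listing them manually
text_columns = df.select_dtypes(include=['object', 'string']).columns

# Strip each selected column; numeric columns such as Quantity are untouched
df[text_columns] = df[text_columns].apply(lambda col: col.str.strip())

print(df['Product Category'].tolist())
print(df['Product Name'].tolist())
```

This keeps the apply() pattern from Method 2 and only changes how the column list is built.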
Method 3: Using applymap() for All String Columns
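To strip every cell in a DataFrame, use applymap() with a function that strips string values and returns anything else unchanged. Note that applymap() is deprecated since pandas 2.1 in favor of DataFrame.map(), which accepts the same function: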
import pandas as pd
# DataFrame with only string columns
string_df = pd.DataFrame({
'Product Category': [' Computer', ' Mobile Phone', 'Electronics ', 'Appliances'],
'Product Name': ['Keyboard ', 'Charger', ' SmartTV', 'Refrigerators ']
})
print("Original string DataFrame:")
print(string_df)
# Apply strip to every cell; the isinstance check passes non-string values through unchanged
cleaned_df = string_df.applymap(lambda x: x.strip() if isinstance(x, str) else x)
print("\nAfter removing whitespaces:")
print(cleaned_df)
Original string DataFrame:
   Product Category    Product Name
0          Computer       Keyboard
1      Mobile Phone         Charger
2      Electronics          SmartTV
3        Appliances  Refrigerators

After removing whitespaces:
   Product Category   Product Name
0          Computer       Keyboard
1      Mobile Phone        Charger
2       Electronics        SmartTV
3        Appliances  Refrigerators
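On pandas 2.1 and later, calling applymap() emits a deprecation warning that points to DataFrame.map(). The sketch below shows one possible way to use whichever method the installed version provides. The helper name `strip_cell` is illustrative and not part of the original example.

```python
import pandas as pd

string_df = pd.DataFrame({
    'Product Category': [' Computer', ' Mobile Phone', 'Electronics ', 'Appliances'],
    'Product Name': ['Keyboard ', 'Charger', ' SmartTV', 'Refrigerators ']
})

def strip_cell(value):
    """Strip strings; return any other value unchanged."""
    return value.strip() if isinstance(value, str) else value

# DataFrame.map exists from pandas 2.1 onward; older versions only have applymap
if hasattr(pd.DataFrame, 'map'):
    cleaned_df = string_df.map(strip_cell)
else:
    cleaned_df = string_df.applymap(strip_cell)

print(cleaned_df.values.tolist())
```

Both branches apply the function cell by cell, so the result is the same regardless of which method runs.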
Comparison of Methods
| Method | Best For | Syntax |
|---|---|---|
| Individual columns | Few specific columns | df['col'].str.strip() |
| apply() with lambda | Multiple selected columns | df[cols].apply(lambda x: x.str.strip()) |
| applymap() | All string columns | df.applymap(lambda x: x.strip() if isinstance(x, str) else x) |
Conclusion
Use str.strip() on individual columns when only one or two need cleaning. For a list of columns, apply() with a lambda strips them all in a single assignment. Choose applymap() when every cell in the DataFrame should be stripped; the isinstance check keeps non-string values unchanged.
