Create a Pipeline and remove a column from DataFrame - Python Pandas
Use the ColDrop stage of the pdpipe library to remove a column from a Pandas DataFrame. pdpipe takes a pipeline-based approach to data preprocessing: each operation is a stage object that is applied to a DataFrame and returns the result.
Installing pdpipe
First, install the pdpipe library:
pip install pdpipe
Importing Required Libraries
Import the required pdpipe and pandas libraries with their respective aliases:
import pdpipe as pdp
import pandas as pd
Creating a DataFrame
Let us create a DataFrame with car data. Here, we have two columns:
import pandas as pd
dataFrame = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})
print("DataFrame...")
print(dataFrame)
DataFrame...
       Car  Units
0      BMW    100
1    Lexus    150
2     Audi    110
3  Mustang     80
4  Bentley    110
5   Jaguar     90
Removing a Column Using ColDrop()
To remove a column from the DataFrame, create a ColDrop stage with the column name and apply it to the DataFrame. Here, we are removing the "Units" column:
import pdpipe as pdp
import pandas as pd
dataFrame = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})
# Remove column using pdpipe
resDF = pdp.ColDrop("Units").apply(dataFrame)
print("DataFrame after removing 'Units' column...")
print(resDF)
DataFrame after removing 'Units' column...
       Car
0      BMW
1    Lexus
2     Audi
3  Mustang
4  Bentley
5   Jaguar
Complete Pipeline Example
Here's a complete example showing both row and column removal operations:
import pdpipe as pdp
import pandas as pd
# Create DataFrame
dataFrame = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})
print("Original DataFrame...")
print(dataFrame)
# Remove a row with pdpipe
dataFrame = pdp.ValDrop(['Jaguar'], 'Car').apply(dataFrame)
print("\nDataFrame after removing 'Jaguar' row...")
print(dataFrame)
# Remove a column with pdpipe
resDF = pdp.ColDrop("Units").apply(dataFrame)
print("\nDataFrame after removing 'Units' column...")
print(resDF)
Original DataFrame...
       Car  Units
0      BMW    100
1    Lexus    150
2     Audi    110
3  Mustang     80
4  Bentley    110
5   Jaguar     90

DataFrame after removing 'Jaguar' row...
       Car  Units
0      BMW    100
1    Lexus    150
2     Audi    110
3  Mustang     80
4  Bentley    110

DataFrame after removing 'Units' column...
       Car
0      BMW
1    Lexus
2     Audi
3  Mustang
4  Bentley
Conclusion
The pdpipe library provides a clean pipeline approach for DataFrame operations. Use ColDrop() to remove columns and ValDrop() to remove rows based on specific values.
