Python - Typecasting Pandas into set
To typecast a Pandas DataFrame column into a set, pass the column (a Series) to Python's built-in set() function. This removes duplicate values and lets you apply set operations such as union, intersection, and difference.
Creating a DataFrame
Let us first create a DataFrame with employee data:
import pandas as pd
# Create DataFrame
dataFrame = pd.DataFrame(
{
"EmpName": ['John', 'Ted', 'Jacob', 'Scarlett', 'Ami', 'Ted', 'Scarlett'],
"Zone": ['North', 'South', 'South', 'East', 'West', 'East', 'North']
}
)
print("DataFrame:")
print(dataFrame)
DataFrame:
    EmpName   Zone
0      John  North
1       Ted  South
2     Jacob  South
3  Scarlett   East
4       Ami   West
5       Ted   East
6  Scarlett  North
Converting Pandas Series to Set
Convert individual columns to sets to remove duplicates. Sets are unordered, so the element order in the printed output may differ from what is shown here:
import pandas as pd
dataFrame = pd.DataFrame(
{
"EmpName": ['John', 'Ted', 'Jacob', 'Scarlett', 'Ami', 'Ted', 'Scarlett'],
"Zone": ['North', 'South', 'South', 'East', 'West', 'East', 'North']
}
)
# Convert columns to sets
emp_set = set(dataFrame.EmpName)
zone_set = set(dataFrame.Zone)
print("Employee names as set:", emp_set)
print("Zones as set:", zone_set)
Employee names as set: {'John', 'Ted', 'Jacob', 'Scarlett', 'Ami'}
Zones as set: {'North', 'South', 'East', 'West'}
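The attribute form dataFrame.EmpName only works when the column name is a valid Python identifier that does not clash with a DataFrame attribute. As a minimal sketch, the same conversion can be written with bracket notation, and the result compared against Series.unique(), which also drops duplicates but returns an array rather than a set. The variable name emp_unique is introduced here for illustration only.

```python
import pandas as pd

dataFrame = pd.DataFrame(
    {
        "EmpName": ['John', 'Ted', 'Jacob', 'Scarlett', 'Ami', 'Ted', 'Scarlett'],
        "Zone": ['North', 'South', 'South', 'East', 'West', 'East', 'North']
    }
)

# Bracket notation works for any column name, including names with spaces
emp_set = set(dataFrame["EmpName"])

# Series.unique() also removes duplicates, but the result is an array, not a set
emp_unique = dataFrame["EmpName"].unique()

# sorted() gives a stable order for display, since sets are unordered
print("Sorted unique names:", sorted(emp_set))
print(len(dataFrame["EmpName"]), "rows,", len(emp_set), "unique names")
print("Same values as unique():", emp_set == set(emp_unique))
```

Use the set when you need membership tests or set operations, and unique() when you want to keep the order of first appearance.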
Set Operations
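Use the | operator for a set union that combines the unique values from both columns, and the & operator for an intersection that keeps only the values common to both: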
import pandas as pd
dataFrame = pd.DataFrame(
{
"EmpName": ['John', 'Ted', 'Jacob', 'Scarlett', 'Ami', 'Ted', 'Scarlett'],
"Zone": ['North', 'South', 'South', 'East', 'West', 'East', 'North']
}
)
# Set union - combine all unique values
combined_set = set(dataFrame.EmpName) | set(dataFrame.Zone)
print("Union of both columns:", combined_set)
# Set intersection - common values (if any)
common_values = set(dataFrame.EmpName) & set(dataFrame.Zone)
print("Common values:", common_values)
Union of both columns: {'John', 'Ted', 'Jacob', 'Scarlett', 'Ami', 'North', 'South', 'East', 'West'}
Common values: set()
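The difference operation is mentioned above but not demonstrated. The following sketch, which assumes the same employee DataFrame, compares the zones recorded for two employees using difference (-), symmetric difference (^), and intersection (&). The variable names ted_zones and scarlett_zones are chosen for this illustration.

```python
import pandas as pd

dataFrame = pd.DataFrame(
    {
        "EmpName": ['John', 'Ted', 'Jacob', 'Scarlett', 'Ami', 'Ted', 'Scarlett'],
        "Zone": ['North', 'South', 'South', 'East', 'West', 'East', 'North']
    }
)

# Select the Zone values for one employee, then convert them to a set
ted_zones = set(dataFrame.loc[dataFrame["EmpName"] == "Ted", "Zone"])
scarlett_zones = set(dataFrame.loc[dataFrame["EmpName"] == "Scarlett", "Zone"])

# Difference: zones recorded for Ted but not for Scarlett
only_ted = ted_zones - scarlett_zones

# Symmetric difference: zones recorded for exactly one of the two
either_not_both = ted_zones ^ scarlett_zones

# Intersection: zones recorded for both
shared = ted_zones & scarlett_zones

print("Only Ted:", only_ted)
print("Exactly one of them:", sorted(either_not_both))
print("Shared:", shared)
```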
Use Cases
Converting Pandas columns to sets is useful for:
- Removing duplicates from column values
- Finding unique values across multiple columns
- Set operations like union, intersection, and difference
- Data validation and comparison tasks
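As an illustration of the validation use case, the sketch below checks a column against a set of permitted values and tests whether a column contains duplicates. The allowed_zones set is a hypothetical rule made up for this example; it is not part of the original data.

```python
import pandas as pd

dataFrame = pd.DataFrame(
    {
        "EmpName": ['John', 'Ted', 'Jacob', 'Scarlett', 'Ami', 'Ted', 'Scarlett'],
        "Zone": ['North', 'South', 'South', 'East', 'West', 'East', 'North']
    }
)

# Hypothetical list of permitted zones (an assumption for this example)
allowed_zones = {'North', 'South', 'East'}

zone_set = set(dataFrame["Zone"])

# Subset test: True only if every zone in the column is permitted
is_valid = zone_set <= allowed_zones

# Difference: the values that violate the rule
unexpected = zone_set - allowed_zones

# A column has duplicates when its set is smaller than the column itself
has_duplicates = len(set(dataFrame["EmpName"])) < len(dataFrame["EmpName"])

print("All zones allowed:", is_valid)
print("Unexpected zones:", unexpected)
print("EmpName has duplicates:", has_duplicates)
```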
Conclusion
Use set(dataFrame.column) to convert a Pandas column into a set for duplicate removal and set operations. The union operator | combines the unique values of multiple columns, and the intersection operator & returns the values they have in common.
