Python Pandas - Appending Categories
Appending categories to categorical data lets you register new valid values without modifying the existing data. In pandas, categorical data is a way to manage values drawn from a fixed, limited set, and it is represented by the Categorical type.
Pandas provides specialized methods for handling categorical data through the Series.cat accessor. One such method is add_categories(), which appends new categories to an existing categorical object.
In this tutorial, we will learn how to append categories to Pandas categorical data, with several examples.
The add_categories() Method
The Pandas Series.cat.add_categories() method adds one or more categories to an existing categorical object while preserving its original data and the order of the existing categories.
Syntax
Following is the syntax of this method −
Series.cat.add_categories(new_categories, *args, **kwargs)
This method takes one mandatory parameter, new_categories, which accepts a single value or a list-like object containing the categories to append. It returns a new object with the categories added. The new categories are placed after the existing ones and are unused immediately after the call.
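As a small sketch of this behavior, the snippet below compares a Series before and after the call. The variable name extended and the category "bird" are illustrative choices only. It checks that the original Series keeps its categories and that the integer codes behind the values are identical in both objects.

```python
import pandas as pd

s = pd.Series(["cat", "dog", "mouse", "cat"], dtype="category")

# add_categories() returns a new Series; `s` itself is left untouched
extended = s.cat.add_categories(["bird"])  # "bird" is an illustrative label

print(list(s.cat.categories))         # categories of the original Series
print(list(extended.cat.categories))  # categories of the returned Series

# The integer codes behind the values are the same in both objects,
# because the new category is appended after the existing ones
print(s.cat.codes.tolist())
print(extended.cat.codes.tolist())
```

Because the result is a new object, the examples in this tutorial assign it back to the same variable.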
Appending a Single Category
You can append a single category to an existing Pandas categorical object by providing the single category to the add_categories() method.
Example
This example demonstrates how to add a single new category to a categorical Series using the Pandas Series.cat.add_categories() method.
import pandas as pd
# Creating a categorical Series
s = pd.Series(["cat", "dog", "mouse", "cat"], dtype="category")
# Display the Input Series
print("Original Series:")
print(s)
# Appending a new category
s = s.cat.add_categories("AA")
print("\nSeries after appending a new category:")
print(s)
When we run the above program, it produces the following result −
Original Series:
0      cat
1      dog
2    mouse
3      cat
dtype: category
Categories (3, object): ['cat', 'dog', 'mouse']

Series after appending a new category:
0      cat
1      dog
2    mouse
3      cat
dtype: category
Categories (4, object): ['cat', 'dog', 'mouse', 'AA']
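One practical reason to append a category is to make it assignable. The following sketch assumes a recent pandas version, where assigning a value outside the current categories raises an error (TypeError in current releases; ValueError is also caught here in case an older release behaves differently). The flag name rejected is illustrative.

```python
import pandas as pd

s = pd.Series(["cat", "dog", "mouse", "cat"], dtype="category")

# Assigning a value that is not among the current categories is rejected
try:
    s.iloc[0] = "AA"
    rejected = False
except (TypeError, ValueError) as e:
    rejected = True
    print("Assignment rejected:", e)

# After appending the category, the same assignment is accepted
s = s.cat.add_categories("AA")
s.iloc[0] = "AA"
print(s)
```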
Appending Multiple Categories
You can append multiple categories simultaneously by passing a list of new categories to the Series.cat.add_categories() method.
Example
This example shows how to add multiple new categories to existing categorical data by passing a list of categories to the new_categories parameter.
import pandas as pd
# Creating a categorical Series
s = pd.Series(["cat", "dog", "mouse", "cat"], dtype="category")
# Display the Input Series
print("Original Series:")
print(s)
# Appending new categories
s = s.cat.add_categories(["Duck", "Wolf"])
print("\nSeries after appending multiple categories:")
print(s)
While executing the above code we get the following output −
Original Series:
0      cat
1      dog
2    mouse
3      cat
dtype: category
Categories (3, object): ['cat', 'dog', 'mouse']

Series after appending multiple categories:
0      cat
1      dog
2    mouse
3      cat
dtype: category
Categories (5, object): ['cat', 'dog', 'mouse', 'Duck', 'Wolf']
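Newly appended categories are unused until some value takes them. A quick way to see this, sketched below, is value_counts(), which for categorical data reports every category, including those with a count of zero. The variable name counts is illustrative.

```python
import pandas as pd

s = pd.Series(["cat", "dog", "mouse", "cat"], dtype="category")
s = s.cat.add_categories(["Duck", "Wolf"])

# For a categorical Series, value_counts() lists all categories,
# so the appended but unused ones appear with a count of 0
counts = s.value_counts()
print(counts)
```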
Appending Categories to a DataFrame Column
The Pandas cat.add_categories() method can be used to append new categories to a specific column of a DataFrame. This method works on columns that are of the category dtype, and it expands the set of categories for that column without modifying the existing data.
Example
This example demonstrates how to append categories to a specific column in a DataFrame, expanding its categories while maintaining existing data.
import pandas as pd
# Creating a DataFrame with a categorical column
df = pd.DataFrame({
    "Animal": ["Cat", "Dog", "Mouse"],
    "Category": pd.Series(["A", "B", "A"], dtype="category")
})
# Display the Input DataFrame
print("Original DataFrame:")
print(df)
# Appending new categories to the 'Category' column
df["Category"] = df["Category"].cat.add_categories(["C", "D"])
# Display the updated DataFrame
print("\nDataFrame after appending new categories:")
print(df)
# Checking the updated categories
print("\nUpdated categories in 'Category' column:")
print(df["Category"].cat.categories)
When we run the above program, it produces the following result −
Original DataFrame:
  Animal Category
0    Cat        A
1    Dog        B
2  Mouse        A

DataFrame after appending new categories:
  Animal Category
0    Cat        A
1    Dog        B
2  Mouse        A

Updated categories in 'Category' column:
Index(['A', 'B', 'C', 'D'], dtype='object')
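If the column is not yet of the category dtype, the .cat accessor is unavailable. The sketch below uses a hypothetical Grade column holding plain strings. It shows the AttributeError raised by the accessor, then converts the column with astype("category") before appending.

```python
import pandas as pd

# 'Grade' is a hypothetical column that starts out as plain strings
df = pd.DataFrame({"Animal": ["Cat", "Dog", "Mouse"], "Grade": ["A", "B", "A"]})

# The .cat accessor only works on columns with the category dtype
try:
    df["Grade"].cat.add_categories(["C"])
    accessor_failed = False
except AttributeError as e:
    accessor_failed = True
    print("Accessor error:", e)

# Convert the column to the category dtype first, then append
df["Grade"] = df["Grade"].astype("category").cat.add_categories(["C", "D"])
print(list(df["Grade"].cat.categories))
```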
Handling Duplicate or Invalid Categories
If you attempt to append a category that already exists in the categorical object, this method raises a ValueError. This check prevents redundant categories. As in the earlier examples, appending categories never modifies existing data and only expands the list of valid categories.
Example
The following example demonstrates how to handle the exception raised when you try to append a duplicate category using the Series.cat.add_categories() method.
import pandas as pd
# Creating a categorical Series
s = pd.Series(["cat", "dog", "mouse", "cat"], dtype="category")
# Display the Input Series
print("Original Series:")
print(s)
try:
    # Appending an existing category
    s = s.cat.add_categories(["cat"])
except ValueError as e:
    print("\nError encountered:", e)
Following is the output of the above code −
Original Series:
0      cat
1      dog
2    mouse
3      cat
dtype: category
Categories (3, object): ['cat', 'dog', 'mouse']

Error encountered: new categories must not include old categories: {'cat'}
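Instead of catching the error, you can filter the candidate labels beforehand. The following is a minimal sketch of that idea. The names candidates and to_add are illustrative, and the filtering step is something you write yourself, not a feature of add_categories(). It also assumes the candidate list contains no repeated labels.

```python
import pandas as pd

s = pd.Series(["cat", "dog", "mouse", "cat"], dtype="category")

# Hypothetical list of labels, some of which are already categories
candidates = ["cat", "Duck", "dog", "Wolf"]

# Keep only the labels that are not already categories
to_add = [c for c in candidates if c not in s.cat.categories]

s = s.cat.add_categories(to_add)
print(list(s.cat.categories))
```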