How to Use Pandas apply() inplace?
The apply() function in pandas applies a function along an axis of a DataFrame or to each value of a Series. It always returns a new DataFrame or Series and has no inplace parameter, but you can get the same effect by assigning the result back to the original. In this article, we'll explore how to achieve in-place modifications with apply().
Syntax
DataFrame.apply(func, axis=0)
Series.apply(func)
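For a DataFrame, the axis parameter controls what func receives: with axis=0 (the default) func is called once per column, and with axis=1 it is called once per row. Series.apply() calls func on each value. The func can be a built-in function, lambda function, or custom function.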
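The difference between the two axis values is easiest to see on a small frame. The sketch below uses made-up column names (a and b) purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# axis=0 (default): func is called once per column and receives that column as a Series
col_sums = df.apply(lambda col: col.sum(), axis=0)

# axis=1: func is called once per row and receives that row as a Series
row_sums = df.apply(lambda row: row['a'] + row['b'], axis=1)

# Series.apply: func is called once per value
doubled = df['a'].apply(lambda x: x * 2)

print(col_sums.tolist())  # [6, 60]
print(row_sums.tolist())  # [11, 22, 33]
print(doubled.tolist())   # [2, 4, 6]
```

In all three calls df itself is left unchanged; each call hands back a new object.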
Default Behavior: Creating New Objects
When the result of apply() is not assigned back, the call returns a new DataFrame or Series and leaves the original unchanged −
import pandas as pd

# Create a DataFrame
data = {'Name': ['John', 'Emily', 'James', 'Emma'],
        'Age': [28, 32, 25, 29]}
df = pd.DataFrame(data)

# Function to add a prefix to names
def add_prefix(name):
    return 'Mr. ' + name

# Apply function without modifying original
new_series = df['Name'].apply(add_prefix)

print("Original DataFrame:")
print(df)
print("\nNew Series:")
print(new_series)

Output:

Original DataFrame:
    Name  Age
0   John   28
1  Emily   32
2  James   25
3   Emma   29

New Series:
0     Mr. John
1    Mr. Emily
2    Mr. James
3     Mr. Emma
Name: Name, dtype: object
Achieving In-Place Modification
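To update the original DataFrame, assign the result of apply() back to the column or DataFrame. The assignment replaces the old values with the new ones, which gives the effect of an in-place modification −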
import pandas as pd

# Create a DataFrame
data = {'Name': ['John', 'Emily', 'James', 'Emma'],
        'Age': [28, 32, 25, 29]}
df = pd.DataFrame(data)

# Function to add a prefix to names
def add_prefix(name):
    return 'Mr. ' + name

# Overwrite the column with the result of apply()
df['Name'] = df['Name'].apply(add_prefix)

print("Modified DataFrame:")
print(df)

Output:

Modified DataFrame:
        Name  Age
0   Mr. John   28
1  Mr. Emily   32
2  Mr. James   25
3   Mr. Emma   29
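Assigning to a column and assigning to the whole DataFrame variable behave differently, which matters when more than one name refers to the same data (for example, inside a function). The following sketch illustrates this with a hypothetical second variable called alias; it is an illustration of standard Python name binding, not a special pandas feature:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Emily'], 'Age': [28, 32]})
alias = df  # a second name for the same DataFrame object

# Column assignment changes the DataFrame object itself, so alias sees the update
df['Name'] = df['Name'].apply(lambda name: 'Mr. ' + name)
print(alias['Name'].tolist())  # ['Mr. John', 'Mr. Emily']

# Reassigning the variable binds df to a new DataFrame; alias still points to the old one
df = df.apply(lambda col: col + 1 if col.name == 'Age' else col)
print(df['Age'].tolist())      # [29, 33]
print(alias['Age'].tolist())   # [28, 32]
print(df is alias)             # False
```

If other code holds a reference to the DataFrame, column assignment is the form that behaves like a true in-place update.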
Working with Multiple Columns
To update several columns at once, select them and apply a row-wise function with axis=1. Each call receives one row of the selection as a Series −
import pandas as pd

# Create a DataFrame
data = {'Name': ['John', 'Emily', 'James', 'Emma'],
        'Age': [28, 32, 25, 29],
        'Salary': [50000, 60000, 45000, 55000]}
df = pd.DataFrame(data)

# Function to process multiple columns
def process_data(row):
    # Work on a copy: pandas does not support functions that
    # mutate the object passed to apply()
    row = row.copy()
    row['Name'] = 'Mr. ' + row['Name']
    row['Salary'] *= 2
    return row

# Apply to the selected columns and assign the result back
df[['Name', 'Salary']] = df[['Name', 'Salary']].apply(process_data, axis=1)

print("Modified DataFrame:")
print(df)

Output:

Modified DataFrame:
        Name  Age  Salary
0   Mr. John   28  100000
1  Mr. Emily   32  120000
2  Mr. James   25   90000
3   Mr. Emma   29  110000
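A row-wise function can also read several columns and write a single one. The sketch below assumes a new column named Summary, which is not part of the original example, and stores the result directly on the DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Emily'],
                   'Age': [28, 32],
                   'Salary': [50000, 60000]})

# Each call receives one full row; the function reads two columns and returns one value
df['Summary'] = df.apply(lambda row: f"{row['Name']} ({row['Age']})", axis=1)

print(df['Summary'].tolist())  # ['John (28)', 'Emily (32)']
print(list(df.columns))        # ['Name', 'Age', 'Salary', 'Summary']
```

Because the function returns a scalar per row, apply() yields a Series, which can be assigned to a new or existing column.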
Memory and Performance Considerations
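| Approach | Memory Usage | Performance | Best For |
|---|---|---|---|
| New object (result kept in a separate variable) | Higher: the original and the result are both kept | Same apply() cost | Data exploration, or when the original values are still needed |
| Assigning back | Lower after the assignment: the old column can be freed once nothing else references it | Same apply() cost | Large datasets where the original values are no longer needed |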
Important: Assigning the result back overwrites the original values in the DataFrame, and they cannot be recovered from it afterwards. Keep a copy (for example, with df.copy()) or the source file if you may need the original data again.
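One simple way to keep the original values is to take a copy before overwriting. This is a minimal sketch; the variable name backup is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Emily'], 'Age': [28, 32]})

# DataFrame.copy() makes a deep copy by default, so later changes to df do not affect it
backup = df.copy()

df['Age'] = df['Age'].apply(lambda age: age + 1)
print(df['Age'].tolist())      # [29, 33]
print(backup['Age'].tolist())  # [28, 32]

# Restore the column from the backup if the update turns out to be wrong
df['Age'] = backup['Age']
print(df['Age'].tolist())      # [28, 32]
```

Note that the copy doubles the memory used for the copied data, so on very large datasets it may be preferable to copy only the columns being changed.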
Conclusion
While pandas apply() doesn't have a built-in inplace parameter, you can achieve the same effect by assigning the result back to the original DataFrame. Use df['column'] = df['column'].apply(func) for a single column, or select several columns and use df[cols] = df[cols].apply(func, axis=1) to update them in one row-wise pass.
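If this pattern comes up often, it can be wrapped in a small function. The helper below, apply_to_columns, is hypothetical and not part of pandas; it is a sketch that applies func to each value of the listed columns and stores the results on the same DataFrame:

```python
import pandas as pd

def apply_to_columns(df, columns, func):
    """Hypothetical helper (not a pandas API): apply func to every value in
    each listed column and write the result back to the same DataFrame.
    Returns None, like pandas methods called with inplace=True."""
    for col in columns:
        df[col] = df[col].apply(func)

df = pd.DataFrame({'Age': [28, 32], 'Salary': [50000, 60000]})
result = apply_to_columns(df, ['Age', 'Salary'], lambda x: x * 2)

print(df['Age'].tolist())     # [56, 64]
print(df['Salary'].tolist())  # [100000, 120000]
print(result)                 # None
```

The helper works because column assignment changes the DataFrame object that the caller passed in, so no return value is needed.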
