How to Subtract Two Columns in Pandas DataFrame?
When working with Pandas DataFrames, you often need to perform arithmetic operations between columns. One common operation is subtracting two columns. This guide explores four different methods to subtract columns in a Pandas DataFrame, from simple arithmetic operators to specialized Pandas functions.
Method 1: Using Simple Arithmetic Operator (-)
The most straightforward approach is using the standard subtraction operator. This method is intuitive and commonly used for basic column operations.
Syntax
result = dataframe['column1'] - dataframe['column2']
Example
Let's create a DataFrame with sales and costs data and calculate the profit by subtracting costs from sales:
import pandas as pd
data = {'Sales': [100, 200, 300, 400], 'Costs': [30, 80, 150, 200]}
df = pd.DataFrame(data)
df['Profit'] = df['Sales'] - df['Costs']
print(df)
   Sales  Costs  Profit
0    100     30      70
1    200     80     120
2    300    150     150
3    400    200     200
Method 2: Using the sub() Method
The sub() method is the method form of the - operator. It returns the same result for complete data, and it also accepts a fill_value argument that substitutes a value for missing entries (NaN) before the subtraction is performed.
Syntax
result = dataframe['column1'].sub(dataframe['column2'])
Example
import pandas as pd
data = {'Sales': [100, 200, 300, 400], 'Costs': [30, 80, 150, 200]}
df = pd.DataFrame(data)
df['Profit'] = df['Sales'].sub(df['Costs'])
print(df)
   Sales  Costs  Profit
0    100     30      70
1    200     80     120
2    300    150     150
3    400    200     200
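The handling of missing values mentioned above can be sketched as follows. The data is hypothetical (one cost is deliberately set to NaN), and the column names Profit_operator and Profit_filled are chosen only for illustration. The sketch contrasts the plain operator, which propagates NaN, with sub() and its fill_value parameter.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second cost value is missing (NaN)
df = pd.DataFrame({'Sales': [100, 200, 300], 'Costs': [30, np.nan, 150]})

# The plain operator propagates the missing value into the result
df['Profit_operator'] = df['Sales'] - df['Costs']

# sub() with fill_value=0 treats the missing cost as 0 before subtracting
df['Profit_filled'] = df['Sales'].sub(df['Costs'], fill_value=0)

print(df)
```

Whether 0 is a sensible replacement depends on the data. If a missing cost really means "unknown", leaving the result as NaN may be the safer choice.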
Method 3: Using apply() with Lambda Function
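The apply() method with a lambda function, called with axis=1, runs a Python function once per row. This makes it flexible enough for custom calculations that go beyond a single arithmetic expression, but noticeably slower than the vectorized methods on large DataFrames.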
Syntax
result = dataframe.apply(lambda row: row['column1'] - row['column2'], axis=1)
Example
import pandas as pd
data = {'Sales': [100, 200, 300, 400], 'Costs': [30, 80, 150, 200]}
df = pd.DataFrame(data)
df['Profit'] = df.apply(lambda row: row['Sales'] - row['Costs'], axis=1)
print(df)
   Sales  Costs  Profit
0    100     30      70
1    200     80     120
2    300    150     150
3    400    200     200
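For a plain difference, apply() offers nothing over the operator. It becomes useful when each row needs extra logic. The sketch below assumes a made-up business rule (a flat fee of 10 deducted only when Sales exceed 250); the function name profit_after_fee is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Sales': [100, 200, 300, 400], 'Costs': [30, 80, 150, 200]})

def profit_after_fee(row):
    # Hypothetical rule: deduct a flat fee of 10 only when Sales exceed 250
    profit = row['Sales'] - row['Costs']
    if row['Sales'] > 250:
        profit -= 10
    return profit

# axis=1 passes each row to the function as a Series
df['Profit'] = df.apply(profit_after_fee, axis=1)
print(df)
```

With this rule the last two rows end up 10 lower than the plain difference (140 and 190 instead of 150 and 200).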
Method 4: Using the subtract() Method
The subtract() method is an alias of sub(): it accepts the same parameters (including fill_value for missing values) and returns the same result. Which of the two you use is a matter of readability.
Syntax
result = dataframe['column1'].subtract(dataframe['column2'])
Example
import pandas as pd
data = {'Sales': [100, 200, 300, 400], 'Costs': [30, 80, 150, 200]}
df = pd.DataFrame(data)
df['Profit'] = df['Sales'].subtract(df['Costs'])
print(df)
   Sales  Costs  Profit
0    100     30      70
1    200     80     120
2    300    150     150
3    400    200     200
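A quick way to see how subtract() relates to sub() is to run both on the same columns and compare the results. This is a minimal check on the sample data used above; the variable names are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Sales': [100, 200, 300, 400], 'Costs': [30, 80, 150, 200]})

with_sub = df['Sales'].sub(df['Costs'])
with_subtract = df['Sales'].subtract(df['Costs'])

# Series.equals() compares both the values and the dtype
print(with_sub.equals(with_subtract))
```

On this data the comparison prints True, so the choice between the two names does not affect the result.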
Comparison
| Method | Performance | Best For |
|---|---|---|
| Arithmetic Operator (-) | Fastest | Simple, straightforward operations |
| sub() Method | Fast | Filling missing values with fill_value |
| apply() with Lambda | Slower | Complex row-wise calculations |
| subtract() Method | Fast | Same as sub() (it is an alias) |
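The performance column can be checked on your own machine with a rough timing sketch like the one below. The data size (10,000 rows), the random values, and the number of repetitions are arbitrary assumptions. Absolute timings vary with hardware and the pandas version, so only the relative ordering is of interest.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark data: 10,000 rows of random integers
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'Sales': rng.integers(100, 1000, size=10_000),
    'Costs': rng.integers(0, 100, size=10_000),
})

# Each candidate computes the same Sales - Costs difference
candidates = {
    'operator': lambda: df['Sales'] - df['Costs'],
    'sub()': lambda: df['Sales'].sub(df['Costs']),
    'subtract()': lambda: df['Sales'].subtract(df['Costs']),
    'apply()': lambda: df.apply(lambda row: row['Sales'] - row['Costs'], axis=1),
}

for name, func in candidates.items():
    seconds = timeit.timeit(func, number=3)
    print(f'{name:<12} {seconds:.4f} s for 3 runs')
```

The three vectorized variants operate on whole columns at once, while apply() with axis=1 calls a Python function for every row, which is why it is typically the slowest of the four.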
Conclusion
For simple column subtraction, use the arithmetic operator (-): it is the most readable option and, like sub() and subtract(), it is vectorized. Use sub() or its alias subtract() when you need to fill missing values with fill_value, and reserve apply() with a lambda for row-wise logic that cannot be expressed as plain column arithmetic.
