Python Pandas - DateOffset



The DateOffset object in Pandas is a powerful tool for managing date and time shifts based on calendar rules, business days, holidays, and more. Which works similar to a Timedelta, but with awareness of the calendar duration rules.

This means that while Timedelta shifts are based purely on a fixed number of seconds, hours, or days, whereas DateOffset objects take into account specific calendar-based rules, such as daylight saving time, business days, different month lengths, holidays, and various frequency aliases.

In this tutorial, we will learn about how to use DateOffset in Pandas, including its various subclasses and applications.

Difference Between Timedelta and DateOffset

Both DateOffset and Timedelta are used to perform date and time arithmetic, Timedelta shifts times purely based on fixed time units, while DateOffset respects the calendar system (e.g., months, quarters, or business days). This makes DateOffset particularly useful for working with business calendars, or any situation where you need to consider the varying month lengths, leap years, and holidays. Here are their key differences −

  • Timedelta: Represents a fixed duration (e.g., 1 day = 24 hours), ignoring calendar specifics.

  • DateOffset: Takes into account calendar-based shifts (e.g., the next day considering daylight saving time or the next business day).

Example

The following example demonstrates the difference between the DateOffset and Timedelta.

import pandas as pd

# A timestamp that falls within a daylight savings transition
ts = pd.Timestamp("2016-10-30 00:00:00", tz="Europe/Helsinki")

# Display the Input Timestamp
print("Input Timestamp:")
print(ts)

# Using Timedelta to increment date by 1 day
print('\nOutput Timestamp after increment it by 1 day using the "Timedelta":') 
print( ts + pd.Timedelta(days=1))

# Using DateOffset to increment date by 1 day
print('\nOutput Timestamp after increment it by 1 day using the "DateOffset":') 
print(ts + pd.DateOffset(days=1))

Following is the output of the above code −

Input Timestamp:
2016-10-30 00:00:00+03:00

Output Timestamp after increment it by 1 day using the "Timedelta":
2016-10-30 23:00:00+02:00

Output Timestamp after increment it by 1 day using the "DateOffset":
2016-10-31 00:00:00+02:00

From the above output you can confirm that, DateOffset correctly adjusts for the daylight saving time shift, while Timedelta simply adds 24 hours.

DateOffset Subclasses

Pandas provides various subclasses of DateOffset for handling different types of time shifts. Each subclass is associated with a frequency string or alias, which defines how the offset behaves. Some of the commonly used subclasses include −

S.No Date Offset Frequency String Description
1 DateOffset None Default time interval, typically 24 hours
2 BDay or BusinessDay B Adjusts for business days (weekdays only)
3 CDay or CustomBusinessDay C Custom business days
4 Week W One week, optionally anchored on a day of the week
5 WeekOfMonth WOM The x-th day of the y-th week of each month
6 LastWeekOfMonth LWOM The x-th day of the last week of each month
7 MonthEnd ME End of the calendar month
8 MonthBegin MS Start of the calendar month
9 BMonthEnd or BusinessMonthEnd BME End of the business month
10 BMonthBegin or BusinessMonthBegin BMS Start of the business month
11 CBMonthEnd or CustomBusinessMonthEnd CBME Customized business month end
12 CBMonthBegin or CustomBusinessMonthBegin CBMS Customized business month begin
13 SemiMonthEnd SME 15th (or other day_of_month) and calendar month end
14 SemiMonthBegin SMS 15th (or other day_of_month) and calendar month begin
15 QuarterEnd QE End of a calendar quarter
16 QuarterBegin QS Beginning of a calendar quarter
17 BQuarterEnd BQE End of the quarter on a business day
18 BQuarterBegin BQS Start of the quarter on a business day
19 FY5253Quarter REQ Retail (aka 52-53 week) quarter
20 YearEnd YE End of the calendar year
21 YearBegin YS or BYS Start of the calendar year
22 BYearEnd BYE Business the year begin
23 BYearBegin BYS Business the year begin
24 FY5253 RE Retail (aka 52-53 week) year
25 Easter None Easter holiday
26 BusinessHour bh Single business hour
27 CustomBusinessHour cbh Customized interval of business hours
28 Day D One absolute day
29 Hour h One hourly interval
30 Minute min One-minute interval
31 Second s One-second interval
32 Milli ms One millisecond interval
33 Micro us One microsecond interval
34 Nano ns One nanosecond interval

Example

The following example demonstrates using the BusinessDay offset for shifting by 2 business days from the input date.

import pandas as pd

# Define a timestamp on a Friday
ts = pd.Timestamp("2024-11-05")

# Use the BusinessDay offset to shift by 2 business days
two_business_days = 2 * pd.offsets.BDay()

# Shift the date
shifted_date = ts + two_business_days
print("Output Day after shifting to the 2 BusinessDays:", shifted_date)

Following is the output of the above code −

Output Day after shifting to the 2 BusinessDays: 2024-11-07 00:00:00

Using the DateOffset Class Directly

The DateOffset() class can be applied directly to Pandas date time related objects. By using the arithmetic operators with the DateOffset, you can adjust the timestamp by a specific period, whether it's days, months, or years.

Example

The following example demonstrates adding a timestamp to the 3 months DateOffset using the addition operator.

import pandas as pd

# Define a timestamp on a Friday
ts = pd.Timestamp("2024-11-05")

# Add 3 months to the timestamp
result = ts + pd.DateOffset(months=3)
print('Output after adding a DateOffset with 3 months:', result)

Following is the output of the above code −

Output after adding a DateOffset with 3 months: 2025-02-05 00:00:00

Rollforward and Rollback Methods

Pandas provides two methods for rolling dates, which are rollforward() and rollback(). These methods are useful when you want to move a date forward or backward to a valid offset, such as the next business day.

Example: Using the rollforward() method

The following example demonstrates the working of the rollforward() method.

import pandas as pd

# Define a timestamp on a Friday
ts = pd.Timestamp("2024-11-02")

# Move to next weekday
offset = pd.offsets.BusinessDay() 

# Roll forward to the next valid business hour
result = offset.rollforward(ts)

print('Output after rolling forward:', result)

Following is the output of the above code −

Output after rolling forward: 2024-11-04 00:00:00

Example: Using the rollback() method

The following example demonstrates the working of the rollback() method.

import pandas as pd

# Define a timestamp on a Friday
ts = pd.Timestamp("2024-11-03")

# Move to next weekday
offset = pd.offsets.BusinessDay() 

# Roll forward to the next valid business hour
result = offset.rollback(ts)

print('Output after rolling backward:', result)

Following is the output of the above code −

Output after rolling backward: 2024-11-01 00:00:00

Parameterized Offsets

Some DateOffset objects can be parameterized to modify their behavior. For example, the Week offset allows you to specify the weekday on which the week should end, and the YearEnd offset can accept a month parameter to specify the month that marks the end of the year.

Example

Here is the example of using the parameterized offsets object.

import pandas as pd

# Define a timestamp on a Friday
ts = pd.Timestamp("2024-11-03")

# Shift a date to the Weekday 4
result = ts + pd.offsets.Week(weekday=4)

print('Output after applying the Parameterized Offsets:', result)

Following is the output of the above code −

Output after applying the Parameterized Offsets: 2024-11-08 00:00:00

Using DateOffsets with Pandas Series or DatetimeIndex

The DateOffset can be used not only with individual Timestamp objects but also with Series or DatetimeIndex to apply an offset to each element.

Example: DateOffsets with DatetimeIndex

The following example uses the DateOffset with the DatetimeIndex.

import pandas as pd

# Create a DatetimeIndex
rng = pd.date_range("2024-01-01", "2024-01-03")

# Shift the entire range by 2 months
result = rng + pd.DateOffset(months=2)
print("Generated DatetimeIndex with the DateOffset:")
print(result)

Following is the output of the above code −

Generated DatetimeIndex with the DateOffset:
DatetimeIndex(['2024-03-01', '2024-03-02', '2024-03-03'], dtype='datetime64[ns]', freq=None)

Example: DateOffsets with Series

The following example demonstrates using the DateOffset with the Series.

import pandas as pd

# Create a Series of timestamps
s = pd.Series(pd.date_range("2024-01-01", "2024-01-03"))

# Shift each element in the Series by 2 months
result = s + pd.DateOffset(months=2)

print("Generated Series with the DateOffset:")
print(result)

Following is the output of the above code −

Generated Series with the DateOffset:
0   2024-03-01
1   2024-03-02
2   2024-03-03
dtype: datetime64[ns]
Advertisements