Article Categories
- All Categories
-
Data Structure
-
Networking
-
RDBMS
-
Operating System
-
Java
-
MS Excel
-
iOS
-
HTML
-
CSS
-
Android
-
Python
-
C Programming
-
C++
-
C#
-
MongoDB
-
MySQL
-
Javascript
-
PHP
-
Economics & Finance
Selected Reading
Python - How to Group Pandas DataFrame by Days?
We can group a Pandas DataFrame by days using groupby() with the Grouper function. This allows us to aggregate data over specific day intervals, such as grouping car sales data by 7-day periods.
Creating Sample Data
Let's create a DataFrame with car sales data including purchase dates and registration prices ?
import pandas as pd
# DataFrame with car sales data
dataFrame = pd.DataFrame(
{
"Car": ["Audi", "Lexus", "Tesla", "Mercedes", "BMW", "Toyota", "Nissan", "Bentley", "Mustang"],
"Date_of_Purchase": [
pd.Timestamp("2021-06-10"),
pd.Timestamp("2021-07-11"),
pd.Timestamp("2021-06-25"),
pd.Timestamp("2021-06-29"),
pd.Timestamp("2021-03-20"),
pd.Timestamp("2021-01-22"),
pd.Timestamp("2021-01-06"),
pd.Timestamp("2021-01-04"),
pd.Timestamp("2021-05-09")
],
"Reg_Price": [1000, 1400, 1100, 900, 1700, 1800, 1300, 1150, 1350]
}
)
print("DataFrame...")
print(dataFrame)
DataFrame...
Car Date_of_Purchase Reg_Price
0 Audi 2021-06-10 1000
1 Lexus 2021-07-11 1400
2 Tesla 2021-06-25 1100
3 Mercedes 2021-06-29 900
4 BMW 2021-03-20 1700
5 Toyota 2021-01-22 1800
6 Nissan 2021-01-06 1300
7 Bentley 2021-01-04 1150
8 Mustang 2021-05-09 1350
Grouping by 7-Day Intervals
Use pd.Grouper() with freq='7D' to group data by 7-day intervals and calculate the sum ?
import pandas as pd
# DataFrame with car sales data
dataFrame = pd.DataFrame(
{
"Car": ["Audi", "Lexus", "Tesla", "Mercedes", "BMW", "Toyota", "Nissan", "Bentley", "Mustang"],
"Date_of_Purchase": [
pd.Timestamp("2021-06-10"),
pd.Timestamp("2021-07-11"),
pd.Timestamp("2021-06-25"),
pd.Timestamp("2021-06-29"),
pd.Timestamp("2021-03-20"),
pd.Timestamp("2021-01-22"),
pd.Timestamp("2021-01-06"),
pd.Timestamp("2021-01-04"),
pd.Timestamp("2021-05-09")
],
"Reg_Price": [1000, 1400, 1100, 900, 1700, 1800, 1300, 1150, 1350]
}
)
# Group by 7-day intervals and sum registration prices
grouped = dataFrame.groupby(pd.Grouper(key='Date_of_Purchase', freq='7D')).sum()
print("\nGroup DataFrame by 7 days...")
print(grouped)
Group DataFrame by 7 days...
Reg_Price
Date_of_Purchase
2021-01-04 2450.0
2021-01-18 1800.0
2021-03-15 1700.0
2021-05-03 1350.0
2021-06-07 1000.0
2021-06-21 1100.0
2021-06-28 900.0
2021-07-05 1400.0
How pd.Grouper Works
The pd.Grouper() function creates time-based groups:
- key: The date column to group by
- freq: The frequency interval (7D = 7 days)
- axis: 0 for rows (default)
Different Day Intervals
You can use different frequencies for grouping ?
import pandas as pd
dataFrame = pd.DataFrame(
{
"Car": ["Audi", "Lexus", "Tesla", "Mercedes", "BMW"],
"Date_of_Purchase": [
pd.Timestamp("2021-06-10"),
pd.Timestamp("2021-06-15"),
pd.Timestamp("2021-06-25"),
pd.Timestamp("2021-06-29"),
pd.Timestamp("2021-07-05")
],
"Reg_Price": [1000, 1400, 1100, 900, 1700]
}
)
# Group by 14-day intervals
grouped_14d = dataFrame.groupby(pd.Grouper(key='Date_of_Purchase', freq='14D')).sum()
print("Group by 14 days:")
print(grouped_14d)
# Group by 30-day intervals
grouped_30d = dataFrame.groupby(pd.Grouper(key='Date_of_Purchase', freq='30D')).sum()
print("\nGroup by 30 days:")
print(grouped_30d)
Group by 14 days:
Reg_Price
Date_of_Purchase
2021-06-07 2400.0
2021-06-21 2000.0
2021-07-05 1700.0
Group by 30 days:
Reg_Price
Date_of_Purchase
2021-06-07 4400.0
2021-07-05 1700.0
Conclusion
Use pd.Grouper() with frequency parameters like '7D', '14D', or '30D' to group DataFrames by day intervals. This is useful for time-series analysis and aggregating data over specific periods.
Advertisements
