Article Categories
- All Categories
-
Data Structure
-
Networking
-
RDBMS
-
Operating System
-
Java
-
MS Excel
-
iOS
-
HTML
-
CSS
-
Android
-
Python
-
C Programming
-
C++
-
C#
-
MongoDB
-
MySQL
-
Javascript
-
PHP
-
Economics & Finance
How to Group Pandas DataFrame By Date and Time?
In data analysis and manipulation, grouping data by date and time is essential for temporal aggregations and extracting time-based insights. Pandas provides powerful tools to group DataFrames by various time frequencies using pd.Grouper().
Syntax
The basic syntax for grouping a DataFrame by date and time ?
dataframe.groupby(pd.Grouper(key='column_name', freq='frequency')).operation()
Where dataframe is the Pandas DataFrame object, column_name is the datetime column, freq specifies the grouping frequency (e.g., 'D' for daily, 'M' for monthly, 'H' for hourly), and operation() is the aggregation function.
Algorithm
Follow these steps to group a Pandas DataFrame by date and time ?
Import libraries Import pandas for data manipulation
Load data Create or load DataFrame with datetime data
Convert to datetime Use
pd.to_datetime()to ensure proper datetime formatGroup by frequency Use
groupby()withpd.Grouper()Apply aggregation Perform operations like sum(), mean(), count()
Sample Data
Let's create sample data to demonstrate the grouping techniques ?
import pandas as pd
import numpy as np
# Create sample data
data = {
'datetime': [
'2023-07-01 08:00:00', '2023-07-01 12:00:00',
'2023-07-02 09:00:00', '2023-07-02 14:00:00',
'2023-07-03 10:00:00', '2023-07-03 16:00:00',
'2023-07-04 11:00:00', '2023-07-04 18:00:00'
],
'value': [10, 5, 7, 3, 8, 2, 6, 4],
'category': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
}
df = pd.DataFrame(data)
df['datetime'] = pd.to_datetime(df['datetime'])
print(df)
datetime value category
0 2023-07-01 08:00:00 10 A
1 2023-07-01 12:00:00 5 B
2 2023-07-02 09:00:00 7 A
3 2023-07-02 14:00:00 3 B
4 2023-07-03 10:00:00 8 A
5 2023-07-03 16:00:00 2 B
6 2023-07-04 11:00:00 6 A
7 2023-07-04 18:00:00 4 B
Grouping by Daily Frequency
Group the DataFrame by daily frequency and calculate the sum of values for each day ?
import pandas as pd
# Create sample data
data = {
'datetime': [
'2023-07-01 08:00:00', '2023-07-01 12:00:00',
'2023-07-02 09:00:00', '2023-07-02 14:00:00',
'2023-07-03 10:00:00', '2023-07-03 16:00:00',
'2023-07-04 11:00:00', '2023-07-04 18:00:00'
],
'value': [10, 5, 7, 3, 8, 2, 6, 4]
}
df = pd.DataFrame(data)
df['datetime'] = pd.to_datetime(df['datetime'])
# Group by daily frequency and sum values
daily_grouped = df.groupby(pd.Grouper(key='datetime', freq='D'))['value'].sum()
print(daily_grouped)
datetime 2023-07-01 15 2023-07-02 10 2023-07-03 10 2023-07-04 10 Freq: D, Name: value, dtype: int64
Grouping by Hourly Frequency
Group the DataFrame by hourly frequency and calculate the mean of values for each hour ?
import pandas as pd
# Create sample data with more hourly entries
data = {
'datetime': [
'2023-07-01 08:00:00', '2023-07-01 09:00:00',
'2023-07-01 09:30:00', '2023-07-01 10:00:00',
'2023-07-01 10:15:00', '2023-07-01 11:00:00'
],
'value': [10, 15, 12, 8, 6, 9]
}
df = pd.DataFrame(data)
df['datetime'] = pd.to_datetime(df['datetime'])
# Group by hourly frequency and calculate mean
hourly_grouped = df.groupby(pd.Grouper(key='datetime', freq='H'))['value'].mean()
print(hourly_grouped)
datetime 2023-07-01 08:00:00 10.0 2023-07-01 09:00:00 13.5 2023-07-01 10:00:00 7.0 2023-07-01 11:00:00 9.0 Freq: H, Name: value, dtype: float64
Multiple Aggregations
Perform multiple aggregation operations on grouped data ?
import pandas as pd
# Create sample data
data = {
'datetime': [
'2023-07-01 08:00:00', '2023-07-01 12:00:00',
'2023-07-02 09:00:00', '2023-07-02 14:00:00',
'2023-07-03 10:00:00', '2023-07-03 16:00:00'
],
'value': [10, 5, 7, 3, 8, 2]
}
df = pd.DataFrame(data)
df['datetime'] = pd.to_datetime(df['datetime'])
# Group by daily frequency with multiple aggregations
result = df.groupby(pd.Grouper(key='datetime', freq='D'))['value'].agg(['sum', 'mean', 'count'])
print(result)
sum mean count
datetime
2023-07-01 15 7.5 2
2023-07-02 10 5.0 2
2023-07-03 10 5.0 2
Common Frequency Options
| Frequency | Description | Example |
|---|---|---|
'D' |
Daily | Group by calendar day |
'H' |
Hourly | Group by hour |
'M' |
Monthly | Group by month end |
'W' |
Weekly | Group by week |
'Q' |
Quarterly | Group by quarter |
Conclusion
Pandas provides flexible tools for grouping DataFrames by date and time using pd.Grouper() with various frequency options. This enables efficient temporal aggregations and time-series analysis for extracting meaningful insights from your data.
