How to plot a pandas DataFrame aggregated by date?
When working with time-series data in pandas, you often need to aggregate data by date and visualize the results. This involves grouping data by date periods and plotting the aggregated values using matplotlib.
Basic Date Aggregation and Plotting
Start by creating a sample DataFrame with a date column and a value column:
import numpy as np
import pandas as pd
# Create sample data with dates and values
dates = pd.date_range("2021-01-01", periods=10, freq='D')
values = np.random.randint(10, 100, 10)
df = pd.DataFrame({'date': dates, 'value': values})
print("Original DataFrame:")
print(df.head())
Output (the values are random, so yours will differ):
Original DataFrame:
        date  value
0 2021-01-01     64
1 2021-01-02     52
2 2021-01-03     93
3 2021-01-04     43
4 2021-01-05     29
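The listing above only builds the data, so here is a minimal sketch of plotting it directly with matplotlib. The fixed seed and the axis labels are assumptions added for illustration; they are not part of the original example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Assumption: a fixed seed so the figure is reproducible
rng = np.random.default_rng(0)
dates = pd.date_range("2021-01-01", periods=10, freq="D")
df = pd.DataFrame({"date": dates, "value": rng.integers(10, 100, size=10)})

# One row per day, so no aggregation is needed before plotting
fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(df["date"], df["value"], marker="o")
ax.set_xlabel("Date")
ax.set_ylabel("Value")
ax.set_title("Daily values")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
plt.show()
```

Because each date appears only once here, grouping would leave the values unchanged. Aggregation matters once several rows share a date.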
Aggregating Data by Date
When several rows share a calendar date, group by the date part of the timestamp and apply an aggregation function such as sum():
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Create sample data with multiple entries per date
dates = pd.date_range("2021-01-01", periods=20, freq='12h')  # Every 12 hours (uppercase 'H' is deprecated in pandas 2.2+)
values = np.random.randint(10, 100, 20)
df = pd.DataFrame({'date': dates, 'value': values})
# Aggregate by date (sum values for same date)
daily_agg = df.groupby(df['date'].dt.date)['value'].sum().reset_index()
daily_agg['date'] = pd.to_datetime(daily_agg['date'])
print("Aggregated DataFrame:")
print(daily_agg.head())
Output (the values are random, so yours will differ):
Aggregated DataFrame:
        date  value
0 2021-01-01    108
1 2021-01-02    156
2 2021-01-03    127
3 2021-01-04    145
4 2021-01-05    163
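As an alternative to grouping on `.dt.date`, pandas also offers `resample()` on a datetime index. The sketch below, which assumes a fixed seed for reproducibility, computes the daily totals both ways and checks that they agree.

```python
import numpy as np
import pandas as pd

# Assumption: a fixed seed so both approaches see the same data
rng = np.random.default_rng(0)
dates = pd.date_range("2021-01-01", periods=20, freq="12h")
df = pd.DataFrame({"date": dates, "value": rng.integers(10, 100, size=20)})

# groupby on the date part of the timestamp
by_groupby = df.groupby(df["date"].dt.date)["value"].sum()

# resample on a DatetimeIndex, using daily bins
by_resample = df.set_index("date")["value"].resample("D").sum()

same = bool((by_groupby.to_numpy() == by_resample.to_numpy()).all())
print(by_resample.head())
print("Same totals:", same)
```

One difference to keep in mind: `resample("D").sum()` emits a row with 0 for any day that has no data inside the range, whereas `groupby` simply omits that day. The sample data here has no gaps, so the two results match.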
Plotting the Aggregated Data
Plot the aggregated data as a line chart and as a bar chart:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Create and aggregate data
dates = pd.date_range("2021-01-01", periods=20, freq='12h')  # uppercase 'H' is deprecated in pandas 2.2+
values = np.random.randint(10, 100, 20)
df = pd.DataFrame({'date': dates, 'value': values})
daily_agg = df.groupby(df['date'].dt.date)['value'].sum().reset_index()
daily_agg['date'] = pd.to_datetime(daily_agg['date'])
# Create subplots
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 8))
# Line plot
daily_agg.plot(x='date', y='value', kind='line', ax=ax1, marker='o')
ax1.set_title('Daily Aggregated Values - Line Plot')
ax1.set_ylabel('Value')
# Bar plot
daily_agg.plot(x='date', y='value', kind='bar', ax=ax2, color='skyblue')
ax2.set_title('Daily Aggregated Values - Bar Plot')
ax2.set_ylabel('Value')
ax2.tick_params(axis='x', rotation=45)
plt.tight_layout()
plt.show()
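With `kind='bar'`, pandas treats the x values as categories and tends to label each bar with the full timestamp, including the time of day. A possible workaround, sketched below under the assumption that a `YYYY-MM-DD` label is what you want, is to format the dates yourself and set them as tick labels.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Assumption: a fixed seed for reproducible sample data
rng = np.random.default_rng(0)
dates = pd.date_range("2021-01-01", periods=20, freq="12h")
df = pd.DataFrame({"date": dates, "value": rng.integers(10, 100, size=20)})
daily_agg = df.groupby(df["date"].dt.date)["value"].sum().reset_index()
daily_agg["date"] = pd.to_datetime(daily_agg["date"])

fig, ax = plt.subplots(figsize=(12, 4))
daily_agg.plot(x="date", y="value", kind="bar", ax=ax,
               color="skyblue", legend=False)

# Replace the default labels with short date strings, one per bar
labels = daily_agg["date"].dt.strftime("%Y-%m-%d").tolist()
ax.set_xticklabels(labels, rotation=45, ha="right")
ax.set_xlabel("Date")
ax.set_ylabel("Value")
plt.tight_layout()
plt.show()
```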
Advanced Aggregation Methods
You can apply several aggregation functions at once and group by longer periods, such as months:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Create sample data spanning multiple months
dates = pd.date_range("2021-01-01", periods=90, freq='D')
values = np.random.randint(50, 200, 90)
df = pd.DataFrame({'date': dates, 'value': values})
# Multiple aggregation methods
monthly_agg = df.groupby(df['date'].dt.to_period('M'))['value'].agg(['sum', 'mean', 'count']).reset_index()
monthly_agg['date'] = monthly_agg['date'].dt.to_timestamp()
print("Monthly Aggregation:")
print(monthly_agg)
# Plot multiple aggregations
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
monthly_agg.plot(x='date', y='sum', kind='bar', ax=axes[0], color='lightcoral')
axes[0].set_title('Monthly Sum')
monthly_agg.plot(x='date', y='mean', kind='bar', ax=axes[1], color='lightblue')
axes[1].set_title('Monthly Average')
monthly_agg.plot(x='date', y='count', kind='bar', ax=axes[2], color='lightgreen')
axes[2].set_title('Monthly Count')
plt.tight_layout()
plt.show()
Output (the values are random, so yours will differ):
Monthly Aggregation:
        date   sum        mean  count
0 2021-01-01  3823  123.322581     31
1 2021-02-01  3234  115.500000     28
2 2021-03-01  3756  121.161290     31
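The same pattern extends to other periods. As a sketch, weekly aggregation could use `pd.Grouper` with `freq="W"`, which bins by weeks ending on Sunday and labels each bin with that Sunday. The fixed seed is an assumption added for reproducibility.

```python
import numpy as np
import pandas as pd

# Assumption: a fixed seed for reproducible sample data
rng = np.random.default_rng(0)
dates = pd.date_range("2021-01-01", periods=90, freq="D")
df = pd.DataFrame({"date": dates, "value": rng.integers(50, 200, size=90)})

# Group on the 'date' column in weekly bins (weeks end on Sunday)
weekly_agg = (
    df.groupby(pd.Grouper(key="date", freq="W"))["value"]
      .agg(["sum", "mean", "count"])
      .reset_index()
)
print(weekly_agg.head())
```

The first and last weeks are partial because the data starts on a Friday and ends mid-week, so their counts are lower than 7. The `count` column is a quick way to spot such incomplete periods before comparing sums.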
Comparison of Aggregation Methods
| Method | Use Case | Example |
|---|---|---|
| sum() | Total values per period | Daily sales totals |
| mean() | Average values per period | Average temperature per day |
| count() | Number of records per period | Number of transactions per day |
| max() | Peak values per period | Highest stock price per day |
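The four methods in the table can be computed in a single pass with named aggregation. In the sketch below, the output column names (`total`, `average`, `records`, `peak`) are illustrative choices, and the fixed seed is an assumption.

```python
import numpy as np
import pandas as pd

# Assumption: a fixed seed for reproducible sample data
rng = np.random.default_rng(0)
dates = pd.date_range("2021-01-01", periods=20, freq="12h")
df = pd.DataFrame({"date": dates, "value": rng.integers(10, 100, size=20)})

# Named aggregation: keyword = output column, value = aggregation function
daily_stats = df.groupby(df["date"].dt.date)["value"].agg(
    total="sum",
    average="mean",
    records="count",
    peak="max",
)
print(daily_stats.head())
```

Each keyword becomes a column in the result, so the output can be passed straight to `plot()` with `y='total'`, `y='peak'`, and so on.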
Conclusion
To plot pandas data aggregated by date, group on the date column with groupby() (using .dt.date for days or .dt.to_period() for longer periods such as months), apply an aggregation function such as sum() or mean(), and pass the result to plot() with kind='line' or kind='bar'.
