Write a program to truncate DataFrame time series data based on index value
When working with time series data in pandas, you often need to extract a specific date range from your DataFrame. The truncate() method allows you to filter data based on index values, making it useful for time-based filtering.
Understanding DataFrame Truncation
The truncate() method keeps the rows whose index values fall between the before and after parameters, with both boundaries included, and drops everything outside that range. For time series data, this is particularly useful when your DataFrame has a datetime index.
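As a minimal sketch of how before and after behave, the snippet below uses a small, hypothetical DataFrame of daily readings (the `readings` name and its values are invented for illustration). It suggests that both boundaries are kept and that leaving out one parameter leaves that side of the range open.

```python
import pandas as pd

# Hypothetical sample data: five daily readings on a DatetimeIndex.
idx = pd.date_range("2020-01-01", periods=5, freq="D")
readings = pd.DataFrame({"value": [10, 20, 30, 40, 50]}, index=idx)

# Keep rows from 2020-01-02 through 2020-01-04; both boundary rows are kept.
middle = readings.truncate(before="2020-01-02", after="2020-01-04")

# Omitting `after` keeps everything from `before` to the end of the index.
tail = readings.truncate(before="2020-01-04")

print(middle)
print(tail)
```

Note that truncate() returns a new object here, so `readings` itself is left unchanged.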
Creating Sample Time Series Data
Let's start by creating a DataFrame with time series data:
import pandas as pd
# Create DataFrame with ID column
data = {'Id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}
df = pd.DataFrame(data)
# Generate 10 weekly dates; freq='W' anchors on Sundays, so the first date on or after 2020-01-01 is 2020-01-05
df['time_series'] = pd.date_range('01/01/2020', periods=10, freq='W')
print("Original DataFrame:")
print(df)
Original DataFrame:
   Id time_series
0   1  2020-01-05
1   2  2020-01-12
2   3  2020-01-19
3   4  2020-01-26
4   5  2020-02-02
5   6  2020-02-09
6   7  2020-02-16
7   8  2020-02-23
8   9  2020-03-01
9  10  2020-03-08
Method 1: Using truncate() with Index-based Filtering
The truncate() method works on the DataFrame index. To filter by date values, we need to set the time series column as the index:
import pandas as pd
# Create DataFrame with time series data
data = {'Id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}
df = pd.DataFrame(data)
df['time_series'] = pd.date_range('01/01/2020', periods=10, freq='W')
# Set time_series as index for truncation
df_indexed = df.set_index('time_series')
# Keep only rows from 2020-01-12 through 2020-01-12 (both bounds inclusive), which leaves a single row
result = df_indexed.truncate(before='2020-01-12', after='2020-01-12')
print("Truncated DataFrame:")
print(result)
Truncated DataFrame:
Id
time_series
2020-01-12 2
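The example above uses the same date for both boundaries, so only one row survives. As a hedged sketch of the more common case, the snippet below truncates to a wider window on the same sample data. It also sorts a shuffled copy first, because truncate() expects a sorted index; the `shuffled` frame is a hypothetical stand-in for data that arrives out of order.

```python
import pandas as pd

# Rebuild the sample data used in this article.
df = pd.DataFrame({"Id": range(1, 11)})
df["time_series"] = pd.date_range("2020-01-01", periods=10, freq="W")
df_indexed = df.set_index("time_series")

# Keep the rows from 2020-01-12 through 2020-02-09, boundaries included.
window = df_indexed.truncate(before="2020-01-12", after="2020-02-09")
print(window)

# Hypothetical out-of-order data: sort the index before truncating.
shuffled = df_indexed.sample(frac=1, random_state=0)
window_from_shuffled = shuffled.sort_index().truncate(
    before="2020-01-12", after="2020-02-09"
)
print(window_from_shuffled.equals(window))
```

Calling sort_index() first is cheap insurance when the row order of the input is not guaranteed.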
Method 2: Using Boolean Indexing
For more flexible date filtering without changing the index, use boolean indexing:
import pandas as pd
# Create DataFrame with time series data
data = {'Id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}
df = pd.DataFrame(data)
df['time_series'] = pd.date_range('01/01/2020', periods=10, freq='W')
# Filter using boolean indexing
start_date = '2020-01-12'
end_date = '2020-02-09'
filtered_df = df[(df['time_series'] >= start_date) & (df['time_series'] <= end_date)]
print("Filtered DataFrame:")
print(filtered_df)
Filtered DataFrame:
   Id time_series
1   2  2020-01-12
2   3  2020-01-19
3   4  2020-01-26
4   5  2020-02-02
5   6  2020-02-09
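The two comparisons in the mask can also be written with Series.between(), which includes both endpoints by default. The sketch below is one possible variation on the same sample data, not a replacement for the approach above; the extra even-Id condition is an invented example of combining the date mask with a second column.

```python
import pandas as pd

# Rebuild the sample data used in this article.
df = pd.DataFrame({"Id": range(1, 11)})
df["time_series"] = pd.date_range("2020-01-01", periods=10, freq="W")

# Boolean mask: True where the date lies in the range, endpoints included.
in_range = df["time_series"].between("2020-01-12", "2020-02-09")
filtered = df[in_range]
print(filtered)

# Masks combine with &, so a condition on another column is easy to add.
even_ids = df[in_range & (df["Id"] % 2 == 0)]
print(even_ids)
```

Because the mask is an ordinary boolean Series, it can be stored, reused, or inverted with ~ as needed.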
Comparison of Methods
| Method | Requires Index Change? | Best For |
|---|---|---|
| truncate() | Yes | Simple start/end filtering |
| Boolean indexing | No | Complex conditions, multiple columns |
Conclusion
Use truncate() when you need simple date range filtering on a DataFrame that has a sorted datetime index. When you want to keep the existing index or combine several conditions, boolean indexing is the more flexible choice.
