Python - How to Group Pandas DataFrame by Month?

PythonServer Side ProgrammingProgramming

We will group Pandas DataFrame using the groupby. Select the column to be used using the grouper function. We will group month-wise and calculate sum of Registration Price monthly for our example shown below for Car Sale Records.

At first, let’s say the following is our Pandas DataFrame with three columns −

dataFrame = pd.DataFrame(
   {
      "Car": ["Audi", "Lexus", "Tesla", "Mercedes", "BMW", "Toyota", "Nissan", "Bentley", "Mustang"],

      "Date_of_Purchase": [
         pd.Timestamp("2021-06-10"),
         pd.Timestamp("2021-07-11"),
         pd.Timestamp("2021-06-25"),      
         pd.Timestamp("2021-06-29"),
         pd.Timestamp("2021-03-20"),
         pd.Timestamp("2021-01-22"),  
         pd.Timestamp("2021-01-06"),
         pd.Timestamp("2021-01-04"),
         pd.Timestamp("2021-05-09")
      ],
     "Reg_Price": [1000, 1400, 1100, 900, 1700, 1800, 1300, 1150, 1350]
   }
)

Use the Grouper to select Date_of_Purchase column within groupby() function. The frequency freq is set ‘M’ to group by month-wise −

print("\nGroup Dataframe by month...\n",dataFrame.groupby(pd.Grouper(key='Date_of_Purchase', axis=0, freq='M')).sum())

Example

Following is the code −

import pandas as pd

# dataframe with one of the columns as Date_of_Purchase
dataFrame = pd.DataFrame(
    {
      "Car": ["Audi", "Lexus", "Tesla", "Mercedes", "BMW", "Toyota", "Nissan", "Bentley", "Mustang"],

      "Date_of_Purchase": [
         pd.Timestamp("2021-06-10"),
         pd.Timestamp("2021-07-11"),
         pd.Timestamp("2021-06-25"),
         pd.Timestamp("2021-06-29"),
         pd.Timestamp("2021-03-20"),
         pd.Timestamp("2021-01-22"),
         pd.Timestamp("2021-01-06"),
         pd.Timestamp("2021-01-04"),
         pd.Timestamp("2021-05-09")
      ],

      "Reg_Price": [1000, 1400, 1100, 900, 1700, 1800, 1300, 1150, 1350]
   }
)

print"DataFrame...\n",dataFrame

# Grouper to select Date_of_Purchase column within groupby function
print"\nGroup Dataframe by month...\n",dataFrame.groupby(pd.Grouper(key='Date_of_Purchase', axis=0, freq='M')).sum()

Output

This will produce the following output. The Registration price of every month gets calculated −

DataFrame...
     Car     Date_of_Purchase     Reg_Price
0     Audi     2021-06-10            1000
1    Lexus     2021-07-11            1400
2    Tesla     2021-06-25            1100
3 Mercedes     2021-06-29             900
4      BMW     2021-03-20            1700
5   Toyota     2021-01-22            1800
6   Nissan     2021-01-06            1300
7  Bentley     2021-01-04            1150
8  Mustang     2021-05-09            1350
 Group Dataframe by month...
                    Reg_Price
Date_of_Purchase
2021-01-31             4250.0
2021-02-28                NaN
2021-03-31             1700.0
2021-04-30                NaN
2021-05-31             1350.0
2021-06-30             3000.0
2021-07-31             1400.0
raja
Published on 09-Sep-2021 11:24:52
Advertisements