How to Create a Pivot Table in Python using Pandas?

Python Pandas Server Side Programming Programming

A pivot table is a powerful data analysis tool that allows you to summarize and aggregate data based on different dimensions. In Python, you can create pivot tables using the pandas library, which provides flexible and efficient tools for data manipulation and analysis.

To create a pivot table in pandas, you first need to have a dataset in a pandas DataFrame. You can load data into a DataFrame from various sources such as CSV files, Excel spreadsheets, SQL databases, and more.

Once you have your data in a DataFrame, you can use the pandas pivot_table() function to create a pivot table. Here is its syntax ?

dataframe.pivot(self, index=None, columns=None, values=None, aggfunc)

The pivot_table() function takes several parameters, including the DataFrame to be used, the index column(s), the column(s) to be used as columns in the pivot table, and the values column(s) to be aggregated. You can also specify the aggregation function(s) to be used, such as sum, mean, max, min, etc.

Before we dig deep into working with the pivot and the pivot_table() function, let's first create a dataframe that we will make use of.

A DataFrame in Pandas

A DataFrame in pandas is a two-dimensional labelled data structure with columns of potentially different types. It is the primary data structure used in pandas for data manipulation and analysis.

A DataFrame can be thought of as a spreadsheet or a SQL table, with rows and columns. It allows for easy handling and manipulation of data, including indexing, selecting, filtering, merging, and grouping.

Consider the code shown below. This code creates a DataFrame object named df with four columns, 'Product', 'Category', 'Quantity', and 'Amount', using a Python dictionary. Each key of the dictionary corresponds to the name of the column, and its value is a list containing the values for that column.

Example

# importing pandas library
import pandas as pd

# creating a dataframe from a dictionary

# creating a column 'Product', 'Category', 'Quantity','Amount' with its values
df = pd.DataFrame({
   'Product': ['Litchi', 'Broccoli', 'Banana', 'Banana', 'Beans', 'Orange', 'Mango', 'Banana'],
   'Category': ['Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Fruit'],
   'Quantity': [8, 5, 3, 4, 5, 9, 11, 8],
   'Amount': [270, 239, 617, 384, 626, 610, 62, 90]
})

# print the dataframe
print(df)

Output

When you execute this code, it will produce the following output on the terminal ?

  Product  Category  Quantity Amount
0  Litchi   Fruit      8       270
1  Broccoli Vegetable  5       239
2  Banana   Fruit      3       617
3  Banana   Fruit      4       384
4  Beans    Vegetable  5       626
5  Orange   Fruit      9       610
6  Mango    Fruit      11      62
7  Banana   Fruit       8      90

Creating a Pivot Table Using Pandas

Now let's create a pivot table of total sales using the pivot_table() function. Consider the code shown below.

Example

# importing pandas library
import pandas as pd

# creating a dataframe from a dictionary

# creating a column 'Product', 'Category', 'Quantity','Amount' with its values
df = pd.DataFrame({
   'Product': ['Litchi', 'Broccoli', 'Banana', 'Banana', 'Beans', 'Orange', 'Mango', 'Banana'],
   'Category': ['Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Fruit'],
   'Quantity': [8, 5, 3, 4, 5, 9, 11, 8],
   'Amount': [270, 239, 617, 384, 626, 610, 62, 90]
})

# creating pivot table of total sales

# product-wise
pivot = df.pivot_table(index =['Product'], values =['Amount'], aggfunc ='sum')
print(pivot)

# print the dataframe
print(df)

Explanation

It creates a DataFrame object named df with four columns: 'Product', 'Category', 'Quantity', and 'Amount'. Each column has its own values, and they are created using a Python dictionary.
After that, the code creates a pivot table that groups the sales data by product, and calculates the total sales for each product using the pivot_table() function.
Finally, the pivot table is printed to the console to display the total sales data for each product, and the original DataFrame is also printed to the console to show the original data that the pivot table was generated from.

Output

On execution, you will get the following output on the terminal ?

Product  Amount
Banana    1091
Beans     626
Broccoli  239
Litchi    270
Mango     62
Orange    610 
  Product  Category  Quantity Amount
0  Litchi   Fruit      8       270
1  Broccoli Vegetable  5       239
2  Banana   Fruit      3       617
3  Banana   Fruit      4       384
4  Beans    Vegetable  5       626
5  Orange   Fruit      9       610
6  Mango    Fruit      11      62
7  Banana   Fruit       8      90

Conclusion

In conclusion, creating pivot tables in Python using the pandas library is a powerful way to analyze tabular data and extract meaningful insights. By grouping data and calculating aggregate values, pivot tables can help you identify patterns and trends in your data that might be difficult to see otherwise. With the flexibility and ease of use offered by pandas, creating pivot tables has never been easier.

By following the steps outlined in this tutorial, you should now have a solid foundation for creating and working with pivot tables in Python.

Mukul Latiyan

Updated on: 2023-04-20T14:32:37+05:30

4K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started