Article Categories
- All Categories
-
Data Structure
-
Networking
-
RDBMS
-
Operating System
-
Java
-
MS Excel
-
iOS
-
HTML
-
CSS
-
Android
-
Python
-
C Programming
-
C++
-
C#
-
MongoDB
-
MySQL
-
Javascript
-
PHP
-
Economics & Finance
How to Create a Pivot Table in Python using Pandas?
A pivot table is a powerful data analysis tool that allows you to summarize and aggregate data based on different dimensions. In Python, you can create pivot tables using the pandas library, which provides flexible and efficient tools for data manipulation and analysis.
To create a pivot table in pandas, you first need to have a dataset in a pandas DataFrame. You can load data into a DataFrame from various sources such as CSV files, Excel spreadsheets, SQL databases, and more.
Syntax
Once you have your data in a DataFrame, you can use the pandas pivot_table() function to create a pivot table. Here is its syntax ?
dataframe.pivot_table(index=None, columns=None, values=None, aggfunc='mean')
Parameters
The pivot_table() function takes several parameters:
- index ? Column(s) to use as row index
- columns ? Column(s) to use as column headers
- values ? Column(s) to aggregate
- aggfunc ? Aggregation function(s) like sum, mean, max, min, etc.
Creating Sample Data
Before we create pivot tables, let's first create a DataFrame with sample data ?
import pandas as pd
# Creating a DataFrame with sales data
df = pd.DataFrame({
'Product': ['Litchi', 'Broccoli', 'Banana', 'Banana', 'Beans', 'Orange', 'Mango', 'Banana'],
'Category': ['Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Fruit'],
'Quantity': [8, 5, 3, 4, 5, 9, 11, 8],
'Amount': [270, 239, 617, 384, 626, 610, 62, 90]
})
print(df)
Product Category Quantity Amount
0 Litchi Fruit 8 270
1 Broccoli Vegetable 5 239
2 Banana Fruit 3 617
3 Banana Fruit 4 384
4 Beans Vegetable 5 626
5 Orange Fruit 9 610
6 Mango Fruit 11 62
7 Banana Fruit 8 90
Creating a Simple Pivot Table
Let's create a pivot table that shows total sales amount by product ?
import pandas as pd
df = pd.DataFrame({
'Product': ['Litchi', 'Broccoli', 'Banana', 'Banana', 'Beans', 'Orange', 'Mango', 'Banana'],
'Category': ['Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Fruit'],
'Quantity': [8, 5, 3, 4, 5, 9, 11, 8],
'Amount': [270, 239, 617, 384, 626, 610, 62, 90]
})
# Creating pivot table grouped by Product
pivot = df.pivot_table(index=['Product'], values=['Amount'], aggfunc='sum')
print(pivot)
Amount
Product
Banana 1091
Beans 626
Broccoli 239
Litchi 270
Mango 62
Orange 610
Pivot Table by Category
You can also group by category to see total sales per category ?
import pandas as pd
df = pd.DataFrame({
'Product': ['Litchi', 'Broccoli', 'Banana', 'Banana', 'Beans', 'Orange', 'Mango', 'Banana'],
'Category': ['Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Fruit'],
'Quantity': [8, 5, 3, 4, 5, 9, 11, 8],
'Amount': [270, 239, 617, 384, 626, 610, 62, 90]
})
# Pivot table by category with multiple aggregations
pivot = df.pivot_table(index=['Category'],
values=['Amount', 'Quantity'],
aggfunc={'Amount': 'sum', 'Quantity': 'mean'})
print(pivot)
Amount Quantity
Category
Fruit 2043 7.2
Vegetable 865 5.0
Multi-level Pivot Table
Create a more complex pivot table with both Product and Category as index ?
import pandas as pd
df = pd.DataFrame({
'Product': ['Litchi', 'Broccoli', 'Banana', 'Banana', 'Beans', 'Orange', 'Mango', 'Banana'],
'Category': ['Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Vegetable', 'Fruit', 'Fruit', 'Fruit'],
'Quantity': [8, 5, 3, 4, 5, 9, 11, 8],
'Amount': [270, 239, 617, 384, 626, 610, 62, 90]
})
# Multi-level pivot table
pivot = df.pivot_table(index=['Category', 'Product'],
values=['Amount'],
aggfunc='sum')
print(pivot)
Amount
Category Product
Fruit Banana 1091
Litchi 270
Mango 62
Orange 610
Vegetable Beans 626
Broccoli 239
Conclusion
Pivot tables in pandas provide a powerful way to summarize and analyze data by grouping and aggregating values. Use pivot_table() with different parameters to create customized summaries that reveal patterns and insights in your data.
