Article Categories

Selected Reading

When do we use crosstab and pivot_table in Python Pandas?

Python Server Side Programming Programming

In this article, we will explore when to use crosstab() and pivot_table() functions in Python Pandas. Both functions create summary tables, but they have different use cases and advantages.

When to Use crosstab vs pivot_table

The key difference lies in the input data format:

pivot_table() - Use when you already have a DataFrame and want to reorganize it. You pass column names as strings to specify index, columns, and values.
crosstab() - Use when you have separate array-like objects (lists, NumPy arrays, Series) that you want to cross-tabulate. No existing DataFrame required.

In general, use pivot_table() if you already have a DataFrame to avoid creating duplicate data structures. Use crosstab() when starting with separate arrays and focusing only on frequency analysis.

pandas.crosstab() Function

The crosstab() function computes a cross-tabulation of two or more factors. By default, it calculates frequency tables unless you specify values and an aggregation function.

Syntax

pandas.crosstab(index, columns, values=None, rownames=None, colnames=None,
   aggfunc=None, margins=False, margins_name='All', dropna=True, normalize=False)

Parameters

Parameter	Description	Default Value
index	Array-like, Series, or values to group by in rows	Required
columns	Array-like, Series, or values to group by in columns	Required
values	Array of values to aggregate	None
aggfunc	Function to aggregate data (requires 'values')	count
margins	Add row/column totals	False
normalize	Normalize by dividing by sum of values	False

Advantages of crosstab()

Can normalize output to show percentages of row or column totals
No DataFrame required as input - accepts array-like objects directly
Default aggregation is count, making it ideal for frequency analysis

Creating a Crosstab Example

Let's create a crosstab from separate NumPy arrays ?

import pandas as pd
import numpy as np

# Creating numpy arrays
animals = np.array(["cat", "cat", "cat", "cat", "rat", "rat", "rat", "rat", "cat", "cat", "cat"])
objects = np.array(["pin", "pin", "pin", "bat", "pin", "pin", "pin", "bat", "bat", "bat", "pin"])
items = np.array(["book", "book", "car", "book", "book", "car", "car", "book", "car", "car", "car"])

# Creating cross-tabulation
result = pd.crosstab(animals, [objects, items], 
                    rownames=['Animals'], 
                    colnames=['Objects', 'Items'])
print(result)

Objects  bat       pin
Items   book car book car
Animals
cat        1   2    2   2
rat        1   0    1   2

pandas.pivot_table() Function

The pivot_table() function creates a spreadsheet-style pivot table from a DataFrame. It's more suitable when you already have structured data in DataFrame format.

Syntax

pandas.pivot_table(data, values=None, index=None, columns=None, 
                  aggfunc='mean', fill_value=None, margins=False, 
                  dropna=True, margins_name='All')

Creating a DataFrame and Pivot Table

Let's create a DataFrame and then generate a pivot table from it ?

import pandas as pd

# Creating a DataFrame
df = pd.DataFrame({
    'Name': ['Virat', 'Rohit', 'Meera', 'Nick', 'Sana'],
    'Department': ['IT', 'Finance', 'IT', 'IT', 'Finance'], 
    'Age': [25, 30, 28, 35, 40],
    'Salary': [50000, 55000, 48000, 60000, 65000]
})

print("Original DataFrame:")
print(df)

Original DataFrame:
    Name Department  Age  Salary
0  Virat         IT   25   50000
1  Rohit    Finance   30   55000
2  Meera         IT   28   48000
3   Nick         IT   35   60000
4   Sana    Finance   40   65000

Now let's create a pivot table to analyze average salary by department ?

import pandas as pd

df = pd.DataFrame({
    'Name': ['Virat', 'Rohit', 'Meera', 'Nick', 'Sana'],
    'Department': ['IT', 'Finance', 'IT', 'IT', 'Finance'], 
    'Age': [25, 30, 28, 35, 40],
    'Salary': [50000, 55000, 48000, 60000, 65000]
})

# Creating pivot table
pivot_result = pd.pivot_table(df, values='Salary', index='Department', aggfunc='mean')
print(pivot_result)

            Salary
Department        
Finance      60000
IT           52667

Comparison

Feature	crosstab()	pivot_table()
Input Format	Array-like objects	DataFrame required
Default Aggregation	Count (frequency)	Mean
Normalization	Yes (built-in)	No
Best Use Case	Frequency analysis	Existing DataFrame restructuring

Conclusion

Use crosstab() when working with separate arrays and need frequency analysis or normalization. Use pivot_table() when restructuring existing DataFrames with custom aggregations like mean, sum, or count.

Vikram Chiluka

Updated on: 2026-03-26T22:35:16+05:30

3K+ Views

Previous Next

Article Categories

When do we use crosstab and pivot_table in Python Pandas?

When to Use crosstab vs pivot_table

pandas.crosstab() Function

Syntax

Parameters

Advantages of crosstab()

Creating a Crosstab Example

pandas.pivot_table() Function

Syntax

Creating a DataFrame and Pivot Table

Comparison

Conclusion

Learn More in Our Tutorials