When do we use crosstab and pivot_table in Python Pandas?

In this article, we will explore when to use crosstab() and pivot_table() functions in Python Pandas. Both functions create summary tables, but they have different use cases and advantages.

When to Use crosstab vs pivot_table

The key difference lies in the input data format:

  • pivot_table() - Use when you already have a DataFrame and want to reorganize it. You pass column names as strings to specify index, columns, and values.

  • crosstab() - Use when you have separate array-like objects (lists, NumPy arrays, Series) that you want to cross-tabulate. No existing DataFrame required.

In general, use pivot_table() if you already have a DataFrame to avoid creating duplicate data structures. Use crosstab() when starting with separate arrays and focusing only on frequency analysis.

pandas.crosstab() Function

The crosstab() function computes a cross-tabulation of two or more factors. By default, it calculates frequency tables unless you specify values and an aggregation function.

Syntax

pandas.crosstab(index, columns, values=None, rownames=None, colnames=None,
   aggfunc=None, margins=False, margins_name='All', dropna=True, normalize=False)

Parameters

Parameter Description Default Value
index Array-like, Series, or values to group by in rows Required
columns Array-like, Series, or values to group by in columns Required
values Array of values to aggregate None
aggfunc Function to aggregate data (requires 'values') count
margins Add row/column totals False
normalize Normalize by dividing by sum of values False

Advantages of crosstab()

  • Can normalize output to show percentages of row or column totals

  • No DataFrame required as input - accepts array-like objects directly

  • Default aggregation is count, making it ideal for frequency analysis

Creating a Crosstab Example

Let's create a crosstab from separate NumPy arrays ?

import pandas as pd
import numpy as np

# Creating numpy arrays
animals = np.array(["cat", "cat", "cat", "cat", "rat", "rat", "rat", "rat", "cat", "cat", "cat"])
objects = np.array(["pin", "pin", "pin", "bat", "pin", "pin", "pin", "bat", "bat", "bat", "pin"])
items = np.array(["book", "book", "car", "book", "book", "car", "car", "book", "car", "car", "car"])

# Creating cross-tabulation
result = pd.crosstab(animals, [objects, items], 
                    rownames=['Animals'], 
                    colnames=['Objects', 'Items'])
print(result)
Objects  bat       pin
Items   book car book car
Animals
cat        1   2    2   2
rat        1   0    1   2

pandas.pivot_table() Function

The pivot_table() function creates a spreadsheet-style pivot table from a DataFrame. It's more suitable when you already have structured data in DataFrame format.

Syntax

pandas.pivot_table(data, values=None, index=None, columns=None, 
                  aggfunc='mean', fill_value=None, margins=False, 
                  dropna=True, margins_name='All')

Creating a DataFrame and Pivot Table

Let's create a DataFrame and then generate a pivot table from it ?

import pandas as pd

# Creating a DataFrame
df = pd.DataFrame({
    'Name': ['Virat', 'Rohit', 'Meera', 'Nick', 'Sana'],
    'Department': ['IT', 'Finance', 'IT', 'IT', 'Finance'], 
    'Age': [25, 30, 28, 35, 40],
    'Salary': [50000, 55000, 48000, 60000, 65000]
})

print("Original DataFrame:")
print(df)
Original DataFrame:
    Name Department  Age  Salary
0  Virat         IT   25   50000
1  Rohit    Finance   30   55000
2  Meera         IT   28   48000
3   Nick         IT   35   60000
4   Sana    Finance   40   65000

Now let's create a pivot table to analyze average salary by department ?

import pandas as pd

df = pd.DataFrame({
    'Name': ['Virat', 'Rohit', 'Meera', 'Nick', 'Sana'],
    'Department': ['IT', 'Finance', 'IT', 'IT', 'Finance'], 
    'Age': [25, 30, 28, 35, 40],
    'Salary': [50000, 55000, 48000, 60000, 65000]
})

# Creating pivot table
pivot_result = pd.pivot_table(df, values='Salary', index='Department', aggfunc='mean')
print(pivot_result)
            Salary
Department        
Finance      60000
IT           52667

Comparison

Feature crosstab() pivot_table()
Input Format Array-like objects DataFrame required
Default Aggregation Count (frequency) Mean
Normalization Yes (built-in) No
Best Use Case Frequency analysis Existing DataFrame restructuring

Conclusion

Use crosstab() when working with separate arrays and need frequency analysis or normalization. Use pivot_table() when restructuring existing DataFrames with custom aggregations like mean, sum, or count.

Updated on: 2026-03-26T22:35:16+05:30

3K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements