How to Count Grouped Values in a Pandas DataFrame
To perform groupby value counts in Pandas, use the groupby(), size(), and unstack() methods. This technique helps you count occurrences of grouped data and reshape the results into a cross-tabulation format.
Creating the DataFrame
First, let's create a sample DataFrame with product information:
import pandas as pd
# Create a DataFrame with 3 columns
dataFrame = pd.DataFrame({
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone'],
    'Quantity': [10, 50, 10, 20, 25, 50]
})
print("Original DataFrame:")
print(dataFrame)
Original DataFrame:
  Product Category  Product Name  Quantity
0         Computer      Keyboard        10
1     Mobile Phone       Charger        50
2      Electronics       SmartTV        10
3      Electronics        Camera        20
4         Computer  Graphic Card        25
5     Mobile Phone      Earphone        50
Using groupby() with size() and unstack()
The size() method counts the number of rows in each group, while unstack() reshapes the data by moving one index level to columns:
import pandas as pd
# Create the DataFrame
dataFrame = pd.DataFrame({
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone'],
    'Quantity': [10, 50, 10, 20, 25, 50]
})
# Group by all columns and count occurrences
result = dataFrame.groupby(['Product Category', 'Product Name', 'Quantity']).size().unstack(fill_value=0)
print("Grouped and unstacked DataFrame:")
print(result)
Grouped and unstacked DataFrame:
Quantity                       10  20  25  50
Product Category Product Name
Computer         Graphic Card   0   0   1   0
                 Keyboard       1   0   0   0
Electronics      Camera         0   1   0   0
                 SmartTV        1   0   0   0
Mobile Phone     Charger        0   0   0   1
                 Earphone       0   0   0   1
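To see what each method contributes, it can help to run the chain one step at a time. The sketch below reuses the same sample data but, as an assumption made for illustration, groups by only two columns ('Product Category' and 'Quantity') so that repeated combinations produce counts greater than 1. The variable names counts and table are illustrative only.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone'],
    'Quantity': [10, 50, 10, 20, 25, 50]
})

# Step 1: size() returns a Series whose index holds (category, quantity) pairs
counts = dataFrame.groupby(['Product Category', 'Quantity']).size()
print("Counts per (category, quantity) pair:")
print(counts)

# Step 2: unstack() moves the innermost index level (Quantity) into columns
table = counts.unstack(fill_value=0)
print("Cross-tabulation:")
print(table)
```

Because both Mobile Phone rows have a quantity of 50, that cell of the cross-tabulation holds 2, while combinations that never occur are filled with 0.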
Alternative Approach Using value_counts()
To count how often each value appears in a single column, you can call value_counts() directly on that column, with no groupby() needed:
import pandas as pd
dataFrame = pd.DataFrame({
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone'],
    'Quantity': [10, 50, 10, 20, 25, 50]
})
# Count occurrences of each category
category_counts = dataFrame['Product Category'].value_counts()
print("Category counts:")
print(category_counts)
Category counts:
Product Category
Computer        2
Electronics     2
Mobile Phone    2
Name: count, dtype: int64
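value_counts() can also be called on a grouped column, which combines the two ideas: the counts are computed separately within each group. The following is a minimal sketch on the same sample data; the choice of 'Quantity' as the counted column and the names per_category and table are assumptions made for illustration.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone'],
    'Quantity': [10, 50, 10, 20, 25, 50]
})

# Count how often each Quantity value occurs within each Product Category
per_category = dataFrame.groupby('Product Category')['Quantity'].value_counts()
print("Quantity counts per category:")
print(per_category)

# Optionally reshape the result so each Quantity value becomes a column
table = per_category.unstack(fill_value=0)
print("Reshaped:")
print(table)
```

The grouped result is a Series indexed by category and quantity, so it can be passed to unstack() in the same way as the output of size().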
Key Methods Comparison
| Method | Purpose | Output Format |
|---|---|---|
| size() | Count rows in each group | Series |
| unstack() | Move an index level to columns | DataFrame (cross-tabulation) |
| value_counts() | Count occurrences of each unique value | Series |
Conclusion
Use groupby().size().unstack() to create cross-tabulation views of grouped data counts. The fill_value=0 parameter ensures missing combinations show as zero rather than NaN.
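The effect of fill_value can be checked by unstacking the same counts with and without it. This is a small sketch on the sample data, grouped by 'Product Category' and 'Quantity' as an illustrative assumption; the names with_nan and with_zero are hypothetical.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone'],
    'Quantity': [10, 50, 10, 20, 25, 50]
})

counts = dataFrame.groupby(['Product Category', 'Quantity']).size()

# Without fill_value, combinations that never occur become NaN
with_nan = counts.unstack()
print("Without fill_value:")
print(with_nan)

# With fill_value=0, the same cells are reported as zero counts
with_zero = counts.unstack(fill_value=0)
print("With fill_value=0:")
print(with_zero)
```

Filling with zero is usually the more natural choice for counts, since a combination that never occurs has a count of 0 rather than an unknown value.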
