Python Pandas - Count the number of rows in each group
Pandas groupby() operations allow you to split data into groups and count rows in each group using size(). This is useful for analyzing data distribution and finding group frequencies.
Creating the DataFrame
First, let's create a sample DataFrame with product data:
import pandas as pd
# Create a DataFrame
dataFrame = pd.DataFrame({
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Quantity': [10, 50, 10, 20, 25, 50],
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone']
})
print("DataFrame:")
print(dataFrame)
DataFrame:
   Product Category  Quantity  Product Name
0          Computer        10      Keyboard
1      Mobile Phone        50       Charger
2       Electronics        10       SmartTV
3       Electronics        20        Camera
4          Computer        25  Graphic Card
5      Mobile Phone        50      Earphone
Grouping and Counting Rows
Use groupby() to group by one or more columns, then call size() to count the rows in each group:
import pandas as pd
dataFrame = pd.DataFrame({
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Quantity': [10, 50, 10, 20, 25, 50],
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone']
})
# Group by Product Category and Quantity
grouped = dataFrame.groupby(['Product Category', 'Quantity'])
# Count rows in each group
group_counts = grouped.size()
print("Row count in each group:")
print(group_counts)
Row count in each group:
Product Category  Quantity
Computer          10          1
                  25          1
Electronics       10          1
                  20          1
Mobile Phone      50          2
dtype: int64
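The result of size() is a Series indexed by the group keys. If a flat table is easier to work with, the keys can be turned back into ordinary columns. The following is a minimal sketch; the column name 'Row Count' is an illustrative choice, not something pandas requires.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Quantity': [10, 50, 10, 20, 25, 50],
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone']
})

# size() returns a Series indexed by the group keys.
# reset_index(name=...) moves the keys back into columns and
# names the column that holds the counts ('Row Count' is illustrative).
count_table = (
    dataFrame.groupby(['Product Category', 'Quantity'])
    .size()
    .reset_index(name='Row Count')
)

print(count_table)
```

Each row of count_table then holds one Product Category and Quantity combination together with its row count.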
Alternative Methods
Using count() vs size()
The count() method counts only the non-NaN values in each column, while size() counts every row in a group, including rows that contain NaN:
import pandas as pd
dataFrame = pd.DataFrame({
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Quantity': [10, 50, 10, 20, 25, 50],
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone']
})
# Using size() - counts all rows
print("Using size():")
print(dataFrame.groupby('Product Category').size())
print("\nUsing count() - excludes NaN values:")
print(dataFrame.groupby('Product Category').count())
Using size():
Product Category
Computer        2
Electronics     2
Mobile Phone    2
dtype: int64

Using count() - excludes NaN values:
                  Quantity  Product Name
Product Category
Computer                 2             2
Electronics              2             2
Mobile Phone             2             2
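The sample data contains no missing values, so size() and count() report the same numbers above. The sketch below shows where they diverge, assuming two Quantity values are replaced with NaN purely for illustration (this modified data is not part of the original example).

```python
import numpy as np
import pandas as pd

# Same products as before, but two Quantity values are deliberately
# set to NaN (an assumption made only for this illustration).
dataFrame = pd.DataFrame({
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Quantity': [10, np.nan, 10, 20, np.nan, 50],
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone']
})

grouped = dataFrame.groupby('Product Category')

sizes = grouped.size()    # every row in each group, NaN or not
counts = grouped.count()  # non-NaN values per column in each group

print("Using size():")
print(sizes)
print("\nUsing count():")
print(counts)
```

Here size() still reports 2 rows for every category, while count() reports 1 in the Quantity column for Computer and Mobile Phone because one value in each of those groups is missing. The Product Name column is unaffected.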
Comparison
| Method | Includes NaN | Returns | Best For |
|---|---|---|---|
| size() | Yes | Series with group size | Total row count per group |
| count() | No | DataFrame with non-null counts | Valid values per column |
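The "Includes NaN" column refers to missing values in the data columns. Missing values in the grouping column itself behave differently: by default, groupby() leaves rows with a NaN key out of every group, so size() does not count them either. The sketch below assumes pandas 1.1 or later, where groupby() accepts dropna=False, and uses illustrative data with one missing category.

```python
import numpy as np
import pandas as pd

# One Product Category is deliberately missing
# (illustrative data, not taken from the example above).
dataFrame = pd.DataFrame({
    'Product Category': ['Computer', np.nan, 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Quantity': [10, 50, 10, 20, 25, 50]
})

# Default behaviour: the row with a NaN key is not placed in any group
default_sizes = dataFrame.groupby('Product Category').size()

# dropna=False keeps a separate group for the NaN key
with_nan_sizes = dataFrame.groupby('Product Category', dropna=False).size()

print("Default (NaN keys dropped):")
print(default_sizes)
print("\nWith dropna=False:")
print(with_nan_sizes)
```

With the default settings the counts add up to 5 of the 6 rows. With dropna=False the missing category appears as its own group and all 6 rows are counted.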
Conclusion
Use groupby().size() to count the total number of rows in each group, including rows that contain NaN values. Use groupby().count() when you need the number of non-missing values in each column instead.
