Python Pandas – Count the Observations
In Pandas, you can count the observations (rows) within groups using the groupby() method combined with count(). This is useful for analyzing the frequency of categories in your data.
Creating a Sample DataFrame
Let's start by creating a DataFrame with product information:
import pandas as pd
# Create a DataFrame with product data
dataFrame = pd.DataFrame({
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone'],
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Quantity': [10, 50, 10, 20, 25, 50]
})
print("Original DataFrame:")
print(dataFrame)
Original DataFrame:
   Product Name Product Category  Quantity
0      Keyboard         Computer        10
1       Charger     Mobile Phone        50
2       SmartTV      Electronics        10
3        Camera      Electronics        20
4  Graphic Card         Computer        25
5      Earphone     Mobile Phone        50
Counting Observations by Group
Use groupby() to group data by category, then count() to get observation counts:
import pandas as pd
# Create the DataFrame
dataFrame = pd.DataFrame({
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone'],
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Quantity': [10, 50, 10, 20, 25, 50]
})
# Group by Product Category and count observations
group = dataFrame.groupby("Product Category")
result = group.count()
print("Count of observations by Product Category:")
print(result)
Count of observations by Product Category:
                  Product Name  Quantity
Product Category
Computer                     2         2
Electronics                  2         2
Mobile Phone                 2         2
Understanding the Output
The result shows that each product category has 2 observations (rows). The Product Name and Quantity columns show the same count because count() tallies non-null values in each column separately, and this DataFrame has no missing values.
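To see why the per-column counts can differ, here is a minimal sketch that assumes one Quantity value is missing. The df_missing name and the missing 'Camera' quantity are hypothetical and not part of the original data:

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the product data: the Quantity for 'Camera' is missing
df_missing = pd.DataFrame({
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone'],
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Quantity': [10, 50, 10, np.nan, 25, 50]
})

# count() tallies non-null values per column, so the two columns can now disagree
counts = df_missing.groupby("Product Category").count()
print(counts)
```

With this data, the Electronics group should report 2 for Product Name but only 1 for Quantity, because the missing value is skipped.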
Getting Count for Specific Columns
You can also count observations for specific columns only:
import pandas as pd
dataFrame = pd.DataFrame({
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone'],
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Quantity': [10, 50, 10, 20, 25, 50]
})
# Count only Product Name column
group = dataFrame.groupby("Product Category")
result = group['Product Name'].count()
print("Count of Product Names by Category:")
print(result)
Count of Product Names by Category:
Product Category
Computer        2
Electronics     2
Mobile Phone    2
Name: Product Name, dtype: int64
Alternative: Using size()
The size() method counts all rows in each group, including rows that contain null values, and returns one number per group rather than one per column:
import pandas as pd
dataFrame = pd.DataFrame({
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone'],
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Quantity': [10, 50, 10, 20, 25, 50]
})
# Using size() to count all rows per group
group_size = dataFrame.groupby("Product Category").size()
print("Group sizes using size():")
print(group_size)
Group sizes using size():
Product Category
Computer        2
Electronics     2
Mobile Phone    2
dtype: int64
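Because the sample data has no nulls, size() and count() return the same numbers here. The sketch below puts them side by side on a hypothetical copy of the data with one missing Quantity. The df_missing and comparison names and the column labels are illustrative assumptions:

```python
import numpy as np
import pandas as pd

# Hypothetical data: same products, but the Quantity for 'Camera' is missing
df_missing = pd.DataFrame({
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Camera', 'Graphic Card', 'Earphone'],
    'Product Category': ['Computer', 'Mobile Phone', 'Electronics', 'Electronics', 'Computer', 'Mobile Phone'],
    'Quantity': [10, 50, 10, np.nan, 25, 50]
})

grouped = df_missing.groupby("Product Category")

# size() counts every row in the group; count() skips null Quantity values
comparison = pd.DataFrame({
    'rows (size)': grouped.size(),
    'non-null Quantity (count)': grouped['Quantity'].count()
})
print(comparison)
```

In this sketch, the Electronics group should show 2 rows from size() but 1 from count(), which is the practical difference between the two methods.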
Conclusion
Use groupby().count() to count the non-null observations per column in each group. To count all rows, including those with null values, use size() instead. Both give a quick view of how observations are distributed across categories.
