Python - Count distinct in Pandas Aggregation with Numpy

To count distinct values in Pandas aggregation, use nunique() method. This tutorial shows how to group by a column, calculate sums using NumPy, and count distinct values simultaneously.

Required Libraries

Import the necessary libraries for data manipulation ?

import pandas as pd
import numpy as np

Creating Sample DataFrame

Create a DataFrame with duplicate values to demonstrate distinct counting ?

import pandas as pd
import numpy as np

dataFrame = pd.DataFrame({
    "Car": ['BMW', 'Audi', 'BMW', 'Lexus', 'Lexus'],
    "Place": ['Delhi', 'Bangalore', 'Delhi', 'Chandigarh', 'Chandigarh'],
    "Units": [100, 150, 50, 110, 90]
})

print("Original DataFrame:")
print(dataFrame)
Original DataFrame:
     Car       Place  Units
0    BMW       Delhi    100
1   Audi   Bangalore    150
2    BMW       Delhi     50
3  Lexus  Chandigarh    110
4  Lexus  Chandigarh     90

Count Distinct with Aggregation

Use groupby() with agg() to apply multiple aggregation functions. Here we sum Units and count distinct Places ?

import pandas as pd
import numpy as np

dataFrame = pd.DataFrame({
    "Car": ['BMW', 'Audi', 'BMW', 'Lexus', 'Lexus'],
    "Place": ['Delhi', 'Bangalore', 'Delhi', 'Chandigarh', 'Chandigarh'],
    "Units": [100, 150, 50, 110, 90]
})

# Group by Car and apply different aggregations
result = dataFrame.groupby("Car").agg({
    "Units": np.sum,           # Sum of units
    "Place": pd.Series.nunique # Count distinct places
})

print("Aggregated DataFrame:")
print(result)
Aggregated DataFrame:
       Units  Place
Car                
Audi     150      1
BMW      150      1
Lexus    200      1

Alternative Approach Using nunique()

You can also use the direct nunique() method for cleaner syntax ?

import pandas as pd
import numpy as np

dataFrame = pd.DataFrame({
    "Car": ['BMW', 'Audi', 'BMW', 'Lexus', 'Toyota', 'Toyota'],
    "Place": ['Delhi', 'Bangalore', 'Mumbai', 'Chandigarh', 'Delhi', 'Mumbai'],
    "Units": [100, 150, 50, 110, 90, 80]
})

# Alternative syntax using nunique directly
result = dataFrame.groupby("Car").agg({
    "Units": "sum",
    "Place": "nunique"
})

print("Result with multiple distinct places:")
print(result)
Result with multiple distinct places:
        Units  Place
Car                 
Audi      150      1
BMW       150      2
Lexus     110      1
Toyota    170      2

Key Points

Method Syntax Use Case
pd.Series.nunique {"col": pd.Series.nunique} Explicit method reference
"nunique" {"col": "nunique"} String shorthand (cleaner)
np.sum {"col": np.sum} NumPy sum function

Conclusion

Use nunique() in Pandas aggregation to count distinct values efficiently. Combine it with other functions like np.sum() to perform multiple aggregations in a single operation.

Updated on: 2026-03-26T02:16:52+05:30

816 Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements