How to Count Unique Values in a Pandas Groupby Object?


In data analysis, it's often necessary to count the number of unique values in a pandas Groupby object. A Groupby object is created by calling groupby() on a DataFrame; it groups the rows by the values of one or more columns so that aggregate functions can be applied to each group. Counting the unique values in each group shows how varied the data within that group is.

To count unique values in a pandas Groupby object, we can use the nunique() method. This method returns the number of unique values in each group of the Groupby object. Applied to a single column of the Groupby object, it returns a Series; applied to the entire object, it returns a DataFrame with one count per column.
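As a minimal sketch of the difference between the two calls, the snippet below builds a small DataFrame in memory. The data and the column names 'A', 'B', and 'C' are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration
df = pd.DataFrame({
    "A": [1, 1, 1, 2, 2, 3],
    "B": ["x", "y", "x", "z", "z", "w"],
    "C": [10, 10, 20, 30, 30, 40],
})

# nunique() on one column of the grouped object returns a Series
per_column = df.groupby("A")["B"].nunique()

# nunique() on the whole grouped object returns a DataFrame
per_frame = df.groupby("A").nunique()

print(per_column)
print(per_frame)
```

Here group 1 contains the values 'x' and 'y' in column 'B', so its count is 2, while groups 2 and 3 each contain a single distinct value.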

In addition to the nunique() method, we can also use the agg() method to count unique values in a pandas Groupby object. The agg() method allows us to apply multiple aggregation functions to a Groupby object at once, including nunique().

Let's look at each of these approaches with an example.

Using the nunique() method

The simplest way to count unique values in a pandas Groupby object is to use the nunique() method. This method returns the number of unique values in each group of the Groupby object.

Consider the code shown below.

Example

import pandas as pd

# Load sample data
df = pd.read_csv('data.csv')

# Group data by column 'A' and count unique values in column 'B'
unique_count = df.groupby('A')['B'].nunique()

# Print the result
print(unique_count)

Explanation

In this example, we load a sample dataset and group the data by column 'A'. We then count the number of unique values in column 'B' for each group using the nunique() method. The result is a pandas Series object that shows the number of unique values in column 'B' for each group.

Output

A
1    2
2    1
3    3
Name: B, dtype: int64
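The contents of data.csv are not shown here. As a sketch, the hypothetical DataFrame below stands in for it; its values are invented so that the counts match the output above. The snippet also shows one way to turn the resulting Series into a DataFrame with a named count column (the name 'unique_B' is an arbitrary choice).

```python
import pandas as pd

# Hypothetical stand-in for data.csv; values are invented for illustration
df = pd.DataFrame({
    "A": [1, 1, 2, 2, 3, 3, 3],
    "B": ["a", "b", "c", "c", "d", "e", "f"],
})

# Count unique values of 'B' within each group of 'A'
unique_count = df.groupby("A")["B"].nunique()
print(unique_count)

# Convert the Series to a DataFrame with columns 'A' and 'unique_B'
unique_count_df = unique_count.reset_index(name="unique_B")
print(unique_count_df)
```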

Using the agg() method

We can also use the agg() method to count unique values in a pandas Groupby object. This method allows us to apply multiple aggregation functions, including nunique(), to a Groupby object.

Consider the code shown below.

Example

import pandas as pd

# Load sample data
df = pd.read_csv('data.csv')

# Group data by columns 'A' and 'C', and count unique values in column 'B'
unique_count = df.groupby(['A', 'C']).agg({'B': 'nunique'})

# Print the result
print(unique_count)

Explanation

In this example, we group the data by columns 'A' and 'C' and count the number of unique values in column 'B' within each group. The grouping columns are given to groupby(); the dictionary passed to agg() maps each column to aggregate ('B') to the aggregation function to apply to it ('nunique').

The result is a pandas DataFrame object that shows the number of unique values in column 'B' for each combination of values in columns 'A' and 'C'.

Output

     B
A C
1 X  1
  Y  1
2 X  1
3 X  2
  Y  1
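Since agg() can apply several aggregation functions at once, the unique count can be computed alongside other statistics. The sketch below uses named aggregation on a hypothetical DataFrame; the data and the output column names 'unique_b' and 'non_null_b' are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration
df = pd.DataFrame({
    "A": [1, 1, 2, 2, 3, 3, 3],
    "B": ["a", "b", "c", "c", "d", "e", "f"],
    "C": ["X", "Y", "X", "X", "X", "X", "Y"],
})

# Named aggregation: each keyword is an output column,
# each value is a (source column, aggregation function) pair
summary = df.groupby(["A", "C"]).agg(
    unique_b=("B", "nunique"),   # number of distinct values of 'B'
    non_null_b=("B", "count"),   # number of non-null values of 'B'
)

print(summary)
```

Comparing the two columns shows where a group contains repeated values: for the group (2, 'X'), 'non_null_b' is 2 but 'unique_b' is 1.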

Using the unique() method and len() function

Another approach to counting unique values in a pandas Groupby object is to use the unique() method to extract unique values and the len() function to count them.

Consider the code shown below.

Example

import pandas as pd

# Load sample data
df = pd.read_csv('data.csv')

# Group data by column 'A' and extract unique values in column 'B'
unique_values = df.groupby('A')['B'].unique()

# Count the number of unique values in each group
unique_count = unique_values.apply(len)

# Print the result
print(unique_count)

Explanation

In this example, we group the data by column 'A' and extract the unique values in column 'B' using the unique() method. We then count the number of unique values in each group using the len() function and the apply() method. The result is a pandas Series object that shows the number of unique values in column 'B' for each group.

Output

A
1    2
2    1
3    3
Name: B, dtype: int64
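The two approaches agree in the output above, but they can differ when the column contains missing values: nunique() skips NaN by default, while unique() keeps it in the returned array. The sketch below illustrates this with hypothetical data invented for the purpose.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data containing missing values
df = pd.DataFrame({
    "A": [1, 1, 1, 2, 2],
    "B": ["x", np.nan, "x", "y", np.nan],
})

grouped = df.groupby("A")["B"]

# NaN is ignored by default
excluding_nan = grouped.nunique()

# NaN is counted as a distinct value when dropna=False
including_nan = grouped.nunique(dropna=False)

# unique() keeps NaN, so len() counts it as well
via_unique = grouped.unique().apply(len)

print(excluding_nan.tolist())  # [1, 1]
print(including_nan.tolist())  # [2, 2]
print(via_unique.tolist())     # [2, 2]
```

If missing values should not be counted, nunique() with its default settings is the safer choice; passing dropna=False makes it match the unique() and len() result.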

Conclusion

In conclusion, counting unique values in a pandas Groupby object is a common task in data analysis and can be achieved using different approaches.

The nunique() method is a simple way to count unique values in a Groupby object, while the agg() method allows us to apply multiple aggregation functions, including nunique(), to a Groupby object.

Another approach is to use the unique() method to extract the unique values and the len() function to count them. This is mainly useful when the unique values themselves are needed as well as their count.

By understanding these different approaches, we can efficiently count unique values in a pandas Groupby object and gain valuable insights into our data.

Updated on: 03-Aug-2023
