Sort tuple based on occurrence of first element in Python

When you need to sort tuples based on the occurrence frequency of their first element, Python provides several approaches. This technique is useful for grouping and counting elements in data analysis tasks.

Understanding the Problem

Given a list of tuples, we want to group them by the first element and count how many times each first element appears. Each tuple in the result holds the first element, the values associated with it, and the occurrence count as its last item.

Using Dictionary with setdefault()

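The setdefault() method returns the value stored for a key, inserting a default first if the key doesn't exist, which makes it well suited to grouping. In the function below, dict.fromkeys() removes duplicate values while preserving their order, and len() counts every occurrence: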

def sort_on_occurrence(my_list):
    my_dict = {}
    for i, j in my_list:
        my_dict.setdefault(i, []).append(j)
    return [(i, *dict.fromkeys(j), len(j)) for i, j in my_dict.items()]

my_list = [(1, 'Harold'), (12, 'Jane'), (4, 'Paul'), (7, 'Will'), (1, 'Bob'), (4, 'Alice')]
print("The list of tuples is:")
print(my_list)
print("The list after sorting by occurrence is:")
result = sort_on_occurrence(my_list)
print(result)

The list of tuples is:
[(1, 'Harold'), (12, 'Jane'), (4, 'Paul'), (7, 'Will'), (1, 'Bob'), (4, 'Alice')]
The list after sorting by occurrence is:
[(1, 'Harold', 'Bob', 2), (12, 'Jane', 1), (4, 'Paul', 'Alice', 2), (7, 'Will', 1)]
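
One detail of the function above is easy to miss: dict.fromkeys() drops repeated values, while len() still counts every occurrence. The following is a minimal sketch of that behavior; the sample data is made up for illustration and is not part of the original example.

```python
def sort_on_occurrence(my_list):
    my_dict = {}
    for i, j in my_list:
        my_dict.setdefault(i, []).append(j)
    # dict.fromkeys(j) keeps each value once, in first-seen order;
    # len(j) counts every occurrence, duplicates included.
    return [(i, *dict.fromkeys(j), len(j)) for i, j in my_dict.items()]

# Hypothetical data: 'Bob' appears twice under key 1
sample = [(1, 'Bob'), (2, 'Jane'), (1, 'Bob'), (1, 'Amy')]
grouped = sort_on_occurrence(sample)
print(grouped)
# [(1, 'Bob', 'Amy', 3), (2, 'Jane', 1)]
```

Key 1 lists only two distinct names, yet its count is 3 because the count is taken before duplicates are removed.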

Using defaultdict for Grouping

The defaultdict class from the collections module creates the empty list for a missing key automatically, so the grouping loop needs no setdefault() call:

from collections import defaultdict

def count_first_elements(tuple_list):
    # Group by first element
    groups = defaultdict(list)
    for first, second in tuple_list:
        groups[first].append(second)
    
    # Create result with counts
    result = []
    for key, values in groups.items():
        result.append((key, *values, len(values)))
    
    return result

my_list = [(3, 'A'), (1, 'B'), (3, 'C'), (2, 'D'), (1, 'E'), (3, 'F')]
print("Original list:")
print(my_list)
print("Grouped with occurrence count:")
result = count_first_elements(my_list)
print(result)

Original list:
[(3, 'A'), (1, 'B'), (3, 'C'), (2, 'D'), (1, 'E'), (3, 'F')]
Grouped with occurrence count:
[(3, 'A', 'C', 'F', 3), (1, 'B', 'E', 2), (2, 'D', 1)]

Sorting by Occurrence Frequency

You can also sort the grouped results by occurrence count in descending order, so the most frequent first element comes first:

def sort_by_frequency(tuple_list):
    groups = {}
    for first, second in tuple_list:
        groups.setdefault(first, []).append(second)
    
    # Create tuples with count and sort by count (descending)
    result = [(key, *values, len(values)) for key, values in groups.items()]
    return sorted(result, key=lambda x: x[-1], reverse=True)

my_list = [(5, 'X'), (3, 'Y'), (5, 'Z'), (3, 'W'), (5, 'V'), (1, 'U')]
print("Original list:")
print(my_list)
print("Sorted by occurrence frequency:")
result = sort_by_frequency(my_list)
print(result)

Original list:
[(5, 'X'), (3, 'Y'), (5, 'Z'), (3, 'W'), (5, 'V'), (1, 'U')]
Sorted by occurrence frequency:
[(5, 'X', 'Z', 'V', 3), (3, 'Y', 'W', 2), (1, 'U', 1)]
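
Because sorted() is stable, groups with the same count stay in the order in which their first element first appeared. If a deterministic tie-break is preferred, one option is a compound sort key. The sketch below assumes ties should be resolved by the first element in ascending order; the function name sort_by_frequency_then_key and the sample data are hypothetical.

```python
def sort_by_frequency_then_key(tuple_list):
    groups = {}
    for first, second in tuple_list:
        groups.setdefault(first, []).append(second)

    result = [(key, *values, len(values)) for key, values in groups.items()]
    # Negating the count sorts it in descending order, while the
    # first element (x[0]) breaks ties in ascending order.
    return sorted(result, key=lambda x: (-x[-1], x[0]))

# Hypothetical data: keys 7 and 2 both occur twice
sample = [(7, 'A'), (2, 'B'), (7, 'C'), (2, 'D'), (9, 'E')]
ordered = sort_by_frequency_then_key(sample)
print(ordered)
# [(2, 'B', 'D', 2), (7, 'A', 'C', 2), (9, 'E', 1)]
```

Negating the count only works for numeric sort fields; for other tie-break rules, two successive stable sorts are an alternative.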

Comparison of Methods

Method	Time Complexity	Best For
`setdefault()`	O(n)	Simple grouping
`defaultdict()`	O(n)	Cleaner syntax
`Counter`	O(n)	When you only need counts
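
When only the counts are needed, Counter can tally the first elements directly. The sketch below reuses the sample list from the grouping example and, as one possible use of those counts, reorders the original tuples by how often their first element occurs. The variable names are illustrative.

```python
from collections import Counter

my_list = [(3, 'A'), (1, 'B'), (3, 'C'), (2, 'D'), (1, 'E'), (3, 'F')]

# Count how often each first element appears
counts = Counter(first for first, _ in my_list)
print(counts.most_common())
# [(3, 3), (1, 2), (2, 1)]

# Reorder the original tuples by the frequency of their first element;
# sorted() is stable, so tuples with equal counts keep their input order
by_frequency = sorted(my_list, key=lambda t: counts[t[0]], reverse=True)
print(by_frequency)
# [(3, 'A'), (3, 'C'), (3, 'F'), (1, 'B'), (1, 'E'), (2, 'D')]
```

This keeps the tuples themselves intact rather than merging them into groups, which may be closer to what is wanted when the goal is ordering rather than grouping.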

Conclusion

Use setdefault() for basic grouping or defaultdict(list) for cleaner code. The dict.fromkeys() method removes duplicate values while preserving order, although the reported count still reflects every occurrence. Sort the final result by occurrence count for frequency-based analysis.

AmitDiwan

Updated on: 2026-03-25T17:41:47+05:30
