Group Records by Kth Column in a List using Python

In Python, records stored as a list of lists can be grouped by their Kth column, that is, by the element at index k-1 of each record. Common ways to do this are itertools.groupby(), a plain dictionary, and the pandas library. Grouping collects all records that share a value, which makes it easier to count, summarize, or process them per group. In this article, we will explore these methods with practical examples.

Using itertools.groupby() Function

The itertools.groupby() function groups consecutive elements that share the same key. It does not sort its input, so this method first sorts the records by the Kth column and then groups them.

Syntax

itertools.groupby(iterable, key=None)

Parameters:

  • iterable: The input sequence or collection of elements to group

  • key: Optional function that specifies the grouping criterion. If None, elements themselves are used as keys
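As a quick illustration of the two parameters, the sketch below calls groupby() once without a key and once with a key function. The sample values and variable names are made up for this illustration.

```python
import itertools

# With key=None, each element is its own key, so runs of equal values are grouped
values = [1, 1, 2, 2, 2, 3]
runs = [(key, list(group)) for key, group in itertools.groupby(values)]
print(runs)

# With a key function, elements are grouped by the value the function returns
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']
by_letter = [
    (key, list(group))
    for key, group in itertools.groupby(words, key=lambda w: w[0])
]
print(by_letter)
```

Each group is an iterator that is consumed as groupby() advances, which is why the sketch converts it to a list immediately.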

Example

Here's how to group records by the 2nd column (age):

import itertools

def group_by_kth_column(records, k):
    # Sort records by Kth column first
    sorted_records = sorted(records, key=lambda x: x[k-1])
    groups = []
    
    # Group consecutive records with same key
    for key, group in itertools.groupby(sorted_records, key=lambda x: x[k-1]):
        groups.append(list(group))
    return groups

# Sample data
records = [
    ['Alice', 25, 'Engineer'],
    ['Bob', 30, 'Manager'],
    ['Charlie', 25, 'Designer'],
    ['David', 30, 'Engineer'],
    ['Eve', 25, 'Manager']
]

grouped_records = group_by_kth_column(records, 2)

for group in grouped_records:
    print(group)

The output of the above code is:

[['Alice', 25, 'Engineer'], ['Charlie', 25, 'Designer'], ['Eve', 25, 'Manager']]
[['Bob', 30, 'Manager'], ['David', 30, 'Engineer']]
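The sort step matters because groupby() only merges adjacent records. The following sketch, using a shortened version of the sample data, compares the number of groups produced with and without sorting. The variable names are illustrative only.

```python
import itertools

records = [
    ['Alice', 25, 'Engineer'],
    ['Bob', 30, 'Manager'],
    ['Charlie', 25, 'Designer'],
]

# No sorting: groupby() starts a new group every time the key changes,
# so the two records with age 25 end up in separate groups
unsorted_groups = [
    list(group) for _, group in itertools.groupby(records, key=lambda r: r[1])
]
print(len(unsorted_groups))

# After sorting by the same key, equal ages are adjacent and merge into one group
sorted_records = sorted(records, key=lambda r: r[1])
sorted_groups = [
    list(group) for _, group in itertools.groupby(sorted_records, key=lambda r: r[1])
]
print(len(sorted_groups))
```

Without sorting, the data yields three groups; with sorting, it yields two.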

Using Dictionary Approach

This approach uses a dictionary where keys are the Kth column values and values are lists of records. It's simpler and doesn't require pre-sorting.

Example

Group records using the dictionary-based approach:

def group_by_kth_column_dict(records, k):
    groups = {}
    
    for record in records:
        key = record[k-1]  # Get Kth column value
        
        if key in groups:
            groups[key].append(record)
        else:
            groups[key] = [record]
    
    return list(groups.values())

# Sample data
records = [
    ['Alice', 25, 'Engineer'],
    ['Bob', 30, 'Manager'],
    ['Charlie', 25, 'Designer'],
    ['David', 30, 'Engineer']
]

grouped_records = group_by_kth_column_dict(records, 2)

for group in grouped_records:
    print(group)

The output of the above code is:

[['Alice', 25, 'Engineer'], ['Charlie', 25, 'Designer']]
[['Bob', 30, 'Manager'], ['David', 30, 'Engineer']]
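The if/else check in the dictionary approach can be avoided with collections.defaultdict. The sketch below is one possible variant, not part of the original example; the function name group_by_kth_column_defaultdict is hypothetical, and it returns the dictionary itself so that the group keys stay available.

```python
from collections import defaultdict

def group_by_kth_column_defaultdict(records, k):
    # Missing keys start out as an empty list automatically
    groups = defaultdict(list)
    for record in records:
        groups[record[k - 1]].append(record)
    # Convert back to a plain dict so later lookups do not create new keys
    return dict(groups)

records = [
    ['Alice', 25, 'Engineer'],
    ['Bob', 30, 'Manager'],
    ['Charlie', 25, 'Designer'],
    ['David', 30, 'Engineer']
]

groups_by_age = group_by_kth_column_defaultdict(records, 2)

for age, group in groups_by_age.items():
    print(age, group)
```

Because dictionaries preserve insertion order in Python 3.7 and later, the groups appear in the order in which each key is first seen.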

Using Pandas Library

Pandas provides powerful data manipulation tools. Convert the list to a DataFrame and use groupby() for grouping.

Example

Group records using a pandas DataFrame:

import pandas as pd

def group_by_kth_column_pandas(records, k):
    # Convert to DataFrame
    df = pd.DataFrame(records, columns=['Name', 'Age', 'Job'])
    
    # Group by Kth column (k-1 for 0-based indexing)
    grouped = df.groupby(df.columns[k-1])
    
    # Convert groups to list format
    result = []
    for name, group in grouped:
        result.append(group.values.tolist())
    
    return result

# Sample data
records = [
    ['Alice', 25, 'Engineer'],
    ['Bob', 30, 'Manager'],
    ['Charlie', 25, 'Designer'],
    ['David', 30, 'Engineer']
]

grouped_records = group_by_kth_column_pandas(records, 2)

for group in grouped_records:
    print(group)

The output of the above code is:

[['Alice', 25, 'Engineer'], ['Charlie', 25, 'Designer']]
[['Bob', 30, 'Manager'], ['David', 30, 'Engineer']]
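The example above hard-codes three column names, so it only works for records with exactly three fields. The sketch below is one way to relax that, assuming the default integer column labels are acceptable; the function name group_by_kth_column_any_width and the two-field sample data are hypothetical. It also passes sort=False, since groupby() in pandas sorts the group keys by default.

```python
import pandas as pd

def group_by_kth_column_any_width(records, k):
    # Without explicit names, columns are labelled 0, 1, 2, ...
    df = pd.DataFrame(records)

    # sort=False keeps groups in the order their keys first appear
    grouped = df.groupby(k - 1, sort=False)

    return [group.values.tolist() for _, group in grouped]

records = [
    ['Bob', 30],
    ['Alice', 25],
    ['David', 30]
]

grouped_records = group_by_kth_column_any_width(records, 2)

for group in grouped_records:
    print(group)
```

With sort=False, the group for age 30 comes first because Bob is the first record; with the default sort=True, the group for age 25 would come first.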

Comparison

Method               Requires Sorting   Memory Usage   Best For
itertools.groupby()  Yes                Low            Simple lists
Dictionary           No                 Medium         Most use cases
Pandas               No                 Higher         Complex data analysis

Conclusion

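Use the dictionary approach for most cases, as it is simple, efficient, and needs no pre-sorting. Choose itertools.groupby() when the records are already sorted by the Kth column or memory is tight, since it yields groups lazily; keep in mind that sorting first with sorted() builds a full copy of the list. Use pandas for complex data analysis tasks.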

Updated on: 2026-03-27T08:14:05+05:30
