Python - Multiple Keys Grouped Summation

Multiple keys grouped summation involves grouping data by multiple keys and calculating the sum of values for each group. This is commonly used in data analysis when you need to aggregate values based on multiple criteria.

Understanding the Problem

In multiple keys grouped summation, we have tuples where the first element is a value and the remaining elements form a composite key. Our task is to group tuples with the same composite key and sum their values.

For example, given data like (1000, 2022, 1), we treat (2022, 1) as the key and 1000 as the value to sum.
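
As a minimal sketch of that split, the snippet below separates one record into its value and its composite key. The names record, value, and key are illustrative only. The key is converted to a tuple because dictionary keys must be hashable, and a list is not.

```python
# One record in the (value, key1, key2) layout described above
record = (1000, 2022, 1)

# Extended unpacking: the first element is the value, the rest form the key
value, *key_parts = record

# key_parts is a list, which cannot be a dict key, so convert it to a tuple
key = tuple(key_parts)

print(value)  # 1000
print(key)    # (2022, 1)
```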

Using defaultdict for Grouped Summation

A concise approach uses defaultdict, which initializes each new group to 0 automatically so that sums can be accumulated without checking whether the key already exists:

from collections import defaultdict

# Sample data: (value, key1, key2)
data = [
    (1000, 2022, 1),
    (1500, 2022, 2),
    (2000, 2022, 1),
    (500, 2023, 3),
    (800, 2023, 1),
    (1200, 2023, 1),
    (1500, 2023, 3)
]

print("Input data:", data)

# Group and sum by multiple keys
grouped_sum = defaultdict(int)

for item in data:
    value = item[0]
    key = item[1:3]  # Multiple keys as tuple
    grouped_sum[key] += value

# Convert back to a list of (key1, key2, total) tuples
result = [(key[0], key[1], total) for key, total in grouped_sum.items()]
print("Grouped summation:", result)

Output:

Input data: [(1000, 2022, 1), (1500, 2022, 2), (2000, 2022, 1), (500, 2023, 3), (800, 2023, 1), (1200, 2023, 1), (1500, 2023, 3)]
Grouped summation: [(2022, 1, 3000), (2022, 2, 1500), (2023, 3, 2000), (2023, 1, 2000)]

Alternative Approach Using Regular Dictionary

You can also implement this using a regular dictionary with explicit key checking:

data = [
    (1000, 2022, 1),
    (2000, 2022, 1),
    (1500, 2023, 3),
    (500, 2023, 3)
]

grouped_sum = {}

for item in data:
    value = item[0]
    key = item[1:3]
    
    if key in grouped_sum:
        grouped_sum[key] += value
    else:
        grouped_sum[key] = value

result = [(key[0], key[1], total) for key, total in grouped_sum.items()]
print("Result:", result)

Output:

Result: [(2022, 1, 3000), (2023, 3, 2000)]
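
A common variant of the regular-dictionary approach, not part of the example above, replaces the if/else with dict.get and a default of 0. The sketch below assumes the same (value, key1, key2) layout and reuses the same sample records.

```python
# Same sample records as the regular-dictionary example: (value, key1, key2)
data = [
    (1000, 2022, 1),
    (2000, 2022, 1),
    (1500, 2023, 3),
    (500, 2023, 3),
]

grouped_sum = {}

for value, *key_parts in data:
    key = tuple(key_parts)
    # get() returns 0 for a key that has not been seen yet,
    # so no explicit membership check is needed
    grouped_sum[key] = grouped_sum.get(key, 0) + value

# Rebuild (key1, key2, total) tuples from the grouped totals
result = [(*key, total) for key, total in grouped_sum.items()]
print("Result:", result)
```

The behavior matches the if/else version. The choice between the two is mostly a matter of style.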

Working with Different Data Structures

The same technique works with different tuple structures. Here's an example with three keys:

from collections import defaultdict

# Data with three keys: (value, year, month, category)
sales_data = [
    (100, 2023, 1, 'A'),
    (200, 2023, 1, 'B'),
    (150, 2023, 1, 'A'),
    (300, 2023, 2, 'A'),
    (250, 2023, 2, 'B')
]

grouped = defaultdict(int)

for item in sales_data:
    value = item[0]
    key = item[1:]  # (year, month, category)
    grouped[key] += value

for key, total in grouped.items():
    year, month, category = key
    print(f"Year {year}, Month {month}, Category {category}: {total}")

Output:

Year 2023, Month 1, Category A: 250
Year 2023, Month 1, Category B: 200
Year 2023, Month 2, Category A: 300
Year 2023, Month 2, Category B: 250
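
Since the loop is the same for any number of keys, it could be wrapped in a small helper. The function below is a sketch under stated assumptions: sum_by_keys and its value_index parameter are hypothetical names, value_index is assumed to be a non-negative position, and every field other than the value is treated as part of the key.

```python
from collections import defaultdict

def sum_by_keys(records, value_index=0):
    """Sum the field at value_index, grouped by all remaining fields.

    value_index is assumed to be a non-negative position in each record.
    """
    totals = defaultdict(int)
    for record in records:
        record = tuple(record)  # ensures slices below are hashable tuples
        value = record[value_index]
        # Everything except the value forms the composite key
        key = record[:value_index] + record[value_index + 1:]
        totals[key] += value
    return dict(totals)

# Usage with two keys and the value in the first position
two_key_totals = sum_by_keys([(1000, 2022, 1), (2000, 2022, 1), (500, 2023, 3)])
print(two_key_totals)  # {(2022, 1): 3000, (2023, 3): 500}
```

Returning a plain dict avoids a side effect of defaultdict, where merely looking up a missing key on the result would create a new entry.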

Performance Analysis

Approach     | Time Complexity | Space Complexity | Best For
------------ | --------------- | ---------------- | --------------------
defaultdict  | O(n)            | O(k)             | Clean, readable code
Regular dict | O(n)            | O(k)             | Explicit control

Where n is the number of input items and k is the number of unique key combinations.
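
To make n and k concrete, the following sketch counts both for the seven sample records from the defaultdict example. The loop visits each record once, which is where the O(n) time comes from, and the dictionary holds one entry per unique key combination, which is the O(k) space.

```python
from collections import defaultdict

# The seven (value, key1, key2) records from the defaultdict example
data = [
    (1000, 2022, 1),
    (1500, 2022, 2),
    (2000, 2022, 1),
    (500, 2023, 3),
    (800, 2023, 1),
    (1200, 2023, 1),
    (1500, 2023, 3),
]

grouped_sum = defaultdict(int)
for value, key1, key2 in data:  # one pass over all n records
    grouped_sum[(key1, key2)] += value

n = len(data)         # number of input items
k = len(grouped_sum)  # number of unique key combinations
print(f"n = {n}, k = {k}")  # n = 7, k = 4
```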

Conclusion

Multiple keys grouped summation aggregates values under composite keys in a single pass over the data. Both approaches run in O(n) time; defaultdict removes the explicit key check, which makes it the cleaner choice for most data processing tasks.

---
Updated on: 2026-03-27T15:32:11+05:30
