Python - Multiple Keys Grouped Summation
Multiple keys grouped summation involves grouping data by multiple keys and calculating the sum of values for each group. This is commonly used in data analysis when you need to aggregate values based on multiple criteria.
Understanding the Problem
In multiple keys grouped summation, we have tuples where the first element is a value and the remaining elements form a composite key. Our task is to group tuples with the same composite key and sum their values.
For example, given data like (1000, 2022, 1), we treat (2022, 1) as the key and 1000 as the value to sum.
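As a minimal sketch of that split, the snippet below takes one record apart with indexing and slicing. The variable names (record, value, key, totals) are illustrative only and not part of any library.

```python
# One record in the article's layout: (value, key1, key2)
record = (1000, 2022, 1)

value = record[0]   # the amount to be summed
key = record[1:]    # slicing a tuple yields another tuple: (2022, 1)

# Tuples are hashable, so the composite key can index a dictionary directly.
totals = {key: value}

print(value)   # 1000
print(key)     # (2022, 1)
print(totals)  # {(2022, 1): 1000}
```

Because the key is a tuple rather than a list, it can be used as a dictionary key without any conversion.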
Using defaultdict for Grouped Summation
A concise approach uses defaultdict(int), which starts each new key at 0 so sums can be accumulated without first checking whether the key exists:
from collections import defaultdict

# Sample data: (value, key1, key2)
data = [
    (1000, 2022, 1),
    (1500, 2022, 2),
    (2000, 2022, 1),
    (500, 2023, 3),
    (800, 2023, 1),
    (1200, 2023, 1),
    (1500, 2023, 3)
]
print("Input data:", data)

# Group and sum by multiple keys
grouped_sum = defaultdict(int)
for item in data:
    value = item[0]
    key = item[1:3]  # Multiple keys as tuple
    grouped_sum[key] += value

# Convert back to list of tuples
result = [(key[0], key[1], total) for key, total in grouped_sum.items()]
print("Grouped summation:", result)
Input data: [(1000, 2022, 1), (1500, 2022, 2), (2000, 2022, 1), (500, 2023, 3), (800, 2023, 1), (1200, 2023, 1), (1500, 2023, 3)]
Grouped summation: [(2022, 1, 3000), (2022, 2, 1500), (2023, 3, 2000), (2023, 1, 2000)]
Alternative Approach Using Regular Dictionary
You can also implement this using a regular dictionary with explicit key checking:
data = [
    (1000, 2022, 1),
    (2000, 2022, 1),
    (1500, 2023, 3),
    (500, 2023, 3)
]

grouped_sum = {}
for item in data:
    value = item[0]
    key = item[1:3]
    if key in grouped_sum:
        grouped_sum[key] += value
    else:
        grouped_sum[key] = value

result = [(key[0], key[1], total) for key, total in grouped_sum.items()]
print("Result:", result)
Result: [(2022, 1, 3000), (2023, 3, 2000)]
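The explicit membership check can also be folded into a single line with dict.get, which returns a default when the key is missing. The sketch below is one possible variant, not a replacement for the version above. It also unpacks each tuple with a starred target, which assumes the value is always the first element.

```python
data = [
    (1000, 2022, 1),
    (2000, 2022, 1),
    (1500, 2023, 3),
    (500, 2023, 3),
]

grouped_sum = {}
for value, *key_parts in data:
    key = tuple(key_parts)  # starred unpacking gives a list; dict keys must be hashable
    # dict.get returns 0 the first time a key is seen
    grouped_sum[key] = grouped_sum.get(key, 0) + value

print(grouped_sum)  # {(2022, 1): 3000, (2023, 3): 2000}
```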
Working with Different Data Structures
The same technique works with tuples of other lengths. Here is an example with three keys:
from collections import defaultdict

# Data with three keys: (value, year, month, category)
sales_data = [
    (100, 2023, 1, 'A'),
    (200, 2023, 1, 'B'),
    (150, 2023, 1, 'A'),
    (300, 2023, 2, 'A'),
    (250, 2023, 2, 'B')
]

grouped = defaultdict(int)
for item in sales_data:
    value = item[0]
    key = item[1:]  # (year, month, category)
    grouped[key] += value

for key, total in grouped.items():
    year, month, category = key
    print(f"Year {year}, Month {month}, Category {category}: {total}")
Year 2023, Month 1, Category A: 250
Year 2023, Month 1, Category B: 200
Year 2023, Month 2, Category A: 300
Year 2023, Month 2, Category B: 250
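Since the loop is the same whatever the tuple length, it can be wrapped in a small helper. The function below is a hypothetical sketch: sum_by_keys and its value_index parameter are names invented here, and it assumes value_index is a non-negative position within every record.

```python
from collections import defaultdict

def sum_by_keys(records, value_index=0):
    """Sum the field at value_index, grouping by all remaining fields.

    Hypothetical helper; assumes value_index is non-negative.
    """
    totals = defaultdict(int)
    for record in records:
        value = record[value_index]
        # Every field except the value becomes part of the composite key
        key = record[:value_index] + record[value_index + 1:]
        totals[key] += value
    return dict(totals)

sales_data = [
    (100, 2023, 1, 'A'),
    (200, 2023, 1, 'B'),
    (150, 2023, 1, 'A'),
    (300, 2023, 2, 'A'),
    (250, 2023, 2, 'B'),
]

sales_totals = sum_by_keys(sales_data)
print(sales_totals[(2023, 1, 'A')])  # 250

# The value does not have to come first: here it is the last field
value_last = [(2022, 1, 1000), (2022, 1, 2000)]
print(sum_by_keys(value_last, value_index=2))  # {(2022, 1): 3000}
```

Returning a plain dict avoids the defaultdict side effect of silently creating a zero entry when a caller looks up a key that was never seen.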
Performance Analysis
| Approach | Time Complexity | Space Complexity | Best For |
|---|---|---|---|
| defaultdict | O(n) | O(k) | Clean, readable code |
| Regular dict | O(n) | O(k) | Explicit control |
Where n is the number of input items and k is the number of unique key combinations.
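To make n and k concrete, the sketch below reuses the seven-record sample from the defaultdict example and counts both quantities. It only illustrates what the symbols mean and is not a benchmark.

```python
from collections import defaultdict

data = [
    (1000, 2022, 1),
    (1500, 2022, 2),
    (2000, 2022, 1),
    (500, 2023, 3),
    (800, 2023, 1),
    (1200, 2023, 1),
    (1500, 2023, 3),
]

grouped_sum = defaultdict(int)
for value, key1, key2 in data:   # one pass over the input: O(n) time
    grouped_sum[(key1, key2)] += value

n = len(data)         # number of input items
k = len(grouped_sum)  # number of unique key combinations stored: O(k) space

print(n, k)  # 7 4
```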
Conclusion
Multiple keys grouped summation aggregates data using composite keys built from tuples. Both approaches run in O(n) time; defaultdict removes the explicit key check, which keeps the code short and readable for data processing tasks.
