Python - Group Similar Keys in Dictionary
In Python, similar keys can be grouped in several ways: with defaultdict, with a plain dictionary of lists, or with the groupby() function from the itertools module. Grouping related keys by some criterion is a common step in data analysis.
Method 1: Using defaultdict
Python's defaultdict class from the collections module provides a convenient way to group similar keys. It automatically initializes a default value when a new key is accessed.
Syntax
from collections import defaultdict
groups = defaultdict(list)
groups[key].append(item)
Here, defaultdict(list) creates a dictionary that automatically creates an empty list the first time a missing key is accessed. The statement groups[key].append(item) then appends the item to the list associated with that key.
Example
In this example, we group tuples based on their first element using defaultdict:
from collections import defaultdict

def group_keys_defaultdict(keys):
    grouped_dict = defaultdict(list)
    for key in keys:
        # The first element of each tuple is the grouping key
        grouped_dict[key[0]].append(key)
    return dict(grouped_dict)

keys = [('A', 1), ('B', 2), ('A', 3), ('C', 4), ('B', 5)]
grouped_dict = group_keys_defaultdict(keys)
print(grouped_dict)
{'A': [('A', 1), ('A', 3)], 'B': [('B', 2), ('B', 5)], 'C': [('C', 4)]}
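The same pattern can be applied to the keys of an actual dictionary. The following is a minimal sketch, assuming that "similar" means keys sharing the text before an underscore; the inventory data and the group_by_prefix helper are hypothetical names introduced only for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: keys share a prefix before the underscore
inventory = {
    "apple_red": 3,
    "apple_green": 5,
    "banana_ripe": 2,
    "cherry_dark": 7,
    "banana_unripe": 4,
}

def group_by_prefix(data, sep="_"):
    """Group the keys of `data` by the text before the first `sep`."""
    groups = defaultdict(list)
    for key in data:
        prefix = key.split(sep, 1)[0]
        groups[prefix].append(key)
    return dict(groups)

grouped_keys = group_by_prefix(inventory)
print(grouped_keys)
# {'apple': ['apple_red', 'apple_green'], 'banana': ['banana_ripe', 'banana_unripe'], 'cherry': ['cherry_dark']}
```

Any other similarity rule (first letter, key length, a lowercase form) could be swapped in by changing how prefix is computed.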
Method 2: Using Dictionary of Lists
We can manually create an empty dictionary and iterate over the keys to group them by creating lists for each unique key.
Example
In this approach, we check if a key exists in the dictionary before appending items:
def group_keys_dict_of_lists(keys):
    grouped_dict = {}
    for key in keys:
        # Create the list the first time a group is seen
        if key[0] not in grouped_dict:
            grouped_dict[key[0]] = []
        grouped_dict[key[0]].append(key)
    return grouped_dict

keys = [('A', 1), ('B', 2), ('A', 3), ('C', 4), ('B', 5)]
grouped_dict = group_keys_dict_of_lists(keys)
print(grouped_dict)
{'A': [('A', 1), ('A', 3)], 'B': [('B', 2), ('B', 5)], 'C': [('C', 4)]}
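The explicit membership check can also be folded into a single call with the built-in dict.setdefault() method. This is an alternative sketch, not part of the method above; the function name group_keys_setdefault is chosen here for illustration.

```python
def group_keys_setdefault(keys):
    grouped = {}
    for key in keys:
        # setdefault returns the existing list for key[0],
        # or stores a new empty list and returns that
        grouped.setdefault(key[0], []).append(key)
    return grouped

keys = [('A', 1), ('B', 2), ('A', 3), ('C', 4), ('B', 5)]
grouped_setdefault = group_keys_setdefault(keys)
print(grouped_setdefault)
# {'A': [('A', 1), ('A', 3)], 'B': [('B', 2), ('B', 5)], 'C': [('C', 4)]}
```

Note that setdefault() builds the empty list argument on every iteration even when the key already exists, which is a small cost the explicit check avoids.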
Method 3: Using itertools.groupby()
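The groupby() function from the itertools module groups consecutive elements that share the same key. Important: because only adjacent elements are grouped, the input must first be sorted by the same key function; otherwise elements with the same key that are not next to each other end up in separate groups.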
Syntax
itertools.groupby(iterable, key=None)
Here, iterable is any collection of elements and key is an optional function that determines the grouping criteria.
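To see why sorting matters, the sketch below runs groupby() on the same data twice, once unsorted and once sorted. The names pairs, first, unsorted_runs, and sorted_runs are illustrative only.

```python
from itertools import groupby

pairs = [('A', 1), ('B', 2), ('A', 3), ('C', 4), ('B', 5)]

def first(item):
    """Key function: group by the first element of each tuple."""
    return item[0]

# Unsorted input: every change of key starts a new group
unsorted_runs = [(k, list(g)) for k, g in groupby(pairs, key=first)]
print([k for k, _ in unsorted_runs])
# ['A', 'B', 'A', 'C', 'B']

# Sorted input: each key appears exactly once
sorted_runs = [(k, list(g)) for k, g in groupby(sorted(pairs, key=first), key=first)]
print([k for k, _ in sorted_runs])
# ['A', 'B', 'C']
```

If the unsorted groups were stored in a dictionary with plain assignment, the later 'A' and 'B' runs would overwrite the earlier ones and items would be silently lost.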
Example
First, we sort the keys, then use groupby() to group them:
from itertools import groupby

def group_keys_itertools(keys):
    grouped_dict = {}
    # sorted() returns a new list, so the caller's list is not modified
    sorted_keys = sorted(keys, key=lambda x: x[0])
    for key, group in groupby(sorted_keys, key=lambda x: x[0]):
        grouped_dict[key] = list(group)
    return grouped_dict

keys = [('A', 1), ('B', 2), ('A', 3), ('C', 4), ('B', 5)]
grouped_dict = group_keys_itertools(keys)
print(grouped_dict)
{'A': [('A', 1), ('A', 3)], 'B': [('B', 2), ('B', 5)], 'C': [('C', 4)]}
Comparison
| Method | Pros | Cons |
|---|---|---|
| defaultdict | Clean, automatic initialization | Requires import |
| Dictionary of Lists | No imports needed, explicit | More verbose code |
| groupby() | Lazy; handles one group at a time on sorted input | Requires sorting first |
Conclusion
Use defaultdict for clean, readable code when grouping similar keys. Use a dictionary of lists when you prefer explicit control without imports. Use itertools.groupby() when the data is already sorted (or cheap to sort) and you want to process one group at a time instead of building every group up front.
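As a rough sketch of how groupby() can work through sorted data one group at a time, the example below sums the values of each group without ever storing the groups themselves. The sorted_records generator is a hypothetical stand-in for a large source that is assumed to be sorted by key already.

```python
from itertools import groupby
from operator import itemgetter

def sorted_records():
    """Hypothetical stand-in for a large source already sorted by key."""
    yield from [('A', 1), ('A', 3), ('B', 2), ('B', 5), ('C', 4)]

totals = {}
for label, group in groupby(sorted_records(), key=itemgetter(0)):
    # Each group is consumed lazily; only the running sum is kept
    totals[label] = sum(value for _, value in group)

print(totals)
# {'A': 4, 'B': 7, 'C': 4}
```

The saving applies only when the input arrives sorted; calling sorted() on an unsorted source loads everything into memory first.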
