Python - Removing Duplicate Dicts in List


Python is widely used for web development, data science, machine learning, and automation. It offers several built-in data types for storing data, such as lists, dictionaries, and sets. A dictionary is mutable, so its contents can be edited at any time.

This article describes several ways to remove duplicate dictionaries from a list. Because dictionaries are mutable, they are not hashable, so a list of dictionaries cannot simply be passed to set() to drop the repeats. Instead, each dictionary has to be converted into a hashable form first, or handed to a library that can do the comparison.
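
The reason a plain set() call fails can be checked in a few lines. The sketch below uses a made-up two-item list named places (not part of the examples that follow) to show that equal dictionaries compare equal with ==, yet cannot be stored in a set.

```python
# Hypothetical sample data, used only for this illustration
places = [
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Kochi", "State": "Kerala"},
]

# Dictionaries with the same key-value pairs compare equal
print(places[0] == places[1])

# A set needs hashable elements, so building one from dictionaries raises TypeError
try:
    unique = set(places)
except TypeError as error:
    unique = None
    print("TypeError:", error)
```

Every method in this article works around that TypeError by building a hashable stand-in for each dictionary.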

Various Methods to Remove Duplicate Dictionaries

List Comprehension

Dictionaries cannot be added to a set because they are unhashable, so each dictionary is first converted into a sorted tuple of its key-value pairs. A tuple is hashable as long as its values are, which lets a set keep track of the dictionaries already seen. The following example shows the idea:

Example

def all_duplicate(whole_dict):
    same = set()   # Holds the tuple form of every dictionary seen so far
    # Each dictionary is converted into a sorted tuple of its items so that it can be stored in the set.
    # A dictionary is kept only if its tuple is not in the set yet; set.add() returns None,
    # so "not same.add(...)" is always True and simply records the tuple.
    return [dict(tuple(sorted(dupl.items()))) for dupl in whole_dict if tuple(sorted(dupl.items())) not in same and not same.add(tuple(sorted(dupl.items())))]

# Example 
Whole_Dictionary = [
    {"Place": "Haldwani", "State": 'Uttrakhand'},
    {"Place": "Hisar", "State": 'Haryana'},
    {"Place": "Shillong", "State": 'Meghalaya'},
    {"Place": "Kochi", "State": 'Kerala'},
    {"Place": "Bhopal", "State": 'Madhya Pradesh'},
    {"Place": "Kochi", "State": 'Kerala'},   #This Dictionary is repeating which is to be removed
    {"Place": "Haridwar", "State": 'Uttarakhand'}
]

Final_Dict = all_duplicate(Whole_Dictionary)
print(Final_Dict)   #The output after removing the duplicate dictionary will be shown

Output

The output of the above example will be as follows:

[{'Place': 'Haldwani', 'State': 'Uttrakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}] 
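
The one-line comprehension is compact but hard to read. As a rough sketch of the same technique, the loop below computes the sorted tuple once per dictionary and keeps the original dictionary objects. The function name remove_duplicates_loop and the sample list are assumptions made for this illustration. Like the comprehension, it assumes the keys can be sorted and the values are hashable.

```python
def remove_duplicates_loop(dicts):
    """Return the dictionaries in their original order, skipping repeats."""
    seen = set()
    unique = []
    for d in dicts:
        key = tuple(sorted(d.items()))  # order-independent, hashable form
        if key not in seen:
            seen.add(key)
            unique.append(d)
    return unique

# Hypothetical sample data
sample = [
    {"Place": "Kochi", "State": "Kerala"},
    {"State": "Kerala", "Place": "Kochi"},  # same pairs, different key order
    {"Place": "Hisar", "State": "Haryana"},
]

print(remove_duplicates_loop(sample))
# [{'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Hisar', 'State': 'Haryana'}]
```

Sorting the items means two dictionaries with the same pairs in a different insertion order are still treated as duplicates.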

Pandas Library

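The pandas library is mainly worth using when the data set is large or is already being processed as a DataFrame; for a short list, the pure-Python methods are simpler. The list of dictionaries is loaded into a DataFrame, duplicate rows are dropped, and the result is converted back into a list of dictionaries. This works best when the dictionaries share the same keys, because a missing key becomes a NaN value in the DataFrame. The following example shows the use of the pandas library: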

Example

import pandas as pd   # pandas must be installed, otherwise this import raises an error

def all_duplicate(data):
    df = pd.DataFrame(data)   # Convert the list of dictionaries into a DataFrame
    df = df.drop_duplicates()   # drop_duplicates() removes every row that repeats an earlier row
    return df.to_dict(orient='records')  # Convert the DataFrame back into a list of dictionaries

# Example 
Whole_Dictionary = [
    {"Place": "Haldwani", "State": 'Uttrakhand'},
    {"Place": "Hisar", "State": 'Haryana'},
    {"Place": "Shillong", "State": 'Meghalaya'},
    {"Place": "Kochi", "State": 'Kerala'},
    {"Place": "Bhopal", "State": 'Madhya Pradesh'},
    {"Place": "Kochi", "State": 'Kerala'},   #This Dictionary is repeating which is to be removed
    {"Place": "Haridwar", "State": 'Uttarakhand'}
]

Final_Dict = all_duplicate(Whole_Dictionary)
print(Final_Dict)   #The output after removing the duplicate dictionary will be shown

Output

[{'Place': 'Haldwani', 'State': 'Uttrakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}] 
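
drop_duplicates() also accepts a subset argument, which limits the comparison to chosen columns. The sketch below is an illustration only: the function name drop_duplicates_by and the sample list are assumptions, and it requires pandas to be installed. It keeps the first dictionary seen for each value of "State".

```python
import pandas as pd

def drop_duplicates_by(records, columns):
    """Keep the first record for each combination of values in `columns`."""
    df = pd.DataFrame(records)
    return df.drop_duplicates(subset=columns, keep="first").to_dict(orient="records")

# Hypothetical sample data
places = [
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "Munnar", "State": "Kerala"},
    {"Place": "Hisar", "State": "Haryana"},
]

unique_states = drop_duplicates_by(places, ["State"])
print(unique_states)
# [{'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Hisar', 'State': 'Haryana'}]
```

Passing keep="last" instead would retain the last record for each state.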

Frozen Dictionary

Another way around the unhashability of dictionaries is a frozen dictionary, an immutable form of a dictionary that can be used as a key in another dictionary or as an element in a set. The third-party frozendict library offers a convenient implementation. The example below gets a similar effect with the built-in frozenset, which turns a dictionary's items into an immutable, hashable collection of key-value pairs:

Example

def make_hashable(d):
    return frozenset(d.items())  # Convert the key-value pairs into a frozenset; returning it instead of hash(...) avoids hash collisions

def all_duplicate(dicts):
    seen = set()  # Holds the frozenset form of every dictionary seen so far
    # Keep a dictionary only if its frozenset is not in seen yet; seen.add() returns None, so it only records the key
    return [d for d in dicts if not (make_hashable(d) in seen or seen.add(make_hashable(d)))]

# Example 
Whole_Dictionary = [
    {"Place": "Haldwani", "State": 'Uttrakhand'},
    {"Place": "Hisar", "State": 'Haryana'},
    {"Place": "Shillong", "State": 'Meghalaya'},
    {"Place": "Kochi", "State": 'Kerala'},
    {"Place": "Bhopal", "State": 'Madhya Pradesh'},
    {"Place": "Kochi", "State": 'Kerala'},   #This Dictionary is repeating which is to be removed
    {"Place": "Haridwar", "State": 'Uttarakhand'}
]

Final_Dict = all_duplicate(Whole_Dictionary)
print(Final_Dict)   #The output after removing the duplicate dictionary will be shown

Output

[{'Place': 'Haldwani', 'State': 'Uttrakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}]
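
A frozenset of items only works when every value is itself hashable; a dictionary holding a list or a nested dictionary raises TypeError. One possible workaround, shown here as a sketch of a different technique and not as part of the frozen-dictionary method, is to use a JSON string with sorted keys as the lookup key. The function name remove_duplicates_nested and the sample data are assumptions, and the approach only applies to JSON-serializable values.

```python
import json

def remove_duplicates_nested(dicts):
    """Deduplicate dictionaries whose values may be lists or nested dicts."""
    seen = set()
    unique = []
    for d in dicts:
        # sort_keys=True makes the string independent of key order
        key = json.dumps(d, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(d)
    return unique

# Hypothetical sample data with list values
records = [
    {"Place": "Kochi", "Tags": ["port", "coast"]},
    {"Tags": ["port", "coast"], "Place": "Kochi"},  # same content, different key order
    {"Place": "Bhopal", "Tags": ["lakes"]},
]

print(remove_duplicates_nested(records))
# [{'Place': 'Kochi', 'Tags': ['port', 'coast']}, {'Place': 'Bhopal', 'Tags': ['lakes']}]
```

Note that the JSON text distinguishes values such as 1 and 1.0, which Python's == treats as equal, so this key is slightly stricter than a direct dictionary comparison.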

Helper Function

This method moves the conversion step into a small helper function. The helper turns each dictionary into a sorted tuple of its items, and the main function uses those tuples to find and skip duplicates in the list of dictionaries. The following example shows the idea:

Example

def sorted_dict_to_tuple(d):  # Takes a dictionary and returns a sorted tuple of its key-value pairs
    return tuple(sorted(d.items()))

def all_duplicates(dicts):  # Keeps the first occurrence of each dictionary and skips later repeats
    seen = set()  # Holds the tuple form of every dictionary seen so far
    return [d for d in dicts if not (sorted_dict_to_tuple(d) in seen or seen.add(sorted_dict_to_tuple(d)))]

# Example 
Whole_Dictionary = [
    {"Place": "Haldwani", "State": 'Uttrakhand'},
    {"Place": "Hisar", "State": 'Haryana'},
    {"Place": "Shillong", "State": 'Meghalaya'},
    {"Place": "Kochi", "State": 'Kerala'},
    {"Place": "Bhopal", "State": 'Madhya Pradesh'},
    {"Place": "Kochi", "State": 'Kerala'},   #This Dictionary is repeating which is to be removed
    {"Place": "Haridwar", "State": 'Uttarakhand'}
]

Final_Dict = all_duplicates(Whole_Dictionary)
print(Final_Dict)   #The output after removing the duplicate dictionary will be shown

Output

[{'Place': 'Haldwani', 'State': 'Uttrakhand'}, {'Place': 'Hisar', 'State': 'Haryana'}, {'Place': 'Shillong', 'State': 'Meghalaya'}, {'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Bhopal', 'State': 'Madhya Pradesh'}, {'Place': 'Haridwar', 'State': 'Uttarakhand'}]  
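
Separating the helper from the main function also makes it easy to change what counts as a duplicate. The sketch below is a hypothetical extension, not part of the method above: unique_by accepts any function that returns a hashable key, and place_key is an assumed helper that compares only the "Place" value, ignoring case and surrounding spaces.

```python
def unique_by(dicts, key_func):
    """Keep the first dictionary for each distinct value of key_func(d)."""
    seen = set()
    unique = []
    for d in dicts:
        key = key_func(d)
        if key not in seen:
            seen.add(key)
            unique.append(d)
    return unique

def place_key(d):
    # Hypothetical rule: two entries match if their place names match
    # after trimming spaces and lowercasing
    return d["Place"].strip().lower()

# Hypothetical sample data
sample = [
    {"Place": "Kochi", "State": "Kerala"},
    {"Place": "kochi ", "State": "Kerala"},  # same place, different spelling
    {"Place": "Hisar", "State": "Haryana"},
]

print(unique_by(sample, place_key))
# [{'Place': 'Kochi', 'State': 'Kerala'}, {'Place': 'Hisar', 'State': 'Haryana'}]
```

Passing a function that returns tuple(sorted(d.items())) as key_func reproduces the exact-match behaviour of the example above.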

Conclusion

Removing duplicate dictionaries from a list takes an extra step because dictionaries cannot be placed in a set directly. This article covered several approaches: a list comprehension with sorted tuples, the pandas library, frozen sets, and a helper function. Choose whichever method suits your data and field of application.

Updated on: 01-Aug-2023
