Duplicate substring removal from list in Python

Sometimes we need to refine a list of delimited strings by removing the duplicate substrings they contain. This can be done with a combination of Python built-ins such as set(), split(), and comprehensions.

Using set() and split()

The split() method breaks each string into its substrings at the delimiter, and set() keeps only the unique substrings from each string. This approach keeps the results grouped by original string:

Example

# initializing list
strings = ['xy-xy', 'pq-qr', 'xp-xp-xp', 'dd-ee']

print("Given list:", strings)

# using set() and split()
result = [set(sub.split('-')) for sub in strings]

print("List after duplicate removal:", result)

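The output of the above code is shown below. Because sets are unordered, the order of elements within each set may vary between runs: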

Given list: ['xy-xy', 'pq-qr', 'xp-xp-xp', 'dd-ee']
List after duplicate removal: [{'xy'}, {'pq', 'qr'}, {'xp'}, {'ee', 'dd'}]
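The set-based result above loses the original order of the substrings and returns sets rather than delimited strings. If both matter, one possible variation (not part of the approach above, just a minimal sketch) is to use dict.fromkeys(), which keeps the first occurrence of each key in insertion order, and then rejoin the pieces with the same delimiter. The name deduped is chosen here only for illustration.

```python
# initializing list
strings = ['xy-xy', 'pq-qr', 'xp-xp-xp', 'dd-ee']

# dict.fromkeys() keeps the first occurrence of each substring, in order;
# '-'.join() turns the remaining substrings back into a delimited string
deduped = ['-'.join(dict.fromkeys(sub.split('-'))) for sub in strings]

print("Order-preserving result:", deduped)
# Order-preserving result: ['xy', 'pq-qr', 'xp', 'dd-ee']
```

This keeps one string per input string, so the grouping is preserved in the same way as the set() version.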

Using a Set Comprehension

We can also flatten all substrings into a single collection and remove duplicates globally using a nested set comprehension, then convert the result to a list:

Example

# initializing list
strings = ['xy-xy', 'pq-qr', 'xp-xp-xp', 'dd-ee']

print("Given list:", strings)

# using a nested set comprehension, converted to a list
result = list({substring for string in strings for substring in string.split('-')})

print("List after duplicate removal:", result)

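The output of the above code is shown below. Because the list is built from a set, the order of its elements is arbitrary and may vary between runs: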

Given list: ['xy-xy', 'pq-qr', 'xp-xp-xp', 'dd-ee']
List after duplicate removal: ['dd', 'pq', 'ee', 'xp', 'xy', 'qr']
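If the flattened result needs a predictable order, a possible alternative (a sketch under the assumption that first-seen order is what is wanted) is to pass the flattened substrings through dict.fromkeys() instead of a set. The name unique_in_order is illustrative only.

```python
# initializing list
strings = ['xy-xy', 'pq-qr', 'xp-xp-xp', 'dd-ee']

# flatten all substrings with a generator expression, then let
# dict.fromkeys() drop repeats while keeping first-seen order
unique_in_order = list(
    dict.fromkeys(part for string in strings for part in string.split('-'))
)

print("Unique substrings in order:", unique_in_order)
# Unique substrings in order: ['xy', 'pq', 'qr', 'xp', 'dd', 'ee']
```

When alphabetical order is preferred instead, wrapping the set in sorted() is another option.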

Comparison

Method           | Output Type  | Preserves Groups | Best For
set() per string | List of sets | Yes              | Maintaining original string grouping
Flattened set    | Single list  | No               | Getting all unique substrings

Conclusion

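Use set() inside a list comprehension to remove duplicates within each string while keeping the per-string groups. Use a flattened set comprehension to collect all unique substrings into a single list. In both cases the order of the elements is not guaranteed, because sets are unordered.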

Updated on: 2026-03-15T18:23:04+05:30
