Remove all duplicates from a given string in Python
To remove duplicate words from a string in Python, first split the string on whitespace so that each word becomes an item in a list. There are then several ways to remove duplicates from that list.
One option is to convert all words to lowercase, sort them, and keep only the unique ones. Because of the lowercase step, "John" and "john" count as the same word. Let's explore the different approaches.
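One detail is worth checking before the examples: `split(" ")` and `split()` behave differently when the input contains repeated spaces. The short sketch below illustrates the difference; the sample string `text` is made up for illustration and is not part of the examples that follow.

```python
# Hypothetical sample with a double space between the first two words
text = "Hi  my name"

# Splitting on a single space keeps an empty string for the extra space
by_space = text.split(" ")

# Splitting with no argument collapses runs of whitespace
by_whitespace = text.split()

print(by_space)       # ['Hi', '', 'my', 'name']
print(by_whitespace)  # ['Hi', 'my', 'name']
```

For input that may contain irregular spacing, `split()` with no argument is usually the safer choice, since it avoids empty strings showing up as "words".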
Using Manual Loop with Sorting
This approach converts the words to lowercase, sorts them, and uses a loop to extract the unique words:
sent = "Hi my name is John Doe John Doe is my name"
# Separate out each word
words = sent.split(" ")
# Convert all words to lowercase
words = [word.lower() for word in words]
# Sort the words in order
words.sort()
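unique = []
total_words = len(words)
i = 0
# After sorting, duplicates are adjacent, so keep a word only
# when it is the last word or differs from the one after it
while i < total_words:
    if i == total_words - 1 or words[i] != words[i + 1]:
        unique.append(words[i])
    i += 1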
print(unique)
['doe', 'hi', 'is', 'john', 'my', 'name']
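The same sort-then-compare-neighbours idea can be written more compactly with `itertools.groupby` from the standard library. This is a sketch of an alternative, not a replacement for the loop above; it relies on the fact that `groupby` groups only adjacent equal items, which is why the words are sorted first.

```python
from itertools import groupby

sent = "Hi my name is John Doe John Doe is my name"

# Lowercase and sort so that equal words sit next to each other
words = sorted(word.lower() for word in sent.split())

# groupby yields one (key, group) pair per run of equal adjacent items
unique = [key for key, _group in groupby(words)]

print(unique)  # ['doe', 'hi', 'is', 'john', 'my', 'name']
```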
Using Set for Unique Words
A simpler approach uses Python's built-in set() to remove duplicates automatically:
sent = "Hi my name is John Doe John Doe is my name"
# Split and convert to lowercase
words = [word.lower() for word in sent.split()]
# Use set to remove duplicates and convert back to a sorted list
unique = sorted(set(words))
print(unique)
['doe', 'hi', 'is', 'john', 'my', 'name']
Preserving Original Order
To keep each word in the order of its first occurrence, track the words already seen in a set:
sent = "Hi my name is John Doe John Doe is my name"
words = sent.split()
seen = set()
unique = []
for word in words:
    word_lower = word.lower()
    if word_lower not in seen:
        seen.add(word_lower)
        unique.append(word_lower)
print(unique)
['hi', 'my', 'name', 'is', 'john', 'doe']
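Since the goal is to remove duplicates from a string, the resulting list can be joined back into one. The sketch below also shows a shorter way to get the same order-preserving result with `dict.fromkeys`, assuming Python 3.7 or later, where dictionaries keep insertion order. The name `result` is introduced here for illustration.

```python
sent = "Hi my name is John Doe John Doe is my name"

# dict keys are unique and keep insertion order (Python 3.7+),
# so the first occurrence of each lowercase word is retained
unique = list(dict.fromkeys(word.lower() for word in sent.split()))

# Join the unique words back into a single string
result = " ".join(unique)

print(unique)  # ['hi', 'my', 'name', 'is', 'john', 'doe']
print(result)  # hi my name is john doe
```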
Comparison
| Method | Preserves Order? | Time Complexity | Best For |
|---|---|---|---|
| Manual Loop | No (sorted) | O(n log n) | Learning algorithm logic |
| Set Method | No (sorted) | O(n log n) | Simple and readable |
| Order Preserving | Yes | O(n) | Maintaining word sequence |
Conclusion
Use set() for the simplest way to remove duplicates when the original word order does not matter. Use the order-preserving method when the sequence of words matters. The manual loop approach helps in understanding the underlying logic.
