Program to remove duplicate characters from a given string in Python

When working with strings in Python, we often need to remove duplicate characters while preserving the order of first occurrence. This is useful in data cleaning, text processing, and algorithm problems.

We can solve this using an ordered dictionary to maintain the insertion order of characters. The dictionary tracks which characters we've seen, and we can join the keys to get our result string.

For example, if the input is s = "bbabcaaccdbaabababc", the output is "bacd".

Algorithm

Create an ordered dictionary to store characters in insertion order
For each character c in the string:
- If c is not present in dictionary, add it with initial count 0
- Increment the count for character c
Join the dictionary keys in order to form the result string

Method 1: Using OrderedDict

The OrderedDict class from the collections module maintains insertion order:

from collections import OrderedDict

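def remove_duplicates(s):
    d = OrderedDict()  # keys keep the order in which characters first appear
    for c in s:
        if c not in d:
            d[c] = 0  # first occurrence: insert the key
        d[c] += 1  # count occurrences (the result only needs the keys)

    return ''.join(d.keys())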

s = "bbabcaaccdbaabababc"
result = remove_duplicates(s)
print(f"Original: {s}")
print(f"Result: {result}")

Original: bbabcaaccdbaabababc
Result: bacd
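
The counts stored by Method 1 are not needed to build the result; only the keys are. As a sketch, the variant below (the name remove_duplicates_with_counts is hypothetical, not part of the original program) returns the counts alongside the deduplicated string to show what the dictionary holds after the loop:

```python
from collections import OrderedDict

def remove_duplicates_with_counts(s):
    # Hypothetical helper: same idea as Method 1, but the counts are returned too.
    d = OrderedDict()
    for c in s:
        # The first sighting inserts the key; later sightings only update
        # the value, which does not change the key's position.
        d[c] = d.get(c, 0) + 1
    return ''.join(d), list(d.items())

deduped, counts = remove_duplicates_with_counts("bbabcaaccdbaabababc")
print(deduped)  # bacd
print(counts)   # [('b', 7), ('a', 7), ('c', 4), ('d', 1)]
```

If the counts are never used, the dictionary values can be any placeholder, which is what Method 2 does.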

Method 2: Using Regular Dictionary (Python 3.7+)

Since Python 3.7, regular dictionaries are guaranteed to maintain insertion order, so OrderedDict is no longer required for this task:

def remove_duplicates_dict(s):
    seen = {}
    for c in s:
        seen[c] = True  # re-assigning an existing key keeps its original position

    return ''.join(seen.keys())

s = "bbabcaaccdbaabababc"
result = remove_duplicates_dict(s)
print(f"Original: {s}")
print(f"Result: {result}")

Original: bbabcaaccdbaabababc
Result: bacd
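
The same idea can be written more compactly with the built-in dict.fromkeys, which creates a dictionary from an iterable of keys. This is a minimal sketch, assuming Python 3.7+ so that key order is guaranteed; the function name remove_duplicates_fromkeys is only illustrative:

```python
def remove_duplicates_fromkeys(s):
    # dict.fromkeys(s) creates one key per distinct character of s, in
    # first-occurrence order; every value is None and is never used.
    return ''.join(dict.fromkeys(s))

print(remove_duplicates_fromkeys("bbabcaaccdbaabababc"))  # bacd
```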

Method 3: Using Set for Tracking

This approach uses a set to track the characters already seen and a list to collect the result in order, so it does not rely on dictionary ordering:

def remove_duplicates_set(s):
    seen = set()  # characters encountered so far
    result = []   # first occurrences, in order

    for c in s:
        if c not in seen:
            seen.add(c)
            result.append(c)

    return ''.join(result)

s = "bbabcaaccdbaabababc"
result = remove_duplicates_set(s)
print(f"Original: {s}")
print(f"Result: {result}")

Original: bbabcaaccdbaabababc
Result: bacd
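
A few edge cases are worth checking, since the comparison is by exact character: upper and lower case letters count as different characters, and whitespace is treated like any other character. The sketch below repeats the set-based function so that it runs on its own; the sample inputs are illustrative only:

```python
def remove_duplicates_set(s):
    seen = set()
    result = []
    for c in s:
        if c not in seen:
            seen.add(c)
            result.append(c)
    return ''.join(result)

print(repr(remove_duplicates_set("")))         # ''
print(repr(remove_duplicates_set("AaBbAa")))   # 'AaBb'
print(repr(remove_duplicates_set("a b a b")))  # 'a b'
```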

Comparison

Method	Memory Usage	Python Version	Best For
OrderedDict	Higher	2.7+ / 3.1+	Explicit ordering guarantee
Regular Dict	Medium	3.7+	Simple and clean code
Set + List	Comparable to dict (two containers)	All versions	No reliance on dict ordering
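
The memory column is a rough guide at best, and the actual figures depend on the interpreter version and the number of distinct characters. One way to check on your own setup is sketched below, assuming CPython; container_sizes is a hypothetical helper, and sys.getsizeof reports only the shallow size of each container, not the character objects it refers to:

```python
import sys
from collections import OrderedDict

def container_sizes(s):
    # Hypothetical helper: build the bookkeeping structures each method
    # ends up with and report their shallow sizes in bytes.
    ordered = OrderedDict.fromkeys(s)
    plain = dict.fromkeys(s)
    seen = set(s)
    result = list(plain)  # the list of first occurrences kept by Method 3
    return {
        "OrderedDict": sys.getsizeof(ordered),
        "dict": sys.getsizeof(plain),
        "set + list": sys.getsizeof(seen) + sys.getsizeof(result),
    }

for name, size in container_sizes("bbabcaaccdbaabababc").items():
    print(f"{name}: {size} bytes")
```

No expected output is shown because the numbers differ between Python versions and platforms.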

Conclusion

Use a regular dictionary on Python 3.7+, where insertion order is guaranteed, and OrderedDict on older versions. The set-based approach does not depend on dictionary ordering, so it works on any version; measure before assuming it saves memory. All three methods preserve the order of first occurrence.

Arnab Chakraborty

Updated on: 2026-03-26T15:46:06+05:30
