Program to remove duplicate characters from a given string in Python

When working with strings in Python, we often need to remove duplicate characters while preserving the order of first occurrence. This is useful in data cleaning, text processing, and algorithm problems.

We can solve this using an ordered dictionary to maintain the insertion order of characters. The dictionary tracks which characters we've seen, and we can join the keys to get our result string.

For example, if the input is s = "bbabcaaccdbaabababc", the output will be "bacd".

Algorithm

  • Create an ordered dictionary to store characters in insertion order
  • For each character c in the string:
    • If c is not present in dictionary, add it with initial count 0
    • Increment the count for character c
  • Join the dictionary keys in order to form the result string

Method 1: Using OrderedDict

The OrderedDict class from the collections module maintains insertion order:

from collections import OrderedDict

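def remove_duplicates(s):
    d = OrderedDict()  # maps each character to its occurrence count
    for c in s:
        if c not in d:
            d[c] = 0  # first occurrence fixes the character's position
        d[c] += 1

    # Only the keys, in insertion order, are needed for the result
    return ''.join(d.keys())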

s = "bbabcaaccdbaabababc"
result = remove_duplicates(s)
print(f"Original: {s}")
print(f"Result: {result}")

Output:

Original: bbabcaaccdbaabababc
Result: bacd
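
Method 1 keeps a count for every character even though only the keys are used in the result. As a minimal sketch, assuming the counts are of interest too, the same loop can return both. The helper name remove_duplicates_with_counts is hypothetical and not part of the original program.

```python
from collections import OrderedDict

def remove_duplicates_with_counts(s):
    # Hypothetical helper: same idea as Method 1, but the counts are returned too.
    counts = OrderedDict()
    for c in s:
        counts[c] = counts.get(c, 0) + 1
    # Joining an OrderedDict iterates over its keys in insertion order.
    return ''.join(counts), dict(counts)

result, counts = remove_duplicates_with_counts("bbabcaaccdbaabababc")
print(result)   # bacd
print(counts)   # {'b': 7, 'a': 7, 'c': 4, 'd': 1}
```

If the counts are not needed, the simpler methods that follow avoid storing them.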

Method 2: Using Regular Dictionary (Python 3.7+)

Since Python 3.7, regular dictionaries maintain insertion order:

def remove_duplicates_dict(s):
    seen = {}
    for c in s:
        # Reassigning an existing key does not change its position
        seen[c] = True

    return ''.join(seen.keys())

s = "bbabcaaccdbaabababc"
result = remove_duplicates_dict(s)
print(f"Original: {s}")
print(f"Result: {result}")

Output:

Original: bbabcaaccdbaabababc
Result: bacd
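
The loop in Method 2 can likely be shortened with the built-in dict.fromkeys, which builds a dictionary whose keys are the items of an iterable. This is a sketch of a compact variant, not a replacement for the method above; the function name remove_duplicates_fromkeys is an assumption made for illustration.

```python
def remove_duplicates_fromkeys(s):
    # dict.fromkeys(s) creates one key per distinct character; on Python 3.7+
    # the keys stay in order of first occurrence. Values default to None.
    return ''.join(dict.fromkeys(s))

print(remove_duplicates_fromkeys("bbabcaaccdbaabababc"))  # bacd
```

It relies on the same insertion-order guarantee as Method 2, so it also assumes Python 3.7 or later.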

Method 3: Using Set for Tracking

A more memory-efficient approach uses a set to track seen characters:

def remove_duplicates_set(s):
    seen = set()
    result = []
    
    for c in s:
        if c not in seen:
            seen.add(c)
            result.append(c)
    
    return ''.join(result)

s = "bbabcaaccdbaabababc"
result = remove_duplicates_set(s)
print(f"Original: {s}")
print(f"Result: {result}")

Output:

Original: bbabcaaccdbaabababc
Result: bacd
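
It can help to check how the set-based approach treats a few edge cases. The sketch below repeats the function from Method 3 so that it runs on its own; the sample inputs are chosen for illustration and do not come from the original program.

```python
def remove_duplicates_set(s):
    seen = set()
    result = []
    for c in s:
        if c not in seen:
            seen.add(c)
            result.append(c)
    return ''.join(result)

# An empty string stays empty.
print(repr(remove_duplicates_set("")))         # ''
# Comparison is case-sensitive: 'A' and 'a' are different characters.
print(remove_duplicates_set("AaBbAa"))         # AaBb
# Spaces are characters too, so only the first space is kept.
print(repr(remove_duplicates_set("a b a b")))  # 'a b'
```

The dictionary-based methods treat these inputs the same way, since they all compare characters exactly.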

Comparison

Method        Memory Usage  Python Version  Best For
OrderedDict   Higher        All versions    Explicit ordering guarantee
Regular Dict  Medium        3.7+            Simple and clean code
Set + List    Lower         All versions    Memory efficiency
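
The memory column is a rough guide, and actual figures depend on the interpreter and the input. As a hedged sketch, sys.getsizeof can be used to inspect the containers each method builds. It reports only the shallow size of each container in bytes, the numbers are specific to the Python implementation in use, and a short sample string like this one says little about large inputs. The variable names are assumptions made for illustration.

```python
import sys
from collections import OrderedDict

s = "bbabcaaccdbaabababc"

ordered = OrderedDict.fromkeys(s)   # container built by Method 1
plain = dict.fromkeys(s)            # container built by Method 2
seen = set(s)                       # tracking set from Method 3
order = list(dict.fromkeys(s))      # result list from Method 3

# Shallow sizes only: the characters themselves are not counted.
sizes = {
    "OrderedDict": sys.getsizeof(ordered),
    "Regular Dict": sys.getsizeof(plain),
    "Set + List": sys.getsizeof(seen) + sys.getsizeof(order),
}

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

Measuring with input that resembles the real workload is the more reliable way to choose between the methods.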

Conclusion

Use regular dictionaries for Python 3.7+ projects, OrderedDict for older versions, and the set-based approach when memory efficiency is critical. All methods preserve the first occurrence order of characters.

Updated on: 2026-03-26T15:46:06+05:30
