How to Search and Replace text in Python?
Python provides several methods to search for and replace text patterns in strings. For simple literal patterns, use str.replace(). For complex patterns, use the re module with regular expressions.
Basic Text Searching
Let's start by creating a sample text and exploring basic search methods:
def sample():
    yield 'Is'
    yield 'USA'
    yield 'Colder'
    yield 'Than'
    yield 'Canada?'
text = ' '.join(sample())
print(f"Output \n {text}")
Output
 Is USA Colder Than Canada?
Using String Methods for Search
Python provides built-in string methods for basic text searching:
text = "Is USA Colder Than Canada?"
# Check if text starts with specific string
print(f"Starts with 'Is': {text.startswith('Is')}")
# Check if text ends with specific string
print(f"Ends with 'Canada?': {text.endswith('Canada?')}")
# Find position of substring
print(f"Position of 'USA': {text.find('USA')}")
# Check exact match
print(f"Exact match with 'USA': {text == 'USA'}")
Starts with 'Is': True
Ends with 'Canada?': True
Position of 'USA': 3
Exact match with 'USA': False
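The methods above report positions and prefix or suffix matches. When only a yes/no answer is needed, the `in` operator is a common alternative, and it is worth knowing that `find()` returns -1 instead of raising an error when the substring is absent. A minimal sketch reusing the sample sentence (the variable names and the 'Mexico' probe are illustrative, not from the original example):

```python
text = "Is USA Colder Than Canada?"

# Membership test: True if the substring occurs anywhere in the text
has_usa = 'USA' in text

# find() returns -1 (it does not raise) when the substring is missing
missing_pos = text.find('Mexico')

# count() reports the number of non-overlapping occurrences
a_count = text.count('a')

print(f"Contains 'USA': {has_usa}")            # Contains 'USA': True
print(f"Position of 'Mexico': {missing_pos}")  # Position of 'Mexico': -1
print(f"Occurrences of 'a': {a_count}")        # Occurrences of 'a': 4
```

Note that all of these checks are case-sensitive, so the capital 'A' in 'USA' is not counted.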
Simple Text Replacement
For simple literal text replacement, use the str.replace() method:
text = "Is USA Colder Than Canada?"
replaced_text = text.replace('USA', 'Australia')
print(f"Original: {text}")
print(f"Replaced: {replaced_text}")
Original: Is USA Colder Than Canada?
Replaced: Is Australia Colder Than Canada?
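By default str.replace() replaces every occurrence and returns a new string, leaving the original untouched. An optional third argument limits how many occurrences are replaced. A short sketch, assuming a made-up input string with repeated words:

```python
text = "USA, USA, USA"

# With no count, every occurrence is replaced
all_replaced = text.replace('USA', 'Canada')

# The third argument caps the number of replacements, working left to right
first_only = text.replace('USA', 'Canada', 1)

print(all_replaced)  # Canada, Canada, Canada
print(first_only)    # Canada, USA, USA
print(text)          # USA, USA, USA (strings are immutable)
```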
Advanced Pattern Matching with Regular Expressions
For complex patterns, use the re module. Here's an example matching date patterns:
import re
# Create a date string
date1 = '22/10/2020'
# Check if text starts with digits/digits/digits (the shape of dd/mm/yyyy)
if re.match(r'\d+/\d+/\d+', date1):
    print('Valid date format')
else:
    print('Invalid date format')
Valid date format
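One point the example above leaves implicit is that re.match() only anchors at the start of the string, so trailing text still passes. The sketch below contrasts it with re.fullmatch() and re.search(). The stricter pattern with fixed digit counts and the sample strings are assumptions made for illustration:

```python
import re

# Assumed stricter shape: two digits, two digits, four digits
date_pattern = r'\d{2}/\d{2}/\d{4}'

# re.match anchors only at the start, so trailing text is still accepted
loose = re.match(date_pattern, '22/10/2020 and more')

# re.fullmatch requires the entire string to fit the pattern
strict = re.fullmatch(date_pattern, '22/10/2020 and more')

# re.search scans the whole string for the first occurrence
found = re.search(date_pattern, 'Due on 22/10/2020')

print(bool(loose))    # True
print(bool(strict))   # False
print(found.group())  # 22/10/2020
```

None of these calls checks that the numbers form a real calendar date; they only test the textual shape.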
Advanced Text Replacement with re.sub()
Use re.sub() for pattern-based replacements. The example below converts dates from dd/mm/yyyy to yyyy-dd-mm format:
import re
sentence = 'Date is 22/11/2020. Tomorrow is 23/11/2020.'
# Use capture groups: (\d+) captures digits
# \1, \2, \3 refer to first, second, third capture groups
replaced_text = re.sub(r'(\d+)/(\d+)/(\d+)', r'\3-\1-\2', sentence)
print(f"Original: {sentence}")
print(f"Replaced: {replaced_text}")
Original: Date is 22/11/2020. Tomorrow is 23/11/2020.
Replaced: Date is 2020-22-11. Tomorrow is 2020-23-11.
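Numbered backreferences get hard to read as patterns grow. Two alternatives that re.sub() also supports are named groups, referenced with `\g<name>`, and a function that builds each replacement from the match object. The sketch below shows both. The group names and the `to_iso` helper are hypothetical, and the yyyy-mm-dd ordering in the second call is an illustrative choice rather than the format used in the example above:

```python
import re

sentence = 'Date is 22/11/2020. Tomorrow is 23/11/2020.'

# Named groups make the replacement template self-describing
named = r'(?P<day>\d+)/(?P<month>\d+)/(?P<year>\d+)'
by_name = re.sub(named, r'\g<year>-\g<day>-\g<month>', sentence)

def to_iso(match):
    """Hypothetical helper: build a zero-padded yyyy-mm-dd string."""
    day, month, year = match.groups()
    return f"{year}-{int(month):02d}-{int(day):02d}"

# A callable receives each match object and returns the replacement text
by_function = re.sub(r'(\d+)/(\d+)/(\d+)', to_iso, sentence)

print(by_name)      # Date is 2020-22-11. Tomorrow is 2020-23-11.
print(by_function)  # Date is 2020-11-22. Tomorrow is 2020-11-23.
```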
Compiling Patterns for Better Performance
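When the same pattern is used repeatedly, compile it once with re.compile() and reuse the resulting pattern object. The re module already caches recently used patterns, so the speedup is usually modest, but a compiled pattern avoids the per-call cache lookup: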
import re
sentence = 'Date is 22/11/2020. Tomorrow is 23/11/2020.'
pattern = re.compile(r'(\d+)/(\d+)/(\d+)')
replaced_pattern = pattern.sub(r'\3-\1-\2', sentence)
print(f"Compiled pattern result: {replaced_pattern}")
Compiled pattern result: Date is 2020-22-11. Tomorrow is 2020-23-11.
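Compiling pays off mainly when one pattern is applied to many strings. A minimal sketch of that usage, assuming a small made-up list of input lines:

```python
import re

# Compile once, outside the loop
date_pattern = re.compile(r'(\d+)/(\d+)/(\d+)')

# Illustrative input; in practice this might come from a file
lines = [
    'Start: 01/02/2021',
    'No date here',
    'End: 15/03/2021',
]

# Reuse the same compiled object for every line
converted = [date_pattern.sub(r'\3-\1-\2', line) for line in lines]

for line in converted:
    print(line)
# Start: 2021-01-02
# No date here
# End: 2021-15-03
```

Lines without a match pass through unchanged, so no separate check is needed before calling sub().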
Counting Substitutions with re.subn()
Use re.subn() to get both the replaced text and the number of substitutions made:
import re
sentence = 'Date is 22/11/2020. Tomorrow is 23/11/2020.'
pattern = re.compile(r'(\d+)/(\d+)/(\d+)')
output, count = pattern.subn(r'\3-\1-\2', sentence)
print(f"Replaced text: {output}")
print(f"Number of substitutions: {count}")
Replaced text: Date is 2020-22-11. Tomorrow is 2020-23-11.
Number of substitutions: 2
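The returned count is handy for detecting whether anything changed, and both sub() and subn() accept a `count` argument that caps the number of substitutions. A brief sketch under those assumptions (the second input string is made up for illustration):

```python
import re

sentence = 'Date is 22/11/2020. Tomorrow is 23/11/2020.'
pattern = re.compile(r'(\d+)/(\d+)/(\d+)')

# count=1 limits the substitution to the first match only
first_only, n_first = pattern.subn(r'\3-\1-\2', sentence, count=1)

# A returned count of zero signals that nothing matched
unchanged, n_none = pattern.subn(r'\3-\1-\2', 'No dates in this line.')

print(first_only)  # Date is 2020-22-11. Tomorrow is 23/11/2020.
print(n_first)     # 1
print(n_none)      # 0
```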
Comparison
| Method | Use Case | Performance |
|---|---|---|
| str.replace() | Simple literal replacement | Fast |
| re.sub() | Pattern-based replacement | Slower but flexible |
| re.compile().sub() | Repeated pattern operations | Better for multiple uses |
Conclusion
Use str.replace() for simple text replacements and re.sub() for complex pattern matching. Compile regex patterns when using them multiple times for better performance.
