How to write a Python regular expression to match multiple words anywhere?
Regular expressions can match any of several specific words wherever they appear in a string. The approaches below all use Python's re module and differ mainly in how the pattern is written.
Using Word Boundaries with OR Operator
The most common approach uses word boundaries (\b) combined with the OR operator (|) to match complete words:
import re

s = "These are roses and lilies and orchids, but not marigolds or daisies"

# Each alternative is wrapped in \b so only whole words match; re.I ignores case
r = re.compile(r'\broses\b|\bmarigolds\b|\borchids\b', flags=re.I)
print(r.findall(s))
['roses', 'orchids', 'marigolds']
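To see what the word boundaries contribute, the short sketch below runs the same sentence against two patterns that differ only in their use of \b. The search words "rose" and "orchid" are chosen here purely for illustration and are not part of the example above.

```python
import re

s = "These are roses and lilies and orchids, but not marigolds or daisies"

# Without \b, "rose" and "orchid" match inside the longer words
loose = re.findall(r'rose|orchid', s, flags=re.I)

# With \b on both sides, only standalone words match, so nothing is found
strict = re.findall(r'\brose\b|\borchid\b', s, flags=re.I)

print(loose)   # ['rose', 'orchid']
print(strict)  # []
```

Results come back in the order the words appear in the string, not in the order they are listed in the pattern.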
Using Alternation Groups
You can simplify the pattern by grouping the alternatives within parentheses, so the word boundaries are written only once:
import re
s = "These are roses and lilies and orchids, but not marigolds or daisies"
words = ['roses', 'marigolds', 'orchids', 'tulips']
pattern = r'\b(' + '|'.join(words) + r')\b'
matches = re.findall(pattern, s, re.IGNORECASE)
print(matches)
['roses', 'orchids', 'marigolds']
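One detail worth knowing when grouping: re.findall() returns the contents of capturing groups rather than the whole match whenever the pattern contains any. The sketch below, which uses a made-up sentence and word lists, shows how the result changes with one group, two groups, and non-capturing groups.

```python
import re

text = "red roses and white orchids"

# One capturing group: findall returns the text captured by that group
one_group = re.findall(r'\b(roses|orchids)\b', text)

# Two capturing groups: findall returns a tuple per match
two_groups = re.findall(r'\b(red|white) (roses|orchids)\b', text)

# Non-capturing groups (?:...): findall returns the whole match
no_groups = re.findall(r'\b(?:red|white) (?:roses|orchids)\b', text)

print(one_group)   # ['roses', 'orchids']
print(two_groups)  # [('red', 'roses'), ('white', 'orchids')]
print(no_groups)   # ['red roses', 'white orchids']
```

If the parentheses are only there to scope the alternation, a non-capturing group keeps the findall() result predictable when the pattern grows.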
Dynamic Pattern Building
For larger word lists, wrap the pattern construction in a function and escape each word with re.escape() before joining:
import re
def find_multiple_words(text, word_list):
    # Escape each word, join with |, and anchor the group with word boundaries
    pattern = r'\b(?:' + '|'.join(re.escape(word) for word in word_list) + r')\b'
    return re.findall(pattern, text, re.IGNORECASE)
text = "I love roses, orchids, and marigolds but not dandelions"
flowers = ['roses', 'orchids', 'marigolds', 'tulips', 'daisies']
result = find_multiple_words(text, flowers)
print(result)
['roses', 'orchids', 'marigolds']
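When the same word list is applied to many strings, it may be worth compiling the pattern once and reusing it. The following is a minimal sketch under that assumption; compile_word_pattern is a hypothetical helper name, and re.finditer() is used so that each match's position is available as well.

```python
import re

def compile_word_pattern(word_list):
    # Hypothetical helper: escape each word, join with |, anchor with \b
    alternation = '|'.join(re.escape(word) for word in word_list)
    return re.compile(r'\b(?:' + alternation + r')\b', re.IGNORECASE)

pattern = compile_word_pattern(['roses', 'orchids', 'tulips'])

text = "I love roses, orchids, and marigolds"

# finditer yields match objects, which carry the matched text and its offset
hits = [(m.group(0), m.start()) for m in pattern.finditer(text)]
print(hits)  # [('roses', 7), ('orchids', 14)]
```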
Key Components
| Component | Purpose | Effect |
|---|---|---|
| \b | Word boundary | Matches complete words only |
| \| | OR operator | Alternation between patterns |
| re.I | Case insensitive | Matches regardless of case |
| re.escape() | Escape special chars | Treats special regex chars literally |
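The effect of re.escape() is easiest to see with a search term that contains a regex metacharacter. The sketch below uses an invented sentence and the term "3.5", where the unescaped dot would otherwise match any character.

```python
import re

text = "Version 3.5 replaced 345 last year"
word = "3.5"

# Unescaped: "." matches any character, so "345" is also found
unescaped = re.findall(r'\b' + word + r'\b', text)

# Escaped: "." is treated as a literal dot
escaped = re.findall(r'\b' + re.escape(word) + r'\b', text)

print(unescaped)  # ['3.5', '345']
print(escaped)    # ['3.5']
```

Note that \b only marks a transition between a word character and a non-word character, so a term that starts or ends with punctuation (for example "C++" followed by a space) will not match with a trailing \b even after escaping.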
Conclusion
Use word boundaries with the OR operator for simple cases, or build dynamic patterns for flexible word matching. The re.escape() function ensures special characters in your word list are treated literally.
