Regular Expression Examples in Python

Regular expressions (regex) are powerful tools for pattern matching and text processing in Python. The re module provides functions to work with regular expressions, allowing you to search, match, and manipulate text based on specific patterns.
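
As a quick orientation, the sketch below contrasts three commonly used re functions on one string. The sample text and variable names are made up for illustration; only re.search(), re.match(), and re.sub() come from the standard library.

```python
import re

# Hypothetical sample text, chosen only for illustration
text = "Order 66 shipped on 2024-01-15"

# search() scans the whole string and returns the first match
first_number = re.search(r"\d+", text)

# match() succeeds only if the pattern matches at the very start
at_start = re.match(r"\d+", text)

# sub() replaces every match with the replacement string
masked = re.sub(r"\d", "#", text)

print(first_number.group())  # 66
print(at_start)              # None
print(masked)                # Order ## shipped on ####-##-##
```

Because the text starts with a letter, match() returns None while search() still finds the first run of digits.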

Literal Characters

Literal characters in a regex match themselves exactly. The following example searches for a literal word:

import re

text = "python is awesome"
pattern = "python"

match = re.search(pattern, text)
if match:
    print(f"Found: '{match.group()}'")
else:
    print("Not found")

Found: 'python'
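
Literal matching gets trickier when the text you want contains regex metacharacters such as . or $. One possible approach, sketched here with a made-up price string, is to pass the literal text through re.escape() before searching.

```python
import re

# Hypothetical sample text containing metacharacters
text = "Total: $5.00 (approx.)"

# An unescaped dot is a metacharacter that matches any character
loose = re.search(r"5.00", "5x00")

# re.escape() backslash-escapes metacharacters so they match literally
literal = re.escape("$5.00")
strict = re.search(literal, text)

print(bool(loose))     # True
print(strict.group())  # $5.00
```

The unescaped pattern accepts "5x00", whereas the escaped one matches only the exact characters "$5.00".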

Character Classes

Character classes allow you to match any single character from a specified set. Use square brackets to define a character class:

import re

# Match Python or python
text1 = "Python programming"
text2 = "python programming"
pattern = "[Pp]ython"

for text in [text1, text2]:
    if re.search(pattern, text):
        print(f"Match found in: {text}")

Match found in: Python programming
Match found in: python programming

Common Character Classes

Pattern	Description	Example
`[aeiou]`	Match any lowercase vowel	Matches 'a', 'e', 'i', 'o', 'u'
`[0-9]`	Match any digit	Same as [0123456789]
`[a-z]`	Match any lowercase letter	Matches 'a' through 'z'
`[^aeiou]`	Match any character except a lowercase vowel	Negated character class; also matches digits, spaces, and uppercase letters
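
To see the classes from the table in action, here is a small sketch that runs each one through re.findall(). The sample strings are assumptions chosen to make the results easy to check by eye.

```python
import re

# Hypothetical sample text
text = "Regex 101: easy as abc"

vowels = re.findall(r"[aeiou]", text)            # each lowercase vowel
digits = re.findall(r"[0-9]", text)              # each digit
lowercase = re.findall(r"[a-z]+", text)          # runs of lowercase letters
non_vowels = re.findall(r"[^aeiou]", "audio 9")  # everything but lowercase vowels

print(vowels)      # ['e', 'e', 'e', 'a', 'a', 'a']
print(digits)      # ['1', '0', '1']
print(lowercase)   # ['egex', 'easy', 'as', 'abc']
print(non_vowels)  # ['d', ' ', '9']
```

Note that the negated class matches the space and the digit as well as the consonant, since it excludes only the five listed vowels.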

Special Character Classes

Python regex provides shorthand notations for common character classes:

import re

text = "Phone: 123-456-7890"

# \d matches digits
digits = re.findall(r'\d', text)
print(f"Digits found: {digits}")

# \w matches word characters
words = re.findall(r'\w+', text)
print(f"Words found: {words}")

# \s matches whitespace
spaces = re.findall(r'\s', text)
print(f"Spaces found: {len(spaces)} space(s)")

Digits found: ['1', '2', '3', '4', '5', '6', '7', '8', '9', '0']
Words found: ['Phone', '123', '456', '7890']
Spaces found: 1 space(s)
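
Each shorthand also has an uppercase counterpart that matches the opposite set: \D (non-digits), \W (non-word characters), and \S (non-whitespace). A minimal sketch on the same phone string:

```python
import re

text = "Phone: 123-456-7890"

non_digits = re.findall(r"\D+", text)  # runs of non-digit characters
non_word = re.findall(r"\W", text)     # single non-word characters
chunks = re.findall(r"\S+", text)      # runs of non-whitespace characters

print(non_digits)  # ['Phone: ', '-', '-']
print(non_word)    # [':', ' ', '-', '-']
print(chunks)      # ['Phone:', '123-456-7890']
```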

Quantifiers and Repetition

Quantifiers specify how many times the preceding character or group should be matched. The example uses re.fullmatch(), so the pattern must match the entire string:

import re

texts = ["rub", "ruby", "rubyyy", "123", "12345"]

patterns = {
    r'ruby?': "Match 'rub' or 'ruby' (y is optional)",
    r'ruby*': "Match 'rub' plus 0 or more y's",
    r'ruby+': "Match 'rub' plus 1 or more y's",
    r'\d{3}': "Match exactly 3 digits",
    r'\d{3,5}': "Match 3 to 5 digits"
}

for pattern, description in patterns.items():
    print(f"\nPattern: {pattern} - {description}")
    for text in texts:
        if re.fullmatch(pattern, text):
            print(f"  ✓ '{text}' matches")
        else:
            print(f"  ✗ '{text}' doesn't match")

Pattern: ruby? - Match 'rub' or 'ruby' (y is optional)
  ✓ 'rub' matches
  ✓ 'ruby' matches
  ✗ 'rubyyy' doesn't match
  ✗ '123' doesn't match
  ✗ '12345' doesn't match

Pattern: ruby* - Match 'rub' plus 0 or more y's
  ✓ 'rub' matches
  ✓ 'ruby' matches
  ✓ 'rubyyy' matches
  ✗ '123' doesn't match
  ✗ '12345' doesn't match

Pattern: ruby+ - Match 'rub' plus 1 or more y's
  ✗ 'rub' doesn't match
  ✓ 'ruby' matches
  ✓ 'rubyyy' matches
  ✗ '123' doesn't match
  ✗ '12345' doesn't match

Pattern: \d{3} - Match exactly 3 digits
  ✗ 'rub' doesn't match
  ✗ 'ruby' doesn't match
  ✗ 'rubyyy' doesn't match
  ✓ '123' matches
  ✗ '12345' doesn't match

Pattern: \d{3,5} - Match 3 to 5 digits
  ✗ 'rub' doesn't match
  ✗ 'ruby' doesn't match
  ✗ 'rubyyy' doesn't match
  ✓ '123' matches
  ✓ '12345' matches
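
The results above depend on re.fullmatch(), which requires the whole string to match. As a hedged illustration of the difference, the sketch below applies the same digit patterns with re.search(), which accepts a match anywhere in the string.

```python
import re

text = "12345"

# search() is satisfied by any 3-digit run inside the string
partial = re.search(r"\d{3}", text)

# fullmatch() requires the entire string to be exactly 3 digits
whole = re.fullmatch(r"\d{3}", text)

# {3,5} is greedy, so it takes as many digits as it can (up to 5)
greedy_range = re.search(r"\d{3,5}", text)

print(partial.group())       # 123
print(whole)                 # None
print(greedy_range.group())  # 12345
```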

Greedy vs Non-greedy Matching

By default, quantifiers are greedy and match as much text as possible. Append ? to a quantifier (for example, *? or +?) to make it non-greedy, so that it matches as little as possible:

import re

text = "<python>perl>"

# Greedy matching
greedy = re.search(r'<.*>', text)
print(f"Greedy match: {greedy.group()}")

# Non-greedy matching
non_greedy = re.search(r'<.*?>', text)
print(f"Non-greedy match: {non_greedy.group()}")

Greedy match: <python>perl>
Non-greedy match: <python>
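
The difference becomes more visible when a string contains several delimited items. The sketch below uses a made-up HTML-like string and also shows a negated character class, which is a common alternative to a non-greedy quantifier for this kind of pattern.

```python
import re

# Hypothetical sample text with several tags
text = "<b>bold</b> and <i>italic</i>"

greedy = re.findall(r"<.*>", text)      # one match from first '<' to last '>'
lazy = re.findall(r"<.*?>", text)       # stops at the nearest '>'
negated = re.findall(r"<[^>]*>", text)  # same result without a lazy quantifier

print(greedy)   # ['<b>bold</b> and <i>italic</i>']
print(lazy)     # ['<b>', '</b>', '<i>', '</i>']
print(negated)  # ['<b>', '</b>', '<i>', '</i>']
```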

Grouping and Alternatives

Use parentheses to group patterns and the pipe symbol (|) to separate alternatives:

import re

texts = ["python", "perl", "ruby", "ruble"]

# Alternative matching
pattern = r'python|perl'
print("Matching 'python' or 'perl':")
for text in texts:
    if re.search(pattern, text):
        print(f"  ✓ '{text}' matches")

# Grouping example
pattern2 = r'rub(y|le)'
print("\nMatching 'ruby' or 'ruble':")
for text in texts:
    if re.fullmatch(pattern2, text):
        print(f"  ✓ '{text}' matches")

Matching 'python' or 'perl':
  ✓ 'python' matches
  ✓ 'perl' matches

Matching 'ruby' or 'ruble':
  ✓ 'ruby' matches
  ✓ 'ruble' matches
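
Parentheses also capture the text they match, which affects what group() and findall() return. A minimal sketch, reusing the rub(y|le) pattern and adding a non-capturing (?:...) variant for comparison:

```python
import re

m = re.fullmatch(r"rub(y|le)", "ruble")
whole = m.group(0)   # the entire match
suffix = m.group(1)  # text captured by the first group

# With one capturing group, findall() returns only the captured text
captured = re.findall(r"rub(y|le)", "ruby ruble")

# A non-capturing group (?:...) groups without capturing
full_words = re.findall(r"rub(?:y|le)", "ruby ruble")

print(whole, suffix)  # ruble le
print(captured)       # ['y', 'le']
print(full_words)     # ['ruby', 'ruble']
```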

Anchors and Boundaries

Anchors specify where in the text the pattern must match. They match positions, not characters:

import re

texts = ["Python is great", "I love Python", "Python"]

# Start of string
start_pattern = r'^Python'
print("Matches starting with 'Python':")
for text in texts:
    if re.search(start_pattern, text):
        print(f"  ✓ '{text}'")

# End of string
end_pattern = r'Python$'
print("\nMatches ending with 'Python':")
for text in texts:
    if re.search(end_pattern, text):
        print(f"  ✓ '{text}'")

# Word boundary
boundary_text = "Python programming in python"
word_pattern = r'\bpython\b'
matches = re.findall(word_pattern, boundary_text, re.IGNORECASE)
print(f"\nWord boundary matches in '{boundary_text}': {matches}")

Matches starting with 'Python':
  ✓ 'Python is great'
  ✓ 'Python'

Matches ending with 'Python':
  ✓ 'I love Python'
  ✓ 'Python'

Word boundary matches in 'Python programming in python': ['Python', 'python']
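
By default, ^ and $ anchor to the start and end of the whole string. With the re.MULTILINE flag they also match at the start and end of each line. The sketch below joins the sample sentences into one multi-line string (an assumption made for illustration) and shows how \b rejects partial words.

```python
import re

# The three sample sentences joined into one multi-line string
text = "Python is great\nI love Python\nPython"

default_starts = re.findall(r"^Python", text)                  # string start only
multiline_starts = re.findall(r"^Python", text, re.MULTILINE)  # every line start
multiline_ends = re.findall(r"Python$", text, re.MULTILINE)    # every line end

# \b prevents matches inside longer words such as 'pythonic' or 'python3'
whole_words = re.findall(r"\bpython\b", "pythonic python3 python")

print(len(default_starts))    # 1
print(len(multiline_starts))  # 2
print(len(multiline_ends))    # 2
print(whole_words)            # ['python']
```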

Conclusion

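Regular expressions provide powerful pattern matching in Python. Master character classes, quantifiers, grouping, and anchors to search and manipulate text efficiently, and pick the re function that fits the job: search() to find the first match anywhere in the string, findall() to collect every match, match() to match only at the start of the string, and fullmatch() to require that the entire string matches.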

---

Mohd Mohtashim

Updated on: 2026-03-25T07:49:35+05:30
