Pattern matching in Python with Regex

Regular expressions (regex) are a powerful tool for pattern matching and string manipulation in Python. The re module provides comprehensive regex functionality for finding, matching, and replacing text patterns.

What is a Regular Expression?

A regular expression is a sequence of characters that defines a search pattern. In Python, the re module handles string parsing and pattern matching. Regular expressions can answer questions like:

  • Is this string a valid URL?

  • Which users in /etc/passwd are in a given group?

  • What is the date and time of all warning messages in a log file?

  • What username and document were requested by the URL a visitor typed?

A typical regular expression search follows this pattern:

import re

match = re.search(pattern, string)
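
As a minimal sketch of the log-file question above, assume a hypothetical log format of "YYYY-MM-DD HH:MM:SS LEVEL message" (the sample lines are invented for illustration). Under that assumption, re.search() can pull the date and time out of each warning line:

```python
import re

# Hypothetical log lines; the format is an assumption made for this sketch.
log_lines = [
    "2024-01-15 10:32:07 INFO Service started",
    "2024-01-15 10:45:12 WARNING Disk usage at 91%",
    "2024-01-15 11:02:55 ERROR Connection lost",
    "2024-01-16 08:00:01 WARNING Certificate expires soon",
]

# Group 1 captures the date, group 2 the time, but only on WARNING lines.
pattern = r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) WARNING\b"

warnings = []
for line in log_lines:
    match = re.search(pattern, line)
    if match:  # re.search() returns None when nothing matches
        warnings.append((match.group(1), match.group(2)))

print(warnings)
```

The result of re.search() is tested before use because it is None for lines that do not match.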

Basic Pattern Matching

Let's start with a simple example using literal characters. Here re.match() checks for the pattern at the beginning of the string:

import re

search_string = "TutorialsPoint"
pattern = "Tutorials"
match = re.match(pattern, search_string)

if match:
    print("regex matches:", match.group())
else:
    print('pattern not found')

Output

regex matches: Tutorials
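
Because re.match() only tries the pattern at position 0, the same literal can fail with re.match() yet succeed with re.search(). A short sketch of the difference, using the same string (the variable names at_start and anywhere are illustrative):

```python
import re

search_string = "TutorialsPoint"

# "Point" does not occur at position 0, so re.match() returns None.
at_start = re.match("Point", search_string)

# re.search() scans the whole string and finds "Point" at index 9.
anywhere = re.search("Point", search_string)

print(at_start)
print(anywhere.group(), anywhere.start())
```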

Using re.search() for Pattern Matching

The re.search() function finds the first occurrence of a pattern anywhere in the string. It returns a match object, or None if the pattern is not found.

Syntax

matchObject = re.search(pattern, input_string, flags=0)

Example with Groups

import re

# Regular expression to match a date string
regex = r"([a-zA-Z]+) (\d+)"
text = "Jan 2"

# Search once and reuse the match object
match = re.search(regex, text)
if match:
    
    # Match position
    print("Match at index %s, %s" % (match.start(), match.end()))
    
    # Full match and groups
    print("Full match: %s" % (match.group(0)))
    print("Month: %s" % (match.group(1)))
    print("Day: %s" % (match.group(2)))
else:
    print("Pattern not Found!")

Output

Match at index 0, 5
Full match: Jan 2
Month: Jan
Day: 2
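
The match object also exposes all groups at once through groups() and the matched range through span(). The sketch below additionally uses named groups, written (?P<name>...). These are a standard re feature that the example above does not use. The group names month and day and the sample text are chosen here for illustration:

```python
import re

# Same date pattern, with the two groups given names (illustrative choice).
regex = r"(?P<month>[a-zA-Z]+) (?P<day>\d+)"
text = "Released on Jan 2"

match = re.search(regex, text)
if match:
    print(match.groups())        # all captured groups as a tuple
    print(match.span())          # (start, end) of the full match
    print(match.group("month"))  # access a group by name
    print(match.groupdict())     # mapping of group name to captured text
```

Named groups can make longer patterns easier to read, since the code no longer depends on counting parentheses.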

Capturing Groups with findall()

When a pattern contains two or more capturing groups (parentheses), findall() returns a list of tuples, one tuple per match, holding the captured groups:

import re

regex = r'([\w\.-]+)@([\w\.-]+)'
text = 'hello john@hotmail.com, hello@Tutorialspoint.com, hello python@gmail.com'
matches = re.findall(regex, text)

print("All matches:", matches)

for username, host in matches:
    print("Username:", username)
    print("Host:", host)
    print("---")

Output

All matches: [('john', 'hotmail.com'), ('hello', 'Tutorialspoint.com'), ('python', 'gmail.com')]
Username: john
Host: hotmail.com
---
Username: hello
Host: Tutorialspoint.com
---
Username: python
Host: gmail.com
---
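
What findall() returns depends on how many capturing groups the pattern has. As a small sketch on an invented sample string: with no groups it returns the full matches, with one group it returns that group's strings, and with two or more it returns tuples. When match positions are also needed, re.finditer() yields match objects instead:

```python
import re

# Invented sample text for this sketch.
text = "alice@example.com, bob@example.org"

# No groups: list of full matches
full = re.findall(r"[\w.-]+@[\w.-]+", text)
print(full)

# One group: list of strings for that group only
users = re.findall(r"([\w.-]+)@[\w.-]+", text)
print(users)

# Two groups: list of (username, host) tuples
pairs = re.findall(r"([\w.-]+)@([\w.-]+)", text)
print(pairs)

# finditer() yields match objects, so start positions are available
starts = [m.start() for m in re.finditer(r"[\w.-]+@[\w.-]+", text)]
print(starts)
```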

Finding and Replacing with re.sub()

Use re.sub() to find patterns and replace them with new text:

import re

text = 'hello john@hotmail.com, hello@Tutorialspoint.com, hello python@gmail.com, Hello World!'
pattern = r'([\w\.-]+)@([\w\.-]+)'
replacement = r'\1@XYZ.com'  # \1 refers to first group (username)

result = re.sub(pattern, replacement, text)
print(result)

Output

hello john@XYZ.com, hello@XYZ.com, hello python@XYZ.com, Hello World!
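
The replacement passed to re.sub() can also be a function that receives each match object and returns the replacement string. The optional count argument limits how many replacements are made. A minimal sketch of both; the helper name mask_username and the shortened sample text are hypothetical:

```python
import re

text = "hello john@hotmail.com, hello python@gmail.com"
pattern = r"([\w.-]+)@([\w.-]+)"

def mask_username(match):
    """Keep the first character of the username and mask the rest."""
    username, host = match.group(1), match.group(2)
    return username[0] + "*" * (len(username) - 1) + "@" + host

# A callable replacement is invoked once per match.
masked = re.sub(pattern, mask_username, text)
print(masked)

# count=1 replaces only the first occurrence.
first_only = re.sub(pattern, r"\1@XYZ.com", text, count=1)
print(first_only)
```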

Regular Expression Flags

Flags modify how patterns are matched. Common flags include:

  • re.IGNORECASE: Makes the pattern case-insensitive, so 'a' matches both 'a' and 'A'

  • re.DOTALL: Allows the dot (.) to match newline characters (\n)

  • re.MULTILINE: Enables ^ and $ to match at the start/end of each line, not just the whole string

Example with Flags

import re

text = "Python PROGRAMMING"
pattern = r"python"

# Without flag
match1 = re.search(pattern, text)
print("Without IGNORECASE:", match1)

# With IGNORECASE flag
match2 = re.search(pattern, text, re.IGNORECASE)
print("With IGNORECASE:", match2.group() if match2 else None)

Output

Without IGNORECASE: None
With IGNORECASE: Python
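
The re.MULTILINE and re.DOTALL flags can be sketched the same way. The sample text is invented for illustration, and the last call shows that flags can be combined with the | operator:

```python
import re

text = "first line\nsecond line\nthird line"

# Without MULTILINE, ^ matches only at the very start of the string.
start_only = re.findall(r"^\w+", text)
# With MULTILINE, ^ also matches right after each newline.
each_line = re.findall(r"^\w+", text, re.MULTILINE)
print(start_only, each_line)

# Without DOTALL, "." does not match a newline, so this search fails.
no_dotall = re.search(r"first.*second", text)
# With DOTALL, "." also matches "\n".
with_dotall = re.search(r"first.*second", text, re.DOTALL)
print(no_dotall, repr(with_dotall.group()))

# Flags are combined with the bitwise OR operator.
combined = re.findall(r"^\w+ LINE$", text, re.IGNORECASE | re.MULTILINE)
print(combined)
```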

Conclusion

Regular expressions in Python provide powerful pattern matching capabilities through the re module. Use re.match() to test a pattern at the start of a string, re.search() to find the first match anywhere in it, re.findall() to extract all matches, and re.sub() for replacements. Flags such as re.IGNORECASE, re.DOTALL, and re.MULTILINE adjust how patterns are matched.

Updated on: 2026-03-25T05:20:37+05:30
