How do we find the exact position of each match in a Python regular expression?

Regular expressions in Python allow us to search for patterns in text. When working with matches, it's often useful to know not just what matched, but exactly where each match occurred. Python's re module provides several methods to find the precise positions of matches.

Key Methods for Finding Match Positions

The following methods help locate match positions:

  • re.finditer(): Returns an iterator of match objects, one for each non-overlapping match

  • match.start(): Returns the index where the match begins

  • match.end(): Returns the index just past the last matched character

  • match.span(): Returns the start and end positions as a (start, end) tuple

  • match.group(): Returns the matched text
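
As a minimal sketch of how these methods relate on a single match, the example below uses re.search() on an illustrative sample string (both are assumptions made for the demonstration). It shows that the start and end offsets can be used to slice the matched text back out of the original string:

```python
import re

text = 'Order 66 shipped'  # illustrative sample string
match = re.search(r'\d+', text)

# search() returns None when nothing matches, so guard before using the match
if match is not None:
    start, end = match.start(), match.end()
    print(start, end)        # 6 8
    print(match.span())      # (6, 8)
    print(match.group())     # 66
    # end is exclusive, so slicing with the offsets reproduces the match
    print(text[start:end] == match.group())   # True
```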

Finding All Match Positions

Use finditer() to get the position of every match in a string:

import re

# Compile pattern for uppercase letters and digits
pattern = re.compile(r'[A-Z0-9]')

# Find all matches and their positions
text = 'A5B6C7D8'
for match in pattern.finditer(text):
    print(f"Position {match.start()}: '{match.group()}'")

Output:

Position 0: 'A'
Position 1: '5'
Position 2: 'B'
Position 3: '6'
Position 4: 'C'
Position 5: '7'
Position 6: 'D'
Position 7: '8'

Using span() for Start and End Positions

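The span() method returns the start and end positions together as a tuple. The end position is exclusive: it is the index just past the last matched character, so a word reported at 0-5 occupies indices 0 through 4: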

import re

# Find word positions
pattern = re.compile(r'\w+')
text = 'Hello world Python'

for match in pattern.finditer(text):
    start, end = match.span()
    word = match.group()
    print(f"'{word}' found at positions {start}-{end}")

Output:

'Hello' found at positions 0-5
'world' found at positions 6-11
'Python' found at positions 12-18
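
Because the end position is exclusive, a span maps directly onto a Python slice. The sketch below collects the spans into a list and checks each one against the text. The helper name word_spans is hypothetical and introduced only for this illustration:

```python
import re

def word_spans(text):
    """Return a (word, start, end) tuple for each run of word characters."""
    return [(m.group(), *m.span()) for m in re.finditer(r'\w+', text)]

text = 'Hello world Python'
spans = word_spans(text)
print(spans)
# [('Hello', 0, 5), ('world', 6, 11), ('Python', 12, 18)]

# Each (start, end) pair slices the original word back out of the text
for word, start, end in spans:
    assert text[start:end] == word
```

Storing the offsets alongside the text this way can be handy when later steps need to highlight or replace the matched regions.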

Finding Positions with Groups

When a pattern contains groups, pass the group number to start(), end(), or span() to get the positions of that specific group:

import re

# Pattern with groups for phone number
phone_pattern = re.compile(r'(\d{3})-(\d{3}-\d{4})')
text = 'Call me at 415-555-4242 or 888-123-9876'

for match in phone_pattern.finditer(text):
    print(f"Full match: {match.group()} at position {match.start()}-{match.end()}")
    print(f"Area code: {match.group(1)} at position {match.start(1)}-{match.end(1)}")
    print(f"Number: {match.group(2)} at position {match.start(2)}-{match.end(2)}")
    print("---")

Output:

Full match: 415-555-4242 at position 11-23
Area code: 415 at position 11-14
Number: 555-4242 at position 15-23
---
Full match: 888-123-9876 at position 27-39
Area code: 888 at position 27-30
Number: 123-9876 at position 31-39
---
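
The same methods also accept group names, and they report -1 for a group that did not take part in the match. The sketch below illustrates both points with a hypothetical pattern in which the area code is optional. The pattern, the group names area and number, and the sample strings are assumptions made for this example:

```python
import re

# Hypothetical pattern: an optional "(NNN) " area code, then the local number
pattern = re.compile(r'(?:\((?P<area>\d{3})\) )?(?P<number>\d{3}-\d{4})')

for text in ('(415) 555-4242', '555-4242'):
    match = pattern.search(text)
    area_span = match.span('area')
    # A group that did not participate in the match has span (-1, -1)
    if area_span == (-1, -1):
        print(f"{text!r}: no area code, number at {match.span('number')}")
    else:
        print(f"{text!r}: area code at {area_span}, number at {match.span('number')}")
```

Checking for -1 before slicing avoids silently using a negative index on the original string.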

Comparison of Methods

Method     | Returns                   | Best For
-----------|---------------------------|--------------------------------
finditer() | Iterator of match objects | Multiple matches with positions
search()   | First match object only   | Single match with position
findall()  | List of matched strings   | All matches without positions
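
To make the comparison concrete, here is a short sketch that runs all three methods with the same pattern on the same text. The sample string is illustrative only:

```python
import re

text = 'a1 b22 c333'  # illustrative sample string
pattern = re.compile(r'\d+')

# finditer(): every match, with positions
all_spans = [(m.group(), m.span()) for m in pattern.finditer(text)]
print(all_spans)   # [('1', (1, 2)), ('22', (4, 6)), ('333', (8, 11))]

# search(): only the first match, with its position
first = pattern.search(text)
print(first.group(), first.span())   # 1 (1, 2)

# findall(): matched strings only, no position information
print(pattern.findall(text))   # ['1', '22', '333']
```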

Conclusion

Use finditer() together with start(), end(), and span() to find the exact positions of regex matches. Because finditer() yields match objects one at a time instead of building a full list, this approach is memory-efficient for large texts, and each match object provides complete information, including positions and group contents.

---
Updated on: 2026-03-26T21:49:14+05:30
