How do we use Python Regular Expression named groups?


Named Groups

Most modern regular expression engines support numbered capturing groups and numbered backreferences. Long regular expressions with many groups and backreferences can be difficult to read and understand. Moreover, adding or removing a capturing group in the middle of the regex changes the numbers of all the groups that follow it, so every backreference and every group lookup that uses those numbers has to be updated.
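The renumbering problem can be seen in a small sketch. The date patterns and variable names here are hypothetical, chosen only for illustration: once a year group is added at the front, the month is no longer group 1.

```python
import re

# Two numbered groups: the month is group 1, the day is group 2.
before = re.compile(r"(\d{2})-(\d{2})")

# Adding a year group at the front shifts the month to group 2
# and the day to group 3.
after = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

m1 = before.search("06-13")
m2 = after.search("2020-06-13")

month_before = m1.group(1)  # the month in the original pattern
month_after = m2.group(2)   # the same field now has a different number
print(month_before, month_after)
```

Any code that still calls `group(1)` on the second pattern would silently get the year instead of the month.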

Python's re module was the first to offer a solution: named capturing groups and named backreferences. (?P<name>group) captures the match of group and makes it available under the name "name". name must be a valid Python identifier: letters, digits, and underscores, not starting with a digit. group can be any regular expression. You can reference the contents of the group later in the same pattern with the named backreference (?P=name). The question mark, P, angle brackets, and equals sign are all part of the syntax. Though the syntax for the named backreference uses parentheses, it is just a backreference that doesn't do any capturing or grouping. Named groups are still numbered as well, so they can also be accessed by position. For example, a pattern that matches a pair of opening and closing HTML tags can be written as <(?P<tag>[A-Z][A-Z0-9]*)\b[^>]*>.*?</(?P=tag)>. Because [A-Z] matches only uppercase letters, compile the pattern with re.IGNORECASE to match lowercase tag names too.
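A minimal sketch of the HTML tag pattern in use follows. The sample string and the extra `body` group are assumptions added for illustration and are not part of the original pattern; `re.IGNORECASE` is passed so that lowercase tag names match.

```python
import re

# Opening tag, lazily matched content, and a closing tag that must
# repeat whatever name the "tag" group captured.
# "body" is a hypothetical extra group added for this example.
pattern = re.compile(
    r"<(?P<tag>[A-Z][A-Z0-9]*)\b[^>]*>(?P<body>.*?)</(?P=tag)>",
    re.IGNORECASE,
)

match = pattern.search('<b class="x">bold</b> and <i>italic</i>')

tag_name = match.group("tag")   # look up a group by name
body = match.group("body")
groups = match.groupdict()      # all named groups as a dict

# The backreference rejects a closing tag with a different name.
mismatch = pattern.search("<b>text</i>")

print(tag_name, body, groups, mismatch)
```

Looking groups up by name keeps this code working even if more groups are later added to the pattern.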

Updated on: 13-Jun-2020
