How does regular expression grouping work in Python?


We group part of a regular expression by surrounding it with parentheses. This lets us apply an operator, such as a quantifier, to the whole group instead of to a single character.
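As a minimal sketch of this difference, the snippet below compares a quantifier applied to a single character with one applied to a group. The sample string 'ababab' and the variable names are chosen here for illustration only.

```python
import re

# Without parentheses, + applies only to the preceding character 'b'.
no_group = re.fullmatch(r'ab+', 'ababab')

# With parentheses, + applies to the whole group 'ab'.
with_group = re.fullmatch(r'(ab)+', 'ababab')

print(no_group)             # None: 'ab+' means 'a' followed by one or more 'b'
print(with_group.group(0))  # ababab
```

re.fullmatch is used so that the pattern has to account for the entire string, which makes the effect of the grouping easy to see.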

Capturing Groups

Parentheses not only group sub-expressions, they also create capturing groups. The part of the string matched by the grouped part of the regular expression is stored, and a backreference such as \1 refers to that stored text. With the help of backreferences, we can reuse what a group matched, either later in the same pattern or in a replacement string.
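One common use of a backreference is finding a word that is accidentally repeated. The following is a small sketch under that assumption; the sample text and variable names are made up for the illustration.

```python
import re

text = 'this is is a test test sentence'

# (\w+) captures a word; \1 must then match exactly the same text again.
# \b keeps the match on whole words, so the 'is' inside 'this' is ignored.
pattern = r'\b(\w+) \1\b'

doubled = re.findall(pattern, text)
print(doubled)         # ['is', 'test']

first = re.search(pattern, text)
print(first.group(0))  # is is  (the whole match)
print(first.group(1))  # is     (the text stored by group 1)
```

Because the pattern contains one capturing group, re.findall returns the text captured by that group rather than the whole match.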

In practical applications, we often need regular expressions that can match any one of two or more alternatives. We also sometimes want a quantifier to apply to several expressions at once. Both can be achieved by grouping with parentheses and by using alternation with the vertical bar (|).

Alternation is useful when we want to match any one of several different alternatives. For example, the regex aircraft|airplane|jet will match any text that contains aircraft, airplane, or jet. The same objective can be achieved using the regex air(craft|plane)|jet, which factors out the common prefix air.
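The two patterns can be compared with a short sketch. The sample sentence is invented for the illustration. Note that re.findall returns the contents of the capturing group when a pattern has one, so re.finditer with group(0) is used here to see the full matches.

```python
import re

text = 'The aircraft landed, the airplane taxied, and the jet took off.'

# Plain alternation: each alternative is spelled out in full.
flat = re.findall(r'aircraft|airplane|jet', text)
print(flat)      # ['aircraft', 'airplane', 'jet']

# Grouped alternation: the prefix 'air' is shared by two alternatives.
grouped = r'air(craft|plane)|jet'
whole = [m.group(0) for m in re.finditer(grouped, text)]
print(whole)     # ['aircraft', 'airplane', 'jet']

# findall reports only what group 1 captured ('' where it did not take part).
captured = re.findall(grouped, text)
print(captured)  # ['craft', 'plane', '']
```

If the captured text is not needed, a non-capturing group, written (?:craft|plane), groups the alternatives without storing anything.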


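The following example uses a capturing group with re.findall to extract every run of word characters from a string.

import re

s = 'Tahiti $% Tahiti *&^ 34 Atoll'
# findall returns the text captured by the group (\w+) for each match
result = re.findall(r'(\w+)', s)
print(result)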


This gives the output:

['Tahiti', 'Tahiti', '34', 'Atoll']