We group part of a regular expression by enclosing it in a pair of parentheses. This lets us apply an operator, such as a quantifier, to the whole group instead of to a single character.
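As a minimal sketch of the difference grouping makes, the snippet below compares a quantifier applied to a single character with the same quantifier applied to a group. The pattern `ab+` and the sample string `ababab` are illustrative choices, not taken from the text above.

```python
import re

# Without parentheses, + applies only to the preceding character 'b',
# so 'ab+' means: one 'a' followed by one or more 'b'.
ungrouped = re.fullmatch(r'ab+', 'ababab')

# With parentheses, + applies to the whole group 'ab',
# so '(ab)+' means: one or more repetitions of 'ab'.
grouped = re.fullmatch(r'(ab)+', 'ababab')

print(ungrouped)          # None, the whole string is not 'a' plus a run of 'b'
print(grouped.group(0))   # ababab
```

Here `re.fullmatch` is used so that the whole string has to match, which makes the effect of the parentheses easy to see.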
Parentheses not only group sub-expressions; they also create capturing groups. The part of the string matched by the parenthesized part of the regular expression is stored in a numbered group, and a backreference refers to that stored text. With backreferences we can reuse text that was matched earlier in the same regular expression.
If a sub-expression is placed in parentheses, the text it matched can be accessed with \1 or $1 (the syntax depends on the language or tool), the second group with \2 or $2, and so on.
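The following sketch shows one way to read captured groups in Python, which uses the \1 form; the $1 form appears in the replacement syntax of some other languages, such as Perl and JavaScript. The sample text `John Smith` and the variable names are hypothetical, chosen only for illustration.

```python
import re

text = 'John Smith'          # hypothetical sample input
pattern = r'(\w+) (\w+)'     # two capturing groups: Group 1 and Group 2

match = re.search(pattern, text)
first = match.group(1)       # text captured by Group 1
second = match.group(2)      # text captured by Group 2
print(first, second)         # John Smith

# In the replacement string, \2 and \1 refer to the captured groups,
# so this swaps the two words.
swapped = re.sub(pattern, r'\2, \1', text)
print(swapped)               # Smith, John
```

Groups are numbered from left to right by the position of their opening parenthesis, and `match.group(0)` returns the entire match.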
For example, the regex \b(\w+)\b\s+\1\b matches repeated words, such as tahiti tahiti: the parentheses in (\w+) capture a word into Group 1, and the backreference \1 then matches the same characters that Group 1 captured.
    import re

    s = 'Tahiti Tahiti Atoll'
    # With one capturing group, findall returns the text captured by Group 1 for each match
    result = re.findall(r'\b(\w+)\b\s+\1\b', s)
    print(result)
This gives the output