- Data Structure
- Networking
- RDBMS
- Operating System
- Java
- MS Excel
- iOS
- HTML
- CSS
- Android
- Python
- C Programming
- C++
- C#
- MongoDB
- MySQL
- Javascript
- PHP
- Physics
- Chemistry
- Biology
- Mathematics
- English
- Economics
- Psychology
- Social Studies
- Fashion Studies
- Legal Studies
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
How regular expression back references works in Python?
Grouping
We group part of a regular expression by enclosing it in a pair of parentheses. This way we apply operators to the group instead of a single character.
Capturing Groups and Backreferences
Parentheses not only group sub-expressions but they also create backreferences. The part of the string matched by the grouped part of the regular expression, is stored in a backreference. With the use of backreferences we reuse parts of regular expressions.
If sub-expression is placed in parentheses, it can be accessed with \1 or $1 and so on.
For example, the regex \b(\w+)\b\s+\1\b matches repeated words, such as tahiti tahiti, because the parentheses in (\w+) capture a word to Group 1 then the back-reference \1 matches the characters that were captured by Group 1.
Example
import re s = 'Tahiti Tahiti Atoll' result = re.findall(r'\b(\w+)\b\s+\1\b', s) print result
Output
This gives the output
['Tahiti']