What are repeating character classes used in Python regular expression?
A repeating character class in Python regular expressions is a character class followed by a quantifier such as ?, *, +, or {m,n}. The quantifier controls how many consecutive characters from the class are matched.
Basic Repeating Character Classes
When a quantifier follows a character class, it repeats the entire class, not the specific character that was matched first, so each repetition may match a different character from the class:
import re
# [0-9]+ matches one or more digits (any combination)
text = "Order 579 and 333 items"
pattern = r'[0-9]+'
matches = re.findall(pattern, text)
print("Matches:", matches)
Matches: ['579', '333']
Common Quantifiers
| Quantifier | Meaning | Example |
|---|---|---|
| ? | 0 or 1 occurrence | [0-9]? matches 0 or 1 digit |
| * | 0 or more occurrences | [a-z]* matches zero or more lowercase letters |
| + | 1 or more occurrences | [A-Z]+ matches one or more uppercase letters |
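To make the table concrete, here is a small sketch that applies each quantifier to the same digit class. The sample string and variable names are made up for illustration.

```python
import re

# Hypothetical sample text: a letter followed by a varying number of digits
text = "a1 b22 c333 d"

# [0-9]? : the letter plus at most one digit
optional = re.findall(r'[a-z][0-9]?', text)
# [0-9]* : the letter plus any number of digits, including none
star = re.findall(r'[a-z][0-9]*', text)
# [0-9]+ : the letter plus at least one digit, so the bare "d" is skipped
plus = re.findall(r'[a-z][0-9]+', text)

print("?:", optional)  # ['a1', 'b2', 'c3', 'd']
print("*:", star)      # ['a1', 'b22', 'c333', 'd']
print("+:", plus)      # ['a1', 'b22', 'c333']
```

Note how ? stops after a single digit, while * and + consume the whole run of digits.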
Repeating Specific Characters with Backreferences
To repeat the same character that was matched (rather than any character in the class), capture it in parentheses and refer to it with a backreference such as \1:
import re
text = "Numbers: 333, 579, 2222, 999"
# ([0-9])\1+ matches repeating same digits
pattern = r'([0-9])\1+'
matches = re.findall(pattern, text)
print("Repeating digits:", matches)
# Full matches (including the repeated part)
full_matches = re.finditer(pattern, text)
for match in full_matches:
    print(f"Found: '{match.group()}' at position {match.start()}")
Repeating digits: ['3', '2', '9']
Found: '333' at position 9
Found: '2222' at position 19
Found: '999' at position 25
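When a pattern contains a capturing group, re.findall returns only the group's text, which is why the first list holds single digits. One possible way to get the full runs from findall is to wrap the whole pattern in an outer group. The sketch below assumes that approach; because the inner digit group then becomes group 2, the backreference changes to \2.

```python
import re

text = "Numbers: 333, 579, 2222, 999"

# Group 1 (outer) captures the whole run, group 2 (inner) captures one digit.
# The backreference must point at the inner group, hence \2.
pattern = r'(([0-9])\2+)'

# With two groups, findall returns a list of (group 1, group 2) tuples
pairs = re.findall(pattern, text)
runs = [whole for whole, digit in pairs]

print("Pairs:", pairs)  # [('333', '3'), ('2222', '2'), ('999', '9')]
print("Runs:", runs)    # ['333', '2222', '999']
```

Using re.finditer with match.group() gives the same full matches without changing the pattern.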
Practical Examples
Here are common use cases for repeating character classes:
import re
text = "Phone: 123-456-7890, Code: AAA111, Email: user@domain.com"
# Match phone numbers in the fixed 3-3-4 digit format separated by dashes
phone_pattern = r'[0-9]{3}-[0-9]{3}-[0-9]{4}'
phone = re.findall(phone_pattern, text)
print("Phone:", phone)
# Match alphanumeric codes
code_pattern = r'[A-Z]+[0-9]+'
codes = re.findall(code_pattern, text)
print("Codes:", codes)
# Match email domains
email_pattern = r'@[a-z]+\.[a-z]+'
domains = re.findall(email_pattern, text.lower())
print("Domains:", domains)
Phone: ['123-456-7890']
Codes: ['AAA111']
Domains: ['@domain.com']
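The phone pattern above uses the counted quantifier {3}, which repeats a class an exact number of times. A range form {m,n} also exists. The following sketch uses it to validate whole strings with fullmatch; the code format (2 to 3 uppercase letters followed by 2 to 4 digits) and the sample values are invented for illustration.

```python
import re

# Hypothetical code format: 2-3 uppercase letters, then 2-4 digits
code_pattern = re.compile(r'[A-Z]{2,3}[0-9]{2,4}')

candidates = ["AB12", "ABC1234", "A1", "ABCD12345"]

# fullmatch succeeds only if the entire string fits the pattern
valid = [c for c in candidates if code_pattern.fullmatch(c)]

print("Valid codes:", valid)  # ['AB12', 'ABC1234']
```

"A1" fails because each class needs at least two characters, and "ABCD12345" fails because both runs exceed the upper bounds.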
Difference Between Character Classes and Backreferences
Understanding when each approach matches is crucial:
import re
test_string = "922226 and 579333"
# Character class [0-9]+ matches any sequence of digits
char_class = re.findall(r'[0-9]+', test_string)
print("Character class matches:", char_class)
# Backreference ([0-9])\1+ matches only repeating same digits
backreference = re.findall(r'([0-9])\1+', test_string)
print("Backreference matches (just the repeated digit):", backreference)
# To see full backreference matches
full_backrefs = [match.group() for match in re.finditer(r'([0-9])\1+', test_string)]
print("Full backreference matches:", full_backrefs)
Character class matches: ['922226', '579333']
Backreference matches (just the repeated digit): ['2', '3']
Full backreference matches: ['2222', '333']
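The same contrast can be applied to whole strings. As a rough sketch, the hypothetical helper classify below uses fullmatch to tell a string made of one repeated digit from a string of arbitrary digits; the function name and labels are illustrative only.

```python
import re

def classify(s):
    """Label a string by which of the two patterns matches it in full."""
    # ([0-9])\1* : one digit, then zero or more copies of that same digit
    if re.fullmatch(r'([0-9])\1*', s):
        return "all the same digit"
    # [0-9]+ : one or more digits, each free to differ
    if re.fullmatch(r'[0-9]+', s):
        return "digits, not all the same"
    return "not all digits"

for sample in ["2222", "579", "57a"]:
    print(sample, "->", classify(sample))
```

The backreference check runs first because every string it accepts would also be accepted by the character class pattern.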
Conclusion
A quantified character class matches a run of characters in which each one can be any member of the class, while a backreference matches only repetitions of the exact text captured by a group. Use quantified character classes for flexible matching and backreferences when the same character must repeat.
