What are character class operations in Python?


Some regular expression engines support set operations inside character classes. We can match characters that belong to one class but not to another (subtraction), characters that belong to both one class and another (intersection), or characters that belong to any of several classes (union).

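Python's built-in re module does not support these operators; since Python 3.7 it only issues a FutureWarning when it sees a sequence such as && inside a character class. The third-party regex module does support them when its version 1 behaviour is enabled, either with the regex.V1 flag or with an inline (?V1). There, the AND operator && specifies the intersection of classes within a character class: […&&[…]] specifies a character class representing the intersection of two sub-classes, meaning that the matched character must belong to both sub-classes. For instance, [\S&&[\D]] specifies one character that is both a non-whitespace character and a non-digit.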
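
The standard library re module has no && operator, so the snippet below is only a sketch of how the same intersection can be emulated there with a lookahead. The pattern name and the helper function are hypothetical, chosen for illustration.

```python
import re

# Emulated intersection of \S and \D using only the standard re module.
# (?=\S) asserts that the next character is not whitespace;
# \D then consumes it only if it is also not a digit.
NON_SPACE_NON_DIGIT = re.compile(r"(?=\S)\D")


def intersect_matches(text):
    """Return every character that is both non-whitespace and non-digit."""
    return NON_SPACE_NON_DIGIT.findall(text)


print(intersect_matches("a1 b2\t_"))  # ['a', 'b', '_']
```

For this particular pair of classes the negated class [^\s\d] gives the same result; the lookahead form is shown because it carries over to classes that cannot be merged so easily.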

Character class subtraction in the regex module for Python
With the V1 flag of the third-party regex module enabled, the -- operator subtracts one class from another. For instance, the class [a-z--[aeiou]] matches an English lower-case consonant.

In addition, when the subtracted class is a plain list of characters, as it is here, its brackets are optional. The above can therefore also be written as [a-z--aeiou].
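
The standard library re module does not implement the -- operator, so subtraction has to be emulated there. A minimal sketch, assuming a negative lookahead is an acceptable substitute (the names CONSONANT and consonants are hypothetical):

```python
import re

# Emulated subtraction [a-z] minus [aeiou] using only the standard re module.
# (?![aeiou]) rejects the position if the next character is a vowel;
# [a-z] then consumes any remaining lower-case ASCII letter.
CONSONANT = re.compile(r"(?![aeiou])[a-z]")


def consonants(text):
    """Return every lower-case English consonant found in text."""
    return CONSONANT.findall(text)


print(consonants("regex101"))  # ['r', 'g', 'x']
```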

Character class union in the regex module for Python
In the third-party regex module for Python (with the V1 flag enabled), we create the union of multiple character classes with the OR operator ||. For instance, [0||[^\W\d]] specifies a character that is either 0 or a word character that is not a digit.
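
The standard library re module has no || operator inside a character class, but a union can be written as ordinary alternation between the two classes. A minimal sketch under that assumption (ZERO_OR_NON_DIGIT_WORD and union_matches are hypothetical names):

```python
import re

# Emulated union of "0" and [^\W\d] using only the standard re module.
# [^\W\d] matches a word character that is not a digit (letters and "_");
# the alternation adds the literal character 0 to that set.
ZERO_OR_NON_DIGIT_WORD = re.compile(r"(?:0|[^\W\d])")


def union_matches(text):
    """Return every character that is 0 or a non-digit word character."""
    return ZERO_OR_NON_DIGIT_WORD.findall(text)


print(union_matches("a0_9 Z"))  # ['a', '0', '_', 'Z']
```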




raja
Published on 09-Jan-2018 18:54:22