What are character class operations in Python?


Character class operations in Python's regular expressions let us define a set of characters we want to match. Instead of searching for one specific character, we can search for any single character within that set. A character class in regex is written using square brackets []. It defines a group of characters, any one of which can match at a given position in the string.

Commonly Used Character Classes in Python (re module)

Regular expressions use both normal and special characters. Normal characters like 'A', 'a', or '0' match themselves. So, "last" (a sequence of characters) matches the string 'last'. Some characters, like '|' or '(', are special. These types of special predefined characters either stand for classes of ordinary characters or affect how the regular expressions around them are interpreted.
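To make the difference between normal and special characters concrete, here is a minimal sketch. The sample strings and variable names are made up for illustration; only re.findall() comes from the re module.

```python
import re

# Ordinary characters match themselves.
literal_hits = re.findall(r"last", "at last, the last one")
print(literal_hits)  # ['last', 'last']

# Unescaped, '|' is special and means "or": this matches 'a' or 'b'.
alternation = re.findall(r"a|b", "a|b")
print(alternation)  # ['a', 'b']

# Escaped with a backslash, it matches the literal pipe character.
literal_pipe = re.findall(r"\|", "a|b")
print(literal_pipe)  # ['|']
```

Writing patterns as raw strings (the r prefix) keeps Python from interpreting the backslashes before the regex engine sees them.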

Repetition operators, or quantifiers (*, +, ?, {m,n}, etc.), cannot be directly nested. This avoids ambiguity, especially with the non-greedy modifier suffix ?. To apply a second repetition to a pattern that is already quantified, wrap it in parentheses. For example, (a{6})* matches any number of groups of six 'a's.
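The nesting rule can be checked with a short sketch. The variable names are illustrative, and the test strings are just runs of 'a' chosen to be a multiple and a non-multiple of six.

```python
import re

# Grouping a quantified pattern lets a second quantifier apply to it.
grouped = re.compile(r"(a{6})*")

twelve = grouped.fullmatch("a" * 12)  # two groups of six
seven = grouped.fullmatch("a" * 7)    # not a multiple of six

print(twelve is not None)  # True
print(seven is not None)   # False

# Stacking the quantifiers without a group is rejected at compile time.
try:
    re.compile(r"a{6}*")
    nested_ok = True
except re.error:
    nested_ok = False

print(nested_ok)  # False
```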

The following are the most commonly used character classes in Python regular expressions.

Character Class	Meaning	Description
.	Any character except a newline	Matches any single character except a newline character (`\n`), unless the re.DOTALL flag is set.
\d	Digit character	Matches any decimal digit. For str patterns this includes Unicode digits; with the re.ASCII flag it is equivalent to [0-9].
\D	Non-digit character	Matches any character that is not a digit.
\w	Word character	Matches any letter, digit, or underscore. For str patterns this is Unicode-aware; with the re.ASCII flag it is equivalent to [a-zA-Z0-9_].
\W	Non-word character	Matches any character that is not a letter, digit, or underscore.
\s	Whitespace character	Matches any whitespace character, such as space, tab, newline, etc.
\S	Non-whitespace character	Matches any character that is not a whitespace character.
[abc]	Matches a, b, or c	Matches any one character within the set a, b, or c.
[^abc]	Not a, b, or c	Matches any single character except those listed inside the brackets.
[a-z]	Lowercase letters	Matches any ASCII lowercase letter from a to z.
[A-Z]	Uppercase letters	Matches any ASCII uppercase letter from A to Z.
[0-9]	Digits	Matches any digit from 0 to 9.
[\[\]]	Literal [ or ]	Matches a literal opening or closing square bracket.
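A few of the classes in the table can be tried out with a quick sketch. The sample strings below are invented for illustration; the Unicode check uses an Arabic-Indic digit (written as the escape \u0663) to show how the re.ASCII flag narrows \d.

```python
import re

sample = "Room 42, Floor 7"

digits = re.findall(r"\d", sample)    # one match per digit
numbers = re.findall(r"\d+", sample)  # runs of digits
words = re.findall(r"\w+", sample)    # runs of word characters
spaces = re.findall(r"\s", sample)    # individual whitespace characters

print(digits)   # ['4', '2', '7']
print(numbers)  # ['42', '7']
print(words)    # ['Room', '42', 'Floor', '7']
print(spaces)   # [' ', ' ', ' ']

# Escaped brackets inside a class match literal '[' and ']'.
brackets = re.findall(r"[\[\]]", "a[0] = b[1]")
print(brackets)  # ['[', ']', '[', ']']

# For str patterns, \d also matches non-ASCII decimal digits
# unless the re.ASCII flag is given.
mixed = "4\u0663"
unicode_digits = re.findall(r"\d", mixed)
ascii_digits = re.findall(r"\d", mixed, re.ASCII)
print(len(unicode_digits), len(ascii_digits))  # 2 1
```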

Basic Example: Matching Vowels

The following is a simple program that searches for vowels in a sentence using the re.findall() function. It returns a list of all the characters that match the given pattern.

import re

text = "The quick brown fox jumps over the lazy dog."
# [aeiou] matches any single lowercase vowel.
vowels = re.findall(r"[aeiou]", text)
print(vowels)

Following is the output of the above code:

['e', 'u', 'i', 'o', 'o', 'u', 'o', 'e', 'e', 'a', 'o']

Matching Non-Vowels (Negated Character Class)

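We can negate a character class by placing ^ as the first character inside the square brackets. For example, [^aeiou] matches any character except the lowercase vowels. In this example, we also exclude the space character, so the result contains only consonants and punctuation. The class is case-sensitive, so an uppercase vowel would still match; the sample text happens to contain none.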

import re

text = "The quick brown fox jumps over the lazy dog."
# [^aeiou ] matches any character that is not a lowercase vowel or a space.
consonants = re.findall(r"[^aeiou ]", text)
print(consonants)

Following is the output of the above code:

['T', 'h', 'q', 'c', 'k', 'b', 'r', 'w', 'n', 'f', 'x', 'j', 'm', 'p', 's', 'v', 'r', 't', 'h', 'l', 'z', 'y', 'd', 'g', '.']
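If only consonant letters are wanted, the negated class can be extended. The following is one possible sketch, not the only approach: it lists both lowercase and uppercase vowels and also puts \W, \d, and _ inside the negated class to rule out punctuation, spaces, digits, and underscores. The sample sentence is made up for illustration.

```python
import re

sample = "An Owl ate 3 eggs."

# Exclude vowels of either case, non-word characters, digits, and underscore.
letters_only = re.findall(r"[^aeiouAEIOU\W\d_]", sample)
print(letters_only)  # ['n', 'w', 'l', 't', 'g', 'g', 's']
```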

Using Ranges in Character Classes

Ranges in character classes let you specify a set of consecutive characters, ordered by their character codes (for example, a through z or 0 through 9), within a regular expression. This avoids having to list each character individually.

A hyphen - inside a character class can be used to define a range of characters. For instance, [A-Z] matches any uppercase letter. In this example, only the letter 'T' is matched because it's the only capital letter in the input text.

import re

text = "The quick brown fox jumps over the lazy dog."
# [A-Z] matches any single uppercase letter from A to Z.
capital_letters = re.findall(r"[A-Z]", text)
print(capital_letters)

Following is the output of the above code:

['T']
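Several ranges can also be combined in one class. The sketch below uses invented sample strings to show a multi-range class, a hexadecimal color pattern, and a hyphen placed last in the class so that it is treated as a literal character rather than a range operator.

```python
import re

# Letters and digits in a single class; the underscore and punctuation are skipped.
alnum = re.findall(r"[A-Za-z0-9]", "Id: x9_Z")
print(alnum)  # ['I', 'd', 'x', '9', 'Z']

# Six hexadecimal digits after a '#'.
colors = re.findall(r"#[0-9a-fA-F]{6}", "background #1A2b3C, border #ff0000")
print(colors)  # ['#1A2b3C', '#ff0000']

# A hyphen at the end of the class is a literal hyphen.
hyphenated = re.findall(r"[a-z-]+", "well-known, state-of-the-art")
print(hyphenated)  # ['well-known', 'state-of-the-art']
```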

Rajendra Dharmkar

Updated on: 2025-08-28T11:09:56+05:30
