What is a regular expression in Python?


A regular expression, also known as a regex or regexp, is a sequence of characters that defines a search pattern. Regular expressions are commonly used in Python and other programming languages to match and manipulate text. In Python, regular expressions are implemented by the re module.

In simple words, a regular expression is a sequence of characters mainly used to find and replace patterns in a string or file. Regular expressions are supported by most programming languages, such as Python, Perl, R, and Java.

Regular expressions are very useful for extracting information from text such as code, log files, spreadsheets, or documents. This article focuses on their practical uses.

The first thing to know when using regular expressions is that everything is essentially a character, and we write patterns to match a specific sequence of characters (also known as a string). Most patterns use ordinary ASCII characters, which include letters, digits, punctuation, and other keyboard symbols such as %#$@!, but Unicode characters can also be used to match text in any language.
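As a rough illustration of the point about Unicode, the sketch below uses Python's re module on a made-up sample string containing accented letters. It assumes Python 3, where patterns written as str match Unicode word characters by default; the variable names are hypothetical.

```python
import re

# Hypothetical sample text: "naïve café", written with escapes so the
# accented letters are single precomposed characters
text = "na\u00efve caf\u00e9"

# For str patterns in Python 3, \w matches Unicode word characters by default
unicode_words = re.findall(r"\w+", text)
print(unicode_words)  # ['naïve', 'café']

# With re.ASCII, \w is limited to [a-zA-Z0-9_], so accented letters split words
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)
print(ascii_words)  # ['na', 've', 'caf']
```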

In Python, the "re" module provides regular expression support, so you need to import re before you can use regular expressions.

The most common uses of regular expressions are −

Search a string (search and match)

Find all matches in a string (findall)

Break a string into substrings (split)

Replace part of a string (sub)
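The operations listed above can be sketched side by side as follows. The sample text and patterns are invented for illustration; only the re functions themselves (search, match, findall, split, sub) come from the standard library.

```python
import re

text = "cat, hat and bat"  # hypothetical sample text

# search: find the first place the pattern occurs anywhere in the string
first = re.search(r"[a-z]at", text)
print(first.group())  # cat

# match: succeeds only if the pattern matches at the very start of the string
at_start = re.match(r"hat", text)
print(at_start)  # None, because the text starts with "cat"

# findall: return every non-overlapping match as a list
all_words = re.findall(r"[a-z]at", text)
print(all_words)  # ['cat', 'hat', 'bat']

# split: break the string wherever the separator pattern matches
parts = re.split(r",\s*|\s+and\s+", text)
print(parts)  # ['cat', 'hat', 'bat']

# sub: replace every match with new text
replaced = re.sub(r"[a-z]at", "pet", text)
print(replaced)  # pet, pet and pet
```

Note that search scans the whole string, while match only looks at the beginning, which is why the two are easy to confuse.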

Here are some examples of how regular expressions can be used in Python −

Match a specific string pattern

Example

import re
text = "The dog jumps over the lazy cat"
pattern = "dog"
if re.search(pattern, text):
   print("Match found!")
else:
   print("Match not found.")

Output

The above code produces the following output

Match found!
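Besides acting as a true or false test, re.search() returns a match object that records what matched and where. A minimal sketch using the same sample text:

```python
import re

text = "The dog jumps over the lazy cat"
match = re.search("dog", text)

if match:
    matched_text = match.group()  # the text that matched
    start = match.start()         # index where the match begins
    end = match.end()             # index just past the end of the match
    print(matched_text, start, end)  # dog 4 7
    print(match.span())              # (4, 7)
```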

Match any single character using a dot .:

Example

import re
text = "The dog jumps over the lazy cat"
pattern = ".at"
if re.search(pattern, text):
   print("Match found!")
else:
   print("Match not found.")

Output

The above code produces the following output

Match found!
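To see what the dot actually matches, the following sketch collects every match with re.findall() and also shows the default newline behaviour. The sample strings are made up for illustration.

```python
import re

# ".at" matches any single character followed by "at"
matches = re.findall(r".at", "The cat sat on a mat")
print(matches)  # ['cat', 'sat', 'mat']

# By default the dot does not match a newline character
no_match = re.search(r".at", "\nat")
print(no_match)  # None

# The re.DOTALL flag lets the dot match newlines as well
dotall_match = re.search(r".at", "\nat", flags=re.DOTALL)
print(dotall_match is not None)  # True
```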

Match any character in a set of characters using square brackets []:

Example

import re
text = "The quick dog jumps over the lazy cat"
pattern = "[aeiou]"
if re.search(pattern, text):
   print("Match found!")
else:
   print("Match not found.")

Output

The above code produces the following output

Match found!
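A character set matches exactly one character at a time, and a hyphen inside the brackets expresses a range. The short sketch below uses invented sample strings to show both forms.

```python
import re

# Each match is a single character taken from the set
vowels = re.findall(r"[aeiou]", "lazy cat")
print(vowels)  # ['a', 'a']

# A hyphen inside the brackets defines a range, here the digits 0 to 9
digits = re.findall(r"[0-9]", "Room 42, floor 3")
print(digits)  # ['4', '2', '3']
```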

Match any character that is not in a set of characters using [^]:

Example

import re
text = "The quick dog jumps over the lazy cat"
pattern = "[^aeiou]"
if re.search(pattern, text):
   print("Match found!")
else:
   print("Match not found.")

Output

The above code produces the following output

Match found!
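The negated set succeeds as soon as it meets any character outside the set, which includes consonants, spaces, and uppercase letters. A small sketch, with a second made-up string that consists only of vowels:

```python
import re

text = "The quick dog jumps over the lazy cat"

# The first character that is not a lowercase vowel is the capital "T"
first_non_vowel = re.search(r"[^aeiou]", text)
print(first_non_vowel.group())  # T

# A string made only of lowercase vowels gives no match
only_vowels = re.search(r"[^aeiou]", "aeiou")
print(only_vowels)  # None

# The caret negates only when it comes first; elsewhere it is a literal "^"
caret_literal = re.findall(r"[a^]", "a^b")
print(caret_literal)  # ['a', '^']
```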

Match a specific number of repetitions using curly braces {}:

Example

import re
text = "The quick dog jumps over the lazy cat"
pattern = "o{2}"
if re.search(pattern, text):
   print("Match found!")
else:
   print("Match not found.")

Output

The above code produces the following output, because the text does not contain two consecutive "o" characters

Match not found.
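For contrast, here is a sketch with sample strings (invented for illustration) in which the repetition pattern does match. It also shows the {m,n} form, which accepts between m and n repetitions.

```python
import re

# "book" contains two consecutive "o" characters, so o{2} matches
double_o = re.search(r"o{2}", "The book is on the table")
print(double_o.group())  # oo

# {2,3} matches two or three repetitions and takes as many as it can
runs = re.findall(r"o{2,3}", "o oo ooo oooo")
print(runs)  # ['oo', 'ooo', 'ooo']
```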

Match a specific pattern that appears zero or one time using ?:

Example

import re
text = "The quick brown dog jumps over the lazy cat"
pattern = "brown(ie)?"
if re.search(pattern, text):
   print("Match found!")
else:
   print("Match not found.")

Output

The above code produces the following output

Match found!
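The example above matches whether or not the optional part is present. The sketch below, using two hypothetical sample strings, shows how much text is matched in each case.

```python
import re

pattern = r"brown(ie)?"

# The optional group is absent: only "brown" is matched
plain = re.search(pattern, "a brown dog")
print(plain.group())   # brown
print(plain.group(1))  # None

# The optional group is present: the match extends to "brownie"
with_suffix = re.search(pattern, "a brownie recipe")
print(with_suffix.group())   # brownie
print(with_suffix.group(1))  # ie
```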

Match a string that starts with a specific pattern using ^:

Example

import re
text = "The quick brown dog jumps over the lazy cat"
pattern = "^The"
if re.search(pattern, text):
   print("Match found!")
else:
   print("Match not found.")

Output

The above code produces the following output

Match found!
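By default ^ anchors only at the very start of the string. The following sketch, with a made-up two-line string, shows how the re.MULTILINE flag changes this so that ^ also matches at the start of each line.

```python
import re

text = "first line\nsecond line"  # hypothetical two-line sample

# Without flags, ^ matches only at the start of the whole string
default_starts = re.findall(r"^\w+", text)
print(default_starts)  # ['first']

# With re.MULTILINE, ^ also matches right after each newline
multiline_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
print(multiline_starts)  # ['first', 'second']
```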

Match a string that ends with a specific pattern using $:

Example

import re
text = "The quick brown dog jumps over the lazy cat"
pattern = "cat$"
if re.search(pattern, text):
   print("Match found!")
else:
   print("Match not found.")

Output

The above code produces the following output

Match found!
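One detail worth knowing is that $ in Python also matches just before a trailing newline, which matters when reading lines from a file. A brief sketch with an invented sample string, comparing $ with the stricter \Z anchor:

```python
import re

with_newline = "the lazy cat\n"  # hypothetical line read from a file

# $ matches at the end of the string or just before a final newline
dollar = re.search(r"cat$", with_newline)
print(dollar is not None)  # True

# \Z matches only at the very end of the string, so the newline blocks it
strict = re.search(r"cat\Z", with_newline)
print(strict)  # None
```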

Use parentheses () to group patterns and apply quantifiers to the group:

Example

import re
text = "The quick brown dog jumps over the lazy cat"
pattern = "(dog)+"
if re.search(pattern, text):
   print("Match found!")
else:
   print("Match not found.")

Output

The above code produces the following output

Match found!
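Parentheses also capture the text they match, which can then be read back from the match object. The sketch below uses one made-up string to show a repeated group and the article's sample sentence to show two captured groups.

```python
import re

# The + applies to the whole group, so "ha" may repeat several times
laugh = re.search(r"(ha)+", "hahaha!")
print(laugh.group())   # hahaha
print(laugh.group(1))  # ha (a repeated group keeps its last repetition)

# Two groups capture the last two words of the sentence
text = "The quick brown dog jumps over the lazy cat"
pair = re.search(r"(\w+) (\w+)$", text)
print(pair.groups())  # ('lazy', 'cat')
```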

Replace text that matches a pattern using re.sub():

Example

import re
text = "The quick brown dog jumps over the lazy cat"
pattern = "dog"
replace_with = "fox"
new_text = re.sub(pattern, replace_with, text)
print(new_text)

Output

The above code produces the following output

The quick brown fox jumps over the lazy cat
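re.sub() accepts a few optional arguments that are often useful. The sketch below shows the count and flags keyword arguments, plus a replacement string that refers back to captured groups; the patterns are chosen purely for illustration.

```python
import re

text = "The quick brown dog jumps over the lazy cat"

# flags=re.IGNORECASE makes "the" match both "The" and "the"
all_replaced = re.sub(r"the", "a", text, flags=re.IGNORECASE)
print(all_replaced)  # a quick brown dog jumps over a lazy cat

# count=1 limits the replacement to the first match
first_only = re.sub(r"the", "a", text, count=1, flags=re.IGNORECASE)
print(first_only)  # a quick brown dog jumps over the lazy cat

# \1, \2, \3 in the replacement refer to the captured groups
swapped = re.sub(r"(dog)(.*)(cat)", r"\3\2\1", text)
print(swapped)  # The quick brown cat jumps over the lazy dog
```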

Use the re.split() function to split a string using a regular expression:

Example

import re
text = "The quick brown dog jumps over the lazy cat"
pattern = r"\s"
words = re.split(pattern, text)
print(words)

Output

The above code produces the following output

['The', 'quick', 'brown', 'dog', 'jumps', 'over', 'the', 'lazy', 'cat']
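When the text contains runs of whitespace, splitting on a single whitespace character leaves empty strings in the result. A short sketch with a deliberately messy sample string (made up for illustration) shows how adding + to the pattern avoids this.

```python
import re

messy = "The  quick\tbrown dog"  # two spaces, then a tab, then one space

# Splitting on a single whitespace character leaves an empty string
single = re.split(r"\s", messy)
print(single)  # ['The', '', 'quick', 'brown', 'dog']

# \s+ treats each run of whitespace as one separator
collapsed = re.split(r"\s+", messy)
print(collapsed)  # ['The', 'quick', 'brown', 'dog']
```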

In summary, regular expressions are a powerful tool for pattern matching and text manipulation in Python. The re module provides a wide range of functions for working with regular expressions, including searching, replacing, and splitting strings.

Updated on: 02-May-2023
