How to Match patterns and strings using the RegEx module in Python


Introduction

The RegEx module provides support for regular expressions (“RegEx” is short for “regular expression”). If you have programmed before, you have probably come across this term several times. Regular expressions are used to search for and replace text, and they are built into many text editors, search engines, word processors, etc.

In other words, it helps match a certain pattern that you are looking for.

A good example of this is how your college website allows you to sign up only with your university email address and rejects addresses from any other domain.
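As a rough sketch of that idea, the check below accepts only addresses on one domain. The domain university.edu, the pattern and the function name is_university_email are assumptions made up for illustration; a real site would use its own rules.

```python
import re

# Hypothetical domain, used only for illustration.
UNIVERSITY_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@university\.edu")

def is_university_email(address):
    """Return True if the whole address is on the assumed university domain."""
    return UNIVERSITY_EMAIL.fullmatch(address) is not None

print(is_university_email("student@university.edu"))  # True
print(is_university_email("student@gmail.com"))       # False
```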

Getting Started

The regular expression module comes packaged within Python. You do not need to download and install it separately.

In order to start accessing its contents, we must first import the module. To import the RegEx module, we use

import re

Exploring the different functions

The RegEx module comes with a lot of functions and it is essential to understand and know the difference between each one of them.

Mentioned below are a few of the important functions you will most certainly use when you start working on Python projects.

Example

re.compile(pattern, flags) #Compiles the pattern into a reusable pattern object
re.search(pattern, string, flags) #Scans the string and returns the first match found anywhere in it
re.match(pattern, string, flags) #Checks whether the pattern matches at the start of the string
re.split(pattern, string, maxsplit, flags) #Splits the string wherever the pattern matches
re.findall(pattern, string, flags) #Returns a list of all non-overlapping matches
re.finditer(pattern, string, flags) #Returns an iterator of match objects, one per match
re.sub(pattern, repl, string, count) #Replaces matches of the pattern in the string with repl
re.subn(pattern, repl, string, count) #Does the same thing as re.sub but returns a tuple (new string, number of replacements)
re.escape(pattern) #Escapes the characters in pattern that have a special meaning in regular expressions
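To make the differences between these functions concrete, here is a small sketch that runs several of them on the same string. The sample text and variable names are made up for illustration.

```python
import re

text = "Order 66 shipped on day 12"

# re.match only succeeds if the pattern matches at the start of the string.
print(re.match(r"\d+", text))            # None, because the string starts with "Order"

# re.search scans the whole string and returns the first match.
print(re.search(r"\d+", text).group())   # 66

# re.findall returns every non-overlapping match as a list of strings.
numbers = re.findall(r"\d+", text)
print(numbers)                           # ['66', '12']

# re.finditer yields match objects, which also carry the match positions.
spans = [m.span() for m in re.finditer(r"\d+", text)]
print(spans)                             # [(6, 8), (24, 26)]

# re.escape makes a literal string safe to use as a pattern.
literal = re.escape("1+1=2")
print(re.fullmatch(literal, "1+1=2") is not None)  # True
```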

re.compile and re.match functions

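Let us take a string, say “Hello world”. Now, let us find out whether the string “Hello world! How are things going?” starts with it. Note that re.match only checks the beginning of the string; to look for a pattern anywhere in a string, use re.search instead.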

To do this, we use the re.compile and re.match functions.

import re

x = re.compile("Hello world")  # compile the pattern once
y = x.match("Hello world! How are things going?")  # match object, or None
if y:
   print("Strings match")
else:
   print("Strings do not match")

Output

Strings match

If you are wondering whether we can do this without the compile function, we can. The module-level re.match function takes the pattern and the string directly.

import re

y = re.match("Hello world", "Hello world! How are things going?")
if y:
   print("Strings match")
else:
   print("Strings do not match")

Output

Strings match
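A compiled pattern is handy when the same pattern, together with its flags, is applied to many strings. The following is a minimal sketch; the sample sentences are made up, and re.IGNORECASE is one of the standard flags that can be passed as the flags argument.

```python
import re

# Compile once, with a flag, then reuse the pattern object.
greeting = re.compile(r"hello world", re.IGNORECASE)

sentences = [
    "Hello world! How are things going?",
    "HELLO WORLD, again",
    "Well, hello world",
]

# match() only looks at the start of each sentence.
results = [greeting.match(s) is not None for s in sentences]
print(results)  # [True, True, False]
```

The last sentence contains the greeting but does not start with it, so match returns None for it.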

re.split function

x = re.split(r"\W+", "Hello,World")
print(x)
x = re.split(r"(\W+)", "Hello,World")
print(x)

Output

['Hello', 'World']
['Hello', ',', 'World']

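In the above example, the pattern “\W+” matches one or more consecutive non-word characters (anything other than letters, digits and the underscore), so the string is split at the comma. When the pattern is wrapped in parentheses, as in the second case, it becomes a capturing group and the matched separators, such as the comma, are included in the result as well.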
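The split function also accepts a limit on the number of splits through its maxsplit argument. Here is a short sketch with a made-up sample string:

```python
import re

text = "apples, oranges;  pears and plums"

# Split on every run of non-word characters (commas, semicolons, spaces).
parts = re.split(r"\W+", text)
print(parts)    # ['apples', 'oranges', 'pears', 'and', 'plums']

# maxsplit=2 stops after two splits; the rest of the string is kept whole.
limited = re.split(r"\W+", text, maxsplit=2)
print(limited)  # ['apples', 'oranges', 'pears and plums']
```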

re.sub and re.subn functions

x = re.sub(r"there","World","Hello there. Python is fun.")
print(x)

x = re.subn(r"there","World","Hello there. Python is fun. Hello there")
print(x)

Output

Hello World. Python is fun.
('Hello World. Python is fun. Hello World', 2)

In the above example, re.sub finds every occurrence of the word “there” and replaces it with “World”.

The subn function does the exact same thing but returns a tuple instead of a string and also adds in the total number of replacements done.
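Two small variations may help here: limiting the number of replacements with the count argument, and unpacking the tuple returned by subn. The variable names below are chosen for illustration only.

```python
import re

text = "Hello there. Python is fun. Hello there"

# count=1 replaces only the first occurrence.
first_only = re.sub(r"there", "World", text, count=1)
print(first_only)    # Hello World. Python is fun. Hello there

# subn returns (new string, number of replacements), which can be unpacked.
new_text, replacements = re.subn(r"there", "World", text)
print(new_text)      # Hello World. Python is fun. Hello World
print(replacements)  # 2
```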

Real-world Example

One real-world application of the RegEx module is validating passwords.

import re
matching_sequence = r"[0-9]"  # matches any single digit
while True:
   x = input("Enter your password : ")
   r = re.search(matching_sequence, x)
   if r and len(x) > 6:
      print(x + " is a valid password")
      break  # stop asking once the password is valid
   else:
      print(x + " is not a valid password. Password MUST be at least 7 characters with at least 1 number")
input("Press Enter key to exit ")

The program will check if you have entered a valid password (7+ characters with at least one number) or not.
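The same rule can also be wrapped in a function, which makes it easier to try out without typing input by hand. This is a sketch under the same assumptions (7+ characters, at least one digit); the function name is_valid_password and the sample passwords are made up for illustration.

```python
import re

HAS_DIGIT = re.compile(r"[0-9]")

def is_valid_password(password):
    """Return True for 7 or more characters containing at least one digit."""
    return len(password) >= 7 and HAS_DIGIT.search(password) is not None

for candidate in ["secret7word", "short1", "nodigitshere"]:
    print(candidate, is_valid_password(candidate))
```

Only the first candidate passes: the second is too short and the third has no digit.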

Conclusion

You’ve learnt the basics of the RegEx module in Python and several of its most commonly used functions.

There are a lot more functions and uses for the RegEx module. If you are interested, you can read more in the official documentation at https://docs.python.org/3/library/re.html.

Updated on: 11-Feb-2021
