# How to use regular expressions (Regex) to filter valid emails in a Pandas series?


A regular expression is a sequence of characters that defines a search pattern. In this program, we will use a regular expression to separate valid emails from invalid ones.

We will define a Pandas series containing different emails and check which of them are valid. To do this, we will use re, the regular expression module from Python's standard library.

## Algorithm

Step 1: Define a Pandas series of different email addresses.
Step 2: Define a regex for checking the validity of an email.
Step 3: Use the re.search() function from the re module to test each email against the regex.
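
Step 3 works because re.search() returns a match object when the pattern is found and None otherwise, so its result can be used directly in an if test. A minimal sketch of that behaviour, using a simplified pattern and sample strings that are assumptions chosen only for illustration:

```python
import re

# Simplified illustrative pattern (an assumption, not the full regex used later):
# word characters, '@', word characters, a literal dot, word characters.
pattern = r'^\w+@\w+[.]\w+$'

hit = re.search(pattern, 'user@example.com')
miss = re.search(pattern, 'not-an-email')

print(hit is not None)  # a match object was returned, which is truthy
print(miss is None)     # no match was found, so the result is None
```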

## Example Code

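    import pandas as pd
    import re

    # Series of email addresses to check
    series = pd.Series(['jimmyadams123@gmail.com', 'hellowolrd.com'])

    # Raw string so that backslashes reach the regex engine unchanged
    regex = r'^[a-z0-9]+[\._]?[a-z0-9]+[@]\w+[.]\w{2,3}$'

    for email in series:
        # re.search() returns a match object for a valid email, otherwise None
        if re.search(regex, email):
            print("{}: Valid Email".format(email))
        else:
            print("{}: Invalid Email".format(email))

## Output

    jimmyadams123@gmail.com: Valid Email
    hellowolrd.com: Invalid Email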
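
The loop above checks one email at a time. As an alternative, Pandas can apply the same regex to the whole series at once. The sketch below is one possible approach, not the method this article uses: it calls Series.str.contains() to build a boolean mask and then filters with it. The pattern is the regex from the example, written as a raw string ending with the `$` anchor. The variable names is_valid, valid_emails and invalid_emails are illustrative.

```python
import pandas as pd

# The same two addresses that appear in the output above.
series = pd.Series(['jimmyadams123@gmail.com', 'hellowolrd.com'])

# Regex from the example, as a raw string.
regex = r'^[a-z0-9]+[\._]?[a-z0-9]+[@]\w+[.]\w{2,3}$'

# str.contains() searches each element for the pattern (like re.search)
# and returns a boolean Series that can be used as a mask.
is_valid = series.str.contains(regex, regex=True)

valid_emails = series[is_valid]      # rows where the pattern matched
invalid_emails = series[~is_valid]   # rows where it did not

print(valid_emails.tolist())
print(invalid_emails.tolist())
```

Filtering with a mask keeps the result as a Pandas series, which may be more convenient than printed text when the valid emails are needed for further processing.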

## Explanation

The regex variable has the following symbols:

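• `^` : Anchor for the start of the string
• `[ ]` : Square brackets define a character class, which matches a single character from the set listed inside
• `\` : Escape character; it makes the special character that follows it literal, as in `\.`
• `.` : Outside a character class, the dot matches any character except a newline. Inside square brackets, as in `[.]` and `[\._]` in the regex, it matches a literal dot
• `{}` : Curly brackets specify how many times the preceding item may repeat; `{2,3}` means two or three times
• `$` : The dollar sign is the anchor for the end of the string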
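
The behaviour of these symbols can be checked in isolation. The short patterns and test strings below are assumptions made up for illustration; they are not part of the email regex itself.

```python
import re

# '^' and '$' anchor the pattern to the start and end of the string.
anchored_exact = bool(re.search(r'^abc$', 'abc'))      # True
anchored_extra = bool(re.search(r'^abc$', 'xabcx'))    # False

# An unescaped '.' matches any character except a newline...
dot_any = bool(re.search(r'^a.c$', 'abc'))             # True
# ...while '[.]' and '\.' match only a literal dot.
dot_class = bool(re.search(r'^a[.]c$', 'abc'))         # False
dot_escaped = bool(re.search(r'^a\.c$', 'a.c'))        # True

# '{2,3}' allows two or three repetitions of the preceding item.
three_chars = bool(re.search(r'^\w{2,3}$', 'com'))     # True
four_chars = bool(re.search(r'^\w{2,3}$', 'info'))     # False

print(anchored_exact, anchored_extra)
print(dot_any, dot_class, dot_escaped)
print(three_chars, four_chars)
```

The last pair shows a practical limit of `\w{2,3}` at the end of the email regex: a domain ending with more than three characters, such as .info, would be reported as invalid.
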
Updated on 16-Mar-2021 11:00:23