How to use regular expressions (Regex) to filter valid emails in a Pandas series?

A regular expression is a sequence of characters that defines a search pattern. In this program, we will use a regular expression to separate valid emails from invalid ones.

We will define a Pandas series containing different emails and check which of them are valid. We will also use re, Python's built-in module for regular expression operations.


Step 1: Define a Pandas series of different email ids.
Step 2: Define a regex for checking validity of emails.
Step 3: Use the function in the re library for checking the validity of the email.

Example Code

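import pandas as pd
import re

# Put the email addresses to check in this series
series = pd.Series(['', ''])
# Raw string, so the backslashes reach the regex engine unchanged
regex = r'^[a-z0-9]+[\._]?[a-z0-9]+[@]\w+[.]\w{2,3}$'
for email in series:
   # re.search returns a match object on success and None otherwise
   if re.search(regex, email):
      print("{}: Valid Email".format(email))
   else:
      print("{} : Invalid Email".format(email))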

Output

The program prints one line per email: the address followed by "Valid Email" or "Invalid Email".
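
The loop reports a verdict for each address, but it does not filter the series itself. One possible way to do that is with a boolean mask, sketched below using the pandas string method Series.str.match. The sample addresses and the variable names (emails, is_valid, valid_emails, invalid_emails) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample addresses, used only for illustration
emails = pd.Series([
    "alice@example.com",
    "bob.smith@example.org",
    "not-an-email",
    "carol@@example.com",
])

# Same pattern as in the example, written as a raw string
regex = r'^[a-z0-9]+[\._]?[a-z0-9]+[@]\w+[.]\w{2,3}$'

# str.match tests the pattern from the start of each string and
# returns a boolean Series; the trailing $ anchors the end
is_valid = emails.str.match(regex)

# Use the mask (and its negation) to split the series
valid_emails = emails[is_valid]
invalid_emails = emails[~is_valid]

print(valid_emails.tolist())
print(invalid_emails.tolist())
```

Keeping the mask in its own variable makes it easy to reuse, for example to count valid entries with is_valid.sum().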


The regex variable has the following symbols:

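  • ^: Anchor for the start of the string
  • [ ]: Opening and closing square brackets define a character class to match a single character
  • \: Escape character; \. matches a literal dot, and \w matches a letter, digit, or underscore
  • .: The dot matches any character except the newline symbol; inside a character class, as in [.], it matches only a literal dot
  • { }: Opening and closing curly brackets define a repetition range; {2,3} matches the preceding token two or three times
  • $: The dollar sign is the anchor for the end of the string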
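
To see these symbols at work in isolation, the short sketch below tries a few small patterns with re.search. The helper name matches and the test strings are hypothetical and used only for illustration.

```python
import re

def matches(pattern, text):
    """Return True if the pattern is found in the text, otherwise False."""
    return re.search(pattern, text) is not None

# ^ and $ anchor the pattern to the start and end of the string
print(matches(r'^abc$', 'abc'))       # True
print(matches(r'^abc$', 'xabc'))      # False

# A character class matches exactly one character from the set
print(matches(r'^[._]$', '_'))        # True

# An unescaped dot matches any character except a newline,
# while an escaped dot matches only a literal dot
print(matches(r'^a.c$', 'a-c'))       # True
print(matches(r'^a\.c$', 'a-c'))      # False

# {2,3} repeats the preceding token two or three times
print(matches(r'^\w{2,3}$', 'com'))   # True
print(matches(r'^\w{2,3}$', 'info'))  # False
```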