How to use regular expressions (Regex) to filter valid emails in a Pandas series?
A regular expression is a sequence of characters that defines a search pattern. In this program, we will use a regular expression to tell valid email addresses apart from invalid ones.
We will define a Pandas series containing several email addresses and check each one for validity. We will also use re, the regular expression module in Python's standard library.
Algorithm
Step 1: Define a Pandas series of different email ids.
Step 2: Define a regex for checking validity of emails.
Step 3: Use the re.search() function in the re library for checking the validity of the email.
Example Code
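import pandas as pd
import re

# Series holding the email ids to check
series = pd.Series(['jimmyadams123@gmail.com', 'hellowolrd.com'])

# Raw string, so the backslashes reach the regex engine unchanged
regex = r'^[a-z0-9]+[\._]?[a-z0-9]+[@]\w+[.]\w{2,3}$'

# Test every email id against the pattern
for email in series:
    if re.search(regex, email):
        print("{}: Valid Email".format(email))
    else:
        print("{} : Invalid Email".format(email))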
Output
jimmyadams123@gmail.com: Valid Email
hellowolrd.com : Invalid Email
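The loop above prints a verdict for each email but does not produce a filtered series. One possible way to do the filtering itself, sketched here as an alternative to the re.search() loop rather than as the method used above, is the vectorized Series.str.match() accessor, which returns a boolean mask. The names mask, valid_emails and invalid_emails are illustrative and do not come from the example code.

```python
import pandas as pd

series = pd.Series(['jimmyadams123@gmail.com', 'hellowolrd.com'])
regex = r'^[a-z0-9]+[\._]?[a-z0-9]+[@]\w+[.]\w{2,3}$'

# str.match() tests the pattern from the start of each string;
# the $ in the pattern takes care of anchoring the end.
mask = series.str.match(regex)

# Boolean indexing keeps only the rows where the mask is True
valid_emails = series[mask]
invalid_emails = series[~mask]

print(valid_emails.tolist())    # ['jimmyadams123@gmail.com']
print(invalid_emails.tolist())  # ['hellowolrd.com']
```

If the series may contain missing values, passing na=False to str.match() keeps the mask strictly boolean.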
Explanation
The regex variable has the following symbols:
- ^: Anchor for the start of the string
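- [ ]: Opening and closing square brackets define a character class, which matches a single character from the set inside the brackets
- \ : Escape character; \w matches a word character (a letter, a digit, or an underscore)
- . : Outside a character class, the dot matches any character except the newline symbol; inside one, as in [\._] and [.], it matches a literal dot
- {} : The opening and closing curly brackets define how many times the preceding element may repeat; {2,3} means two or three times
- + : The plus sign matches the preceding element one or more times
- ? : The question mark makes the preceding element optional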
- $ : The dollar sign is the anchor for the end of the string
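To see how these symbols combine, the short sketch below runs the same regex against a few extra addresses. The sample addresses and the results dictionary are made up for illustration and are not part of the original program. It suggests that this pattern is a simplified check rather than a complete email validator: it rejects uppercase letters, single-character names, and endings longer than three characters.

```python
import re

regex = r'^[a-z0-9]+[\._]?[a-z0-9]+[@]\w+[.]\w{2,3}$'

# Hypothetical sample addresses, chosen to exercise individual symbols
samples = [
    'john.doe@example.com',   # one dot in the name: allowed by [\._]?
    'john@example.co',        # two-character ending: allowed by {2,3}
    'John@example.com',       # uppercase letter: not in [a-z0-9]
    'a@example.com',          # one-character name: two [a-z0-9]+ groups need two characters
    'john@example.info',      # four-character ending: exceeds {2,3}
    'john@mail.example.com',  # second dot after the @: \w does not match a dot
]

results = {s: bool(re.search(regex, s)) for s in samples}

for address, is_valid in results.items():
    print("{}: {}".format(address, is_valid))
```

Only the first two samples match; the remaining four are rejected for the reasons given in the comments.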