How to divide a string by line break or period with Python regular expressions?
Text processing in Python often requires splitting a string on more than one delimiter, such as periods and line breaks. Python's regular expression module, re, can handle both delimiters with a single pattern.
Using re.findall() to Split by Period and Line Break
The re.findall() function extracts all substrings that match a given pattern. Here's how to split a string by periods and line breaks −
import re

s = """Hi. It's nice meeting you.
My name is Jason."""
result = re.findall(r'[^\s\.][^\.\n]+', s)
print(result)
['Hi', "It's nice meeting you", 'My name is Jason']
How the Regular Expression Works
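Let's break down the pattern r'[^\s\.][^\.\n]+' −
- [^\s\.] − Matches a single character that is neither whitespace (\s) nor a period (\.), so each match starts on visible text rather than on the space or newline that follows a delimiter
- [^\.\n]+ − Matches one or more following characters that are neither a period nor a newline (\n), so the match ends just before the next delimiter
Because both parts must match, every result is at least two characters long: a one-character segment such as the "A" in "A. B." is skipped. Whitespace immediately before a delimiter is kept in the match.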
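To see the two character classes at work, the sketch below runs the same pattern on a few short inputs. The helper name segments and the sample strings are illustrative assumptions, not part of the original example.

```python
import re

# The pattern from the article, compiled once for reuse.
PATTERN = re.compile(r'[^\s\.][^\.\n]+')

def segments(text):
    """Return every substring matched by the pattern (hypothetical helper)."""
    return PATTERN.findall(text)

# Whitespace right after a delimiter cannot start a match, so it is skipped.
print(segments("One. Two.\nThree"))   # ['One', 'Two', 'Three']

# A match needs at least two characters, so the lone "A" is dropped.
print(segments("A. Bee."))            # ['Bee']

# A space before the newline is not a delimiter, so it stays in the match.
print(segments("left \nright"))       # ['left ', 'right']
```

If one-character segments or trailing spaces matter for your data, the pattern would need adjusting, or the results post-processing with str.strip().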
Alternative Approach Using re.split()
You can also use re.split() with a filter to remove empty strings −
import re

text = """Hi. It's nice meeting you.
My name is Jason."""

# Split by period or newline, then filter empty strings
parts = re.split(r'[.\n]', text)
result = [part.strip() for part in parts if part.strip()]
print(result)
['Hi', "It's nice meeting you", 'My name is Jason']
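One possible variation, sketched here as an assumption and not taken from the original example, folds the surrounding whitespace and runs of consecutive delimiters into the split pattern itself. The function name split_on_period_or_newline is hypothetical.

```python
import re

# Optional whitespace, one or more periods/newlines, optional whitespace.
DELIMS = re.compile(r'\s*[.\n]+\s*')

def split_on_period_or_newline(text):
    """Split text on periods and line breaks, dropping empty pieces."""
    parts = DELIMS.split(text.strip())
    # A trailing delimiter still yields one empty string, so filter it out.
    return [p for p in parts if p]

sample = "Hi. It's nice meeting you.\nMy name is Jason."
print(split_on_period_or_newline(sample))
# ['Hi', "It's nice meeting you", 'My name is Jason']
```

With the whitespace absorbed by the pattern, the pieces need no per-item strip(); only the empty string left by a trailing delimiter has to be removed.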
Comparison
| Method | Pros | Cons |
|---|---|---|
| re.findall() | Direct extraction, no filtering needed | Complex pattern required |
| re.split() | Simple pattern, flexible | Requires filtering empty strings |
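The filtering trade-off in the table can be made concrete with a small sketch. The variable names below are assumptions chosen for illustration; the patterns are the two used in this article.

```python
import re

sample = "Hi. It's nice meeting you.\nMy name is Jason."

# re.split() alone leaves empty strings and leading spaces behind.
raw_split = re.split(r'[.\n]', sample)
print(raw_split)
# ['Hi', " It's nice meeting you", '', 'My name is Jason', '']

# Stripping and filtering cleans the pieces up.
cleaned = [part.strip() for part in raw_split if part.strip()]

# re.findall() returns the cleaned pieces directly for this input.
found = re.findall(r'[^\s\.][^\.\n]+', sample)

print(cleaned == found)  # True
```

For this input both approaches agree; the difference is only in where the cleanup happens, inside the pattern or in the list comprehension.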
Conclusion
Use re.findall() for direct pattern matching when you know exactly what you want to extract. Use re.split() with filtering for simpler patterns and more flexibility in handling delimiters.
