How to match anything except space and new line using Python regular expression?
Python's re module provides various tools for pattern matching using regular expressions (regex). With the help of regex, we can define flexible patterns that match or exclude particular characters or sequences. In this article, we will focus on how to match everything except spaces and newlines using regular expressions.
The following methods can be used to match, split on, or remove spaces and newlines:
- Using re.findall() Method
- Using re.split() Method
- Using re.sub() to Remove Spaces and Newlines
Regular Expression Patterns for Non-Space Characters
Before diving into the methods, let's understand the key regex patterns used:
- [^ \n]+ − Matches one or more characters that are NOT space or newline
- [ \n]+ − Matches one or more spaces or newlines
- \s+ − Matches one or more whitespace characters of any kind (spaces, tabs, newlines, and so on)
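The difference between the character class [^ \n] and the shorthand \S (the complement of \s) only shows up when the text contains other whitespace such as tabs. The sketch below uses a made-up sample string to illustrate this; the variable names are illustrative only.

```python
import re

# Hypothetical sample containing a tab, a space, and a newline.
sample = "a\tb c\nd"

# [^ \n]+ excludes only the space and the newline, so the tab stays inside a match.
only_space_newline = re.findall(r'[^ \n]+', sample)
print(only_space_newline)   # ['a\tb', 'c', 'd']

# \S+ excludes every whitespace character, so the tab also ends a match.
any_whitespace = re.findall(r'\S+', sample)
print(any_whitespace)       # ['a', 'b', 'c', 'd']
```

If tabs should count as separators too, \S+ is the closer fit; if only spaces and newlines should be excluded, [^ \n]+ does exactly that.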
Using re.findall() Method
The re.findall() method returns all non-overlapping matches of a pattern as a list. With the pattern r'[^ \n]+', each match is a continuous run of characters that contains no space or newline:
import re
text = "I find\nTutorialspoint useful"
matches = re.findall(r'[^ \n]+', text)
print("Original text:", repr(text))
print("Matched words:", matches)
The output of the above code is:
Original text: 'I find\nTutorialspoint useful'
Matched words: ['I', 'find', 'Tutorialspoint', 'useful']
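When the position of each match matters as well as its text, re.finditer() accepts the same pattern and yields match objects instead of strings. The following is a minimal sketch using the same sample text; the name spans is chosen here for illustration.

```python
import re

text = "I find\nTutorialspoint useful"

# Each match object exposes the matched text and its start/end offsets.
spans = [(m.group(), m.start(), m.end())
         for m in re.finditer(r'[^ \n]+', text)]

for word, start, end in spans:
    print(word, start, end)
# I 0 1
# find 2 6
# Tutorialspoint 7 21
# useful 22 28
```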
Using re.split() Method
The re.split() method splits a string into a list based on a specified pattern. Using r'[ \n]+' as the pattern, it splits wherever one or more spaces or newlines occur:
import re
data = "Python is\ngreat for regex"
parts = re.split(r'[ \n]+', data)
print("Original text:", repr(data))
print("Split result:", parts)
The output of the above code is:
Original text: 'Python is\ngreat for regex'
Split result: ['Python', 'is', 'great', 'for', 'regex']
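One thing to watch for with re.split(): if the string starts or ends with a separator, the result contains empty strings at those positions. The sketch below uses a hypothetical padded input to show this and one simple way to filter the empty strings out.

```python
import re

# Hypothetical input with leading and trailing spaces/newlines.
padded = "  Python is\ngreat \n"

# The separators at both ends produce empty strings in the result.
raw_parts = re.split(r'[ \n]+', padded)
print(raw_parts)     # ['', 'Python', 'is', 'great', '']

# Keep only the non-empty parts.
clean_parts = [part for part in raw_parts if part]
print(clean_parts)   # ['Python', 'is', 'great']
```

re.findall() with r'[^ \n]+' avoids this filtering step, since it only returns the non-empty runs between separators.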
Using re.sub() to Remove Spaces and Newlines
The re.sub() method replaces parts of a string that match a pattern with a replacement string. To remove all spaces and newlines, we replace them with an empty string:
import re
text = "This\nhas some text \n with spaces"
clean = re.sub(r'[ \n]', '', text)
print("Original text:", repr(text))
print("After removing spaces/newlines:", clean)
The output of the above code is:
Original text: 'This\nhas some text \n with spaces'
After removing spaces/newlines: Thishassometextwithspaces
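Removing every space and newline also removes the word boundaries. As a variation on the same call, and assuming the goal is to normalize spacing rather than delete it, each run of spaces and newlines can be replaced with a single space instead:

```python
import re

text = "This\nhas some text \n with spaces"

# Replace each run of spaces/newlines with one space, then trim the ends.
collapsed = re.sub(r'[ \n]+', ' ', text).strip()
print(collapsed)   # This has some text with spaces
```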
Comparison of Methods
| Method | Purpose | Pattern | Returns |
|---|---|---|---|
| re.findall() | Extract non-space words | [^ \n]+ | List of matches |
| re.split() | Split by spaces/newlines | [ \n]+ | List of parts |
| re.sub() | Remove spaces/newlines | [ \n] | Modified string |
Advanced Example
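Here's a practical example combining the three approaches on one piece of text. Note that the split and substitution steps use \s, which covers all whitespace rather than only spaces and newlines: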
import re
sample_text = "Hello\nWorld! Python regex\nis powerful."
# Method 1: Find all non-space words
words = re.findall(r'[^ \n]+', sample_text)
print("Words found:", words)
# Method 2: Split by whitespace
split_words = re.split(r'\s+', sample_text)
print("Split words:", split_words)
# Method 3: Remove all whitespace
no_spaces = re.sub(r'\s', '', sample_text)
print("No whitespace:", no_spaces)
The output of the above code is:
Words found: ['Hello', 'World!', 'Python', 'regex', 'is', 'powerful.']
Split words: ['Hello', 'World!', 'Python', 'regex', 'is', 'powerful.']
No whitespace: HelloWorld!Pythonregexispowerful.
Conclusion
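Use re.findall() with [^ \n]+ to extract runs of characters that contain no spaces or newlines. Use re.split() with [ \n]+ to separate text into tokens, and re.sub() with [ \n] to remove spaces and newlines entirely. Switch to \s or \S when tabs and other whitespace should be treated the same way.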
