How to match a non-whitespace character in Python using Regular Expression?
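A non-whitespace character is any character that is not whitespace. In Python's re module, whitespace covers the space, tab (\t), newline (\n), carriage return (\r), form feed (\f) and vertical tab (\v), plus other Unicode whitespace characters when the pattern is a str. Examples of non-whitespace characters include letters, digits, punctuation and special symbols.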
In Python, regular expressions (regex) are used to match patterns in strings. The built-in re module provides functions such as re.search() and re.findall() for searching, validating and filtering text based on these patterns.
To match a non-whitespace character in Python we can use the special character class \S inside a raw string with any re module method. In this article, let's see different methods to match non-whitespace characters using Regular Expressions.
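As a quick check of what \S does and does not match, the sketch below classifies a few sample characters. The helper name classify and the sample list are illustrative choices, not part of the re module.

```python
import re

def classify(ch):
    # Hypothetical helper: label one character by whether \S matches it.
    return "non-whitespace" if re.fullmatch(r"\S", ch) else "whitespace"

samples = ["a", "7", "!", "_", " ", "\t", "\n", "\r"]
for ch in samples:
    # repr() makes invisible characters such as tab and newline readable
    print(repr(ch), "->", classify(ch))
```

The raw string r"\S" is the same two-character pattern as "\\S"; the r prefix only saves doubling the backslash.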
Finding First Non-Whitespace Character
In this example, we pass \S inside a raw string to re.search() to find the first non-whitespace character:
import re

text = " \t \nHello, Tutorialspoint!"
match = re.search(r"\S", text)
if match:
    print("First non-whitespace character:", match.group())
    print("Character position in string:", match.start())
else:
    print("No non-whitespace character found.")
The output of the above code is:
First non-whitespace character: H
Character position in string: 4
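Because re.search() returns None when nothing matches, it can be convenient to wrap the lookup in a small function. The following is a minimal sketch; first_non_whitespace is a hypothetical helper name, not a function provided by re.

```python
import re

def first_non_whitespace(text):
    # Hypothetical helper: return (character, index), or None if text is blank.
    match = re.search(r"\S", text)
    if match is None:
        return None
    return match.group(), match.start()

print(first_non_whitespace("  \tabc"))  # ('a', 3)
print(first_non_whitespace(" \n "))     # None
```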
Finding All Non-Whitespace Characters
We can use re.findall() with the \S pattern to find and return all non-whitespace characters from the input string:
import re
text = " T u t o r i a l s \t p o i n t \n 2025 "
matches = re.findall(r"\S", text)
print("All non-whitespace characters:", matches)
The output of the above code is:
All non-whitespace characters: ['T', 'u', 't', 'o', 'r', 'i', 'a', 'l', 's', 'p', 'o', 'i', 'n', 't', '2', '0', '2', '5']
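re.findall() returns only the matched characters. If the positions are needed as well, re.finditer() yields match objects instead. A short sketch, using a made-up sample string:

```python
import re

text = "a b\tc"
# Each match object carries both the matched character and its index
positions = [(m.group(), m.start()) for m in re.finditer(r"\S", text)]
print(positions)  # [('a', 0), ('b', 2), ('c', 4)]
```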
Finding Non-Whitespace Words
To find complete words (sequences of non-whitespace characters), use the \S+ pattern with re.findall():
import re
text = "Hello World!\n\tPython 2025"
words = re.findall(r"\S+", text)
print("Non-whitespace words:", words)
The output of the above code is:
Non-whitespace words: ['Hello', 'World!', 'Python', '2025']
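For a plain string like the one above, \S+ with re.findall() gives the same tokens as str.split() called with no arguments. The regex form becomes more useful when the position of each token matters, as this sketch suggests:

```python
import re

text = "Hello World!\n\tPython 2025"
tokens = re.findall(r"\S+", text)
print(tokens)                  # ['Hello', 'World!', 'Python', '2025']
print(tokens == text.split())  # True

# finditer also reports where each run of non-whitespace starts and ends
spans = [(m.group(), m.span()) for m in re.finditer(r"\S+", text)]
print(spans[0])                # ('Hello', (0, 5))
```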
Whitespace vs Non-Whitespace Patterns
In Python regular expressions, different character classes are used to identify whitespace and non-whitespace characters. Below is a comparison of common patterns:
| Pattern | Description | Matches |
|---|---|---|
| \s | Matches any whitespace character | Space, tab, newline, carriage return, form feed, vertical tab |
| \S | Matches any non-whitespace character | Letters, digits, punctuation, symbols |
| \w | Matches any word character | Letters, digits, underscore (_) |
| \W | Matches any non-word character | Whitespace, punctuation, symbols |
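The four classes in the table can be compared side by side by running each one over the same string. This is only an illustrative sketch; the sample string is arbitrary.

```python
import re

sample = "a_1 !\t"
# Map each pattern to the list of characters it matches in the sample
results = {p: re.findall(p, sample) for p in (r"\s", r"\S", r"\w", r"\W")}
for pattern, found in results.items():
    print(pattern, found)
```

Note that the underscore is matched by both \S and \w, while "!" is matched by \S and \W.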
Practical Example: Text Validation
Here's a practical example that validates if a string contains any non-whitespace characters:
import re

def has_content(text):
    """Check if string contains non-whitespace characters"""
    return bool(re.search(r"\S", text))

# Test different strings
test_strings = [
    " ",        # Only spaces
    "\t\n\r",   # Only whitespace chars
    " Hello ",  # Contains content
    "",         # Empty string
    "123"       # Numbers
]
for string in test_strings:
    result = has_content(string)
    # !r prints the repr, so tabs and newlines appear as escape sequences
    print(f"{string!r} has content: {result}")
The output of the above code is:
' ' has content: False
'\t\n\r' has content: False
' Hello ' has content: True
'' has content: False
'123' has content: True
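For this particular check, a regex is not strictly required: stripping the string and testing what remains gives the same answer on the inputs used here. The sketch below compares the two approaches; both function names are hypothetical.

```python
import re

def has_content_regex(text):
    # True if at least one non-whitespace character is present
    return bool(re.search(r"\S", text))

def has_content_strip(text):
    # str.strip() with no arguments removes leading and trailing whitespace
    return bool(text.strip())

for value in [" ", "\t\n\r", " Hello ", "", "123"]:
    # Both checks should agree on every sample
    assert has_content_regex(value) == has_content_strip(value)
    print(repr(value), has_content_regex(value))
```

The regex version is the natural choice when the check is part of a larger pattern, while strip() is enough for a standalone blank test.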
Conclusion
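Use \S to match a single non-whitespace character in Python regular expressions, and \S+ to match a run of consecutive non-whitespace characters, such as a word together with any attached punctuation. re.search() returns the first match (or None if there is none), while re.findall() returns all matches as a list.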
