How to match whitespace but not newlines using Python regular expressions?

In Python, regular expressions (regex) provide powerful tools to search and manipulate strings. When you need to match whitespace characters like spaces and tabs but exclude newline characters, you can use specific regex patterns with Python's re module.

The following methods demonstrate how to match whitespace but not newlines using Python regular expressions:

Using re.sub() Method
Using re.findall() Method
Using Character Classes
Using Positive Lookahead
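For contrast, it may help to see why the generic \s class is not suitable here: \s also matches \n, so collapsing \s+ removes line breaks. The short sketch below uses a made-up sample string and compares \s+ with [ \t]+.

```python
import re

# Illustrative sample string (not from the article's examples)
text = "one  two\nthree\t\tfour"

# \s matches newlines as well, so the line break is lost
collapsed_all = re.sub(r'\s+', ' ', text)

# [ \t] matches only spaces and tabs, so the line break survives
collapsed_inline = re.sub(r'[ \t]+', ' ', text)

print(repr(collapsed_all))     # 'one two three four'
print(repr(collapsed_inline))  # 'one two\nthree four'
```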

Using re.sub() Method

The re.sub() method efficiently replaces whitespace characters (excluding newlines) within a string. The regex pattern [ \t]+ matches one or more occurrences of spaces or tabs, preserving newlines in the original text.

Example

The following example replaces multiple spaces and tabs with a single space while keeping newlines intact:

import re

text = "This  is  a  sample  text.\nIt  contains  multiple   spaces.\n\tAnd tabs too."
normalized_text = re.sub(r'[ \t]+', ' ', text)

print("Original text:")
print(repr(text))
print("\nNormalized text:")
print(repr(normalized_text))
print("\nFormatted output:")
print(normalized_text)

The output of the above code is:

Original text:
'This  is  a  sample  text.\nIt  contains  multiple   spaces.\n\tAnd tabs too.'

Normalized text:
'This is a sample text.\nIt contains multiple spaces.\n And tabs too.'

Formatted output:
This is a sample text.
It contains multiple spaces.
 And tabs too.
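If the same replacement is applied to many strings, one possible approach is to compile the pattern once and wrap it in a small function. The helper name collapse_inline_whitespace and the sample inputs below are hypothetical, chosen only for illustration.

```python
import re

# Compile once so the pattern can be reused across calls
INLINE_WS = re.compile(r'[ \t]+')

def collapse_inline_whitespace(text):
    """Collapse runs of spaces/tabs to one space, leaving newlines alone.

    Hypothetical helper; the pattern is the same [ \t]+ used above.
    """
    return INLINE_WS.sub(' ', text)

samples = ["a \t b\nc", "x\t\ty"]
results = [collapse_inline_whitespace(s) for s in samples]
print(results)  # ['a b\nc', 'x y']
```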

Using re.findall() Method

The re.findall() method returns a list of all whitespace sequences (excluding newlines) found in the string. This is useful for analyzing whitespace patterns in text.

Example

The following example finds all whitespace sequences using the [ \t]+ pattern:

import re

text = "This  is  a  sample  text.\nIt contains  multiple   spaces."
whitespace_matches = re.findall(r'[ \t]+', text)

print("Text:", repr(text))
print("Whitespace matches:", whitespace_matches)
print("Number of matches:", len(whitespace_matches))

The output of the above code is:

Text: 'This  is  a  sample  text.\nIt contains  multiple   spaces.'
Whitespace matches: ['  ', '  ', '  ', '  ', ' ', '  ', '   ']
Number of matches: 7
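When the position of each whitespace run matters as well as its content, re.finditer() can be used in place of re.findall(), since it yields match objects with start and end offsets. The sample string below is an assumption made for illustration.

```python
import re

# Illustrative sample string (not from the article's examples)
text = "a  b\tc\nd   e"

# Each match object exposes the matched text and its start/end offsets
runs = [(m.start(), m.end(), m.group()) for m in re.finditer(r'[ \t]+', text)]

for start, end, run in runs:
    print(start, end, repr(run))
# 1 3 '  '
# 4 5 '\t'
# 8 11 '   '
```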

Using Character Classes

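Both patterns used here are character classes. [ \t] lists the allowed characters explicitly, while the negated class [^\S\n] matches any character that is neither non-whitespace nor a newline, that is, any whitespace character except \n. Note that [^\S\n] is broader than [ \t]: it also matches carriage returns, form feeds, vertical tabs, and Unicode whitespace such as the non-breaking space, so the two patterns give the same result only when the text contains no such characters.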

Example

The following example demonstrates different character class patterns for matching whitespace:

import re

text = "Hello\t\tworld    with\nspaces and\ttabs."

# Method 1: Explicit space and tab
method1 = re.findall(r'[ \t]+', text)

# Method 2: Non-whitespace negation excluding newline
method2 = re.findall(r'[^\S\n]+', text)

print("Text:", repr(text))
# Raw strings keep \t, \S and \n in the labels from being treated as escapes
print(r"Method 1 [ \t]+:", method1)
print(r"Method 2 [^\S\n]+:", method2)
print("Results match:", method1 == method2)

The output of the above code is:

Text: 'Hello\t\tworld    with\nspaces and\ttabs.'
Method 1 [ \t]+: ['\t\t', '    ', ' ', '\t']
Method 2 [^\S\n]+: ['\t\t', '    ', ' ', '\t']
Results match: True
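The two patterns agree on the text above because it contains only spaces and tabs. A minimal sketch of where they diverge, using a made-up string that includes a carriage return, form feed, vertical tab, and non-breaking space:

```python
import re

# \r, \f, \v and the non-breaking space \xa0 all count as whitespace
# for \s when the pattern is a str (Unicode) pattern
text = "a\rb\fc\vd\xa0e f\tg\nh"

explicit = re.findall(r'[ \t]', text)
negated = re.findall(r'[^\S\n]', text)

print(explicit)  # [' ', '\t']
print(negated)   # ['\r', '\x0c', '\x0b', '\xa0', ' ', '\t']
```

Choose [ \t] when only spaces and tabs should match, and [^\S\n] when every whitespace character other than the newline should match.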

Using Positive Lookahead

Positive lookahead (?=pattern) checks what follows without including it in the match. This technique helps find whitespace followed by specific characters while excluding newlines from the match.

Example

The following example finds whitespace followed by non-whitespace characters using positive lookahead:

import re

text = "This is a test.\nNew line here.\tAnother line."

# Find whitespace followed by non-whitespace characters
matches = re.findall(r'[ \t]+(?=\S)', text)
print("Whitespace before non-whitespace:", matches)

# Find whitespace at end of lines (before newline or end of string)
end_matches = re.findall(r'[ \t]+(?=\n|$)', text)
print("Whitespace at line ends:", end_matches)

# Replace whitespace sequences with single space using lookahead
cleaned = re.sub(r'[ \t]+(?=\S)', ' ', text)
print("Cleaned text:", repr(cleaned))

The output of the above code is:

Whitespace before non-whitespace: [' ', ' ', ' ', ' ', ' ', '\t', ' ']
Whitespace at line ends: []
Cleaned text: 'This is a test.\nNew line here. Another line.'
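The sample text above has no trailing whitespace, so the (?=\n|$) lookahead returns an empty list. As a sketch of the same pattern on input that does end lines with spaces or tabs (the string below is an assumption for illustration):

```python
import re

# Illustrative text with trailing spaces on line 1 and a trailing tab on line 2
text = "first line  \nsecond line\t\nthird line"

# Match spaces/tabs only when a newline or the end of the string follows
trailing = re.findall(r'[ \t]+(?=\n|$)', text)

# Remove that trailing whitespace; the newlines are not part of the match
stripped = re.sub(r'[ \t]+(?=\n|$)', '', text)

print(trailing)        # ['  ', '\t']
print(repr(stripped))  # 'first line\nsecond line\nthird line'
```

Because the lookahead does not consume the newline, the line structure of the text is unchanged after the substitution.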

Comparison of Methods

Method	Pattern	Best For	Preserves Newlines?
`re.sub()`	`[ \t]+`	Replacing whitespace	Yes
`re.findall()`	`[ \t]+`	Finding whitespace patterns	Yes
Negated character class	`[^\S\n]+`	Matching all whitespace except newlines (including \r, \f, \v)	Yes
Positive lookahead	`[ \t]+(?=\S)`	Conditional matching	Yes

Conclusion

Use the [ \t]+ pattern with re.sub() for simple replacement of spaces and tabs. Use [^\S\n]+ when other whitespace characters, such as carriage returns or form feeds, should also match, and add a lookahead when the match depends on what follows. All of these patterns leave newline characters untouched.

SaiKrishna Tavva

Updated on: 2026-03-24T19:17:50+05:30
