Iterate Over Words of a String in Python
In Python, iterating over words in a string is a fundamental skill for text processing and analysis. This article explores various methods to split strings into words and iterate through them efficiently.
Using split() Method
Syntax
string.split(sep, maxsplit)
The split() method takes two optional parameters: sep (the separator) and maxsplit. By default, sep is None, which splits on runs of whitespace, and maxsplit is -1, which places no limit on the number of splits, so the string is split at every occurrence of the separator.
Example
text = "Welcome to tutorials point."
words = text.split()
print(words)
['Welcome', 'to', 'tutorials', 'point.']
In this example, we use the split() method to separate the words in the given string.
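The two optional parameters described above can be seen in a short sketch. The sample strings and variable names here are assumptions chosen for illustration, not part of the original example.

```python
# Assumed sample data, not taken from the article.
csv_line = "alice,bob,carol,dave"

# Split on an explicit separator instead of whitespace.
names = csv_line.split(",")
print(names)

# maxsplit=2 performs at most two splits, so the remainder stays in one piece.
first_two_and_rest = csv_line.split(",", 2)
print(first_two_and_rest)

# With no arguments, any run of whitespace counts as a single separator,
# and leading or trailing whitespace produces no empty strings.
spaced = "  Welcome   to\ttutorials\npoint.  "
spaced_words = spaced.split()
print(spaced_words)
```

Note that the trailing period stays attached to the last word, because split() only looks at the separator and knows nothing about punctuation.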
Using For Loop with split() Method
To process each word individually, combine split() with a for loop:
text = "Welcome to tutorials point."
words = text.split()
for word in words:
    print(word)
Welcome
to
tutorials
point.
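When the position of each word matters, the same loop can be paired with the built-in enumerate(). The following is a minimal sketch; the lengths list is a hypothetical name used only to collect the results.

```python
text = "Welcome to tutorials point."

# enumerate() yields (index, word) pairs; start=1 makes the count begin at 1.
lengths = []
for position, word in enumerate(text.split(), start=1):
    lengths.append((position, word, len(word)))
    print(f"{position}: {word} ({len(word)} characters)")
```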
Using List Comprehensions
List comprehensions provide a concise way to create lists while iterating:
text = "Learn Python for data analysis."
words = [word for word in text.split()]
print(words)
['Learn', 'Python', 'for', 'data', 'analysis.']
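A comprehension that copies each word unchanged produces the same list as text.split() alone, so it becomes useful mainly when it transforms or filters the words. A small sketch of both cases follows; the length threshold of three characters is an arbitrary assumption.

```python
text = "Learn Python for data analysis."

# Transform: lowercase every word while iterating.
lowered = [word.lower() for word in text.split()]
print(lowered)

# Filter: keep only words longer than three characters.
long_words = [word for word in text.split() if len(word) > 3]
print(long_words)
```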
Using Regular Expressions (re module)
For strings with punctuation, the re module provides pattern-based word extraction:
import re
text = "Welcome: reader& author."
words = re.findall(r'\w+', text)
print(words)
['Welcome', 'reader', 'author']
The pattern \w+ matches sequences of word characters (letters, digits, and underscores), automatically excluding punctuation.
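One consequence of excluding all punctuation is that \w+ also breaks words at apostrophes and hyphens. The sketch below shows that behavior and one possible alternative pattern; the alternative is an assumption for illustration, not a pattern from the article, and it will not suit every text.

```python
import re

text = "Don't split well-known words."

# \w+ stops at the apostrophe and the hyphen, so both words are cut in two.
plain = re.findall(r"\w+", text)
print(plain)  # ['Don', 't', 'split', 'well', 'known', 'words']

# Assumed alternative: allow an apostrophe or hyphen between word characters.
pattern = r"\w+(?:['-]\w+)*"
joined = re.findall(pattern, text)
print(joined)  # ["Don't", 'split', 'well-known', 'words']
```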
Using Generator Expressions
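Generator expressions yield words one at a time instead of building a second list. Note that text.split() itself still creates the full list of words first, so the memory saving is real only when the underlying source is lazy as well (for example, lines read from a file or matches from re.finditer()):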
text = "Welcome to TutorialsPoint."
word_gen = (word for word in text.split())
for word in word_gen:
    print(word)
Welcome
to
TutorialsPoint.
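Because text.split() returns a complete list before the generator expression ever runs, a fully lazy variant needs a lazy source. One way to sketch this is with re.finditer(), which returns matches one at a time; the iter_words helper is a hypothetical name, and the \S+ pattern is assumed here because it matches the same whitespace-delimited tokens that split() returns.

```python
import re

def iter_words(text):
    """Yield whitespace-delimited words one at a time, without building a list."""
    # re.finditer() returns an iterator of match objects, produced on demand.
    for match in re.finditer(r"\S+", text):
        yield match.group()

text = "Welcome to TutorialsPoint."
for word in iter_words(text):
    print(word)
```

For very large inputs, the same idea applies to reading a file line by line and calling the helper on each line, so that only one line is held in memory at a time.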
Using string Module to Remove Punctuation
The string module provides string.punctuation, which can be passed to strip() to remove punctuation marks from the start and end of each word:
import string
text = "Welcome to TutorialsPoint."
words = [word.strip(string.punctuation) for word in text.split()]
print(words)
['Welcome', 'to', 'TutorialsPoint']
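Since strip() only touches the ends of a word, punctuation inside a word survives. If every punctuation character should go, str.translate() with a deletion table is one option. The sample sentence below is an assumption chosen to show the difference between the two approaches.

```python
import string

# Assumed sample sentence with punctuation at the edges and inside words.
text = "Hello, world! It's a (mostly) well-known test..."

# strip() removes punctuation only from the start and end of each word.
edge_stripped = [word.strip(string.punctuation) for word in text.split()]
print(edge_stripped)
# ['Hello', 'world', "It's", 'a', 'mostly', 'well-known', 'test']

# str.maketrans("", "", chars) builds a table that deletes every char in chars.
remove_punct = str.maketrans("", "", string.punctuation)
fully_stripped = [word.translate(remove_punct) for word in text.split()]
print(fully_stripped)
# ['Hello', 'world', 'Its', 'a', 'mostly', 'wellknown', 'test']
```

Which result is preferable depends on the task: keeping internal apostrophes and hyphens preserves words such as "It's", while removing them gives fully punctuation-free tokens.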
Comparison of Methods
| Method | Best For | Handles Punctuation? | Memory Efficient? |
|---|---|---|---|
| split() | Simple text without punctuation | No | Yes |
| re.findall() | Complex text with punctuation | Yes | Yes |
| Generator expressions | Large texts or streaming data | Depends on implementation | Very efficient, provided the source is also lazy |
| string.punctuation | Removing specific punctuation | Partial (start and end of each word only) | Yes |
Conclusion
Use split() for simple text processing, re.findall() for complex patterns, and generator expressions for memory-efficient iteration over large texts. Choose the method based on your specific requirements for punctuation handling and performance.
