Finding Word Lengths in a String using Python

The task is to find the length of each word in a given input string using Python. We want to count the characters in each word and return the results in a structured form, such as a list. This involves splitting the input string into words and then computing the number of characters in each one. Word length statistics are useful in applications such as text processing, natural language processing, and data analysis, where they can feed into further analysis.

Methods Used

Using a loop and the split() function
Using the map() function with len and split()
Using the re.split() method from the re module
Using a Dictionary to store word lengths

Using a loop and the split() function

Using a loop with the split() function is the most basic way to obtain word lengths. The split() function breaks the input string into a list of words; called without arguments, it splits on any run of whitespace (spaces, tabs, or newlines). We then iterate over that list and compute the length of each word with len().
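As a minimal sketch of how split() tokenizes text (the sample string and variable names below are made up for illustration), calling it with no argument treats any run of whitespace as a single separator, whereas passing a single space as the separator keeps empty strings between consecutive spaces and does not split on tabs:

```python
text = "Today  feels like\ta productive day"

# With no argument, split() treats any run of whitespace as one separator.
default_words = text.split()

# With an explicit " " separator, consecutive spaces produce empty strings
# and the tab character is not treated as a separator.
space_words = text.split(" ")

print(default_words)  # ['Today', 'feels', 'like', 'a', 'productive', 'day']
print(space_words)    # ['Today', '', 'feels', 'like\ta', 'productive', 'day']
```

For counting word lengths, the no-argument form is usually the safer choice, since empty strings would otherwise show up as words of length 0.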

Algorithm

Define a function findWordLengths that takes an input string as a parameter.
Use the split() function to split the input string into individual words. With no argument, split() uses whitespace as the delimiter. Store the result in a list called words.
Initialise an empty list ‘wordLengths’ to store the lengths of the words.
For each word in the list of words, do the following:

a. Calculate the length of the current word using the len() function.

b. Append the length of the current word to the ‘wordLengths’ list.

Return the ‘wordLengths’ list.

Example

def findWordLengths(input_string):
   words = input_string.split()
   wordLengths = []
   for word in words:
      wordLengths.append(len(word))
   return wordLengths


input_string = "Today feels like a productive day"
result = findWordLengths(input_string)
print("The word lengths are:", result)

Output

The word lengths are: [5, 5, 4, 1, 10, 3]

Using the map() function with len and split()

Combining map() with len and split() gives a more compact solution that relies entirely on Python's built-in functions. As before, split() separates the input string into a list of words using whitespace as the delimiter. The map() function then applies len() to each word in that list, producing an iterable that yields the length of every word.

Algorithm

Define a function findWordLengths that takes an input string as its parameter.
Use the split() function to split the input string into individual words, using whitespace as the delimiter. Store the result in a list called words.
Use the map() function to apply the len() function to each word in the list of words. This will result in an iterable containing the lengths of individual words.
Convert the iterable obtained from map() to a list to get the lengths of individual words.
Return the list of word lengths.
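The conversion to a list matters because, in Python 3, map() returns a lazy iterator rather than a list. The following sketch (variable names are illustrative only) shows that the iterator can be consumed just once:

```python
words = "Hello my name is Rahul".split()

# map() returns a lazy iterator; nothing is computed until it is consumed.
lengths = map(len, words)

first_pass = list(lengths)   # consumes the iterator
second_pass = list(lengths)  # the iterator is already exhausted

print(first_pass)   # [5, 2, 4, 2, 5]
print(second_pass)  # []
```

Converting to a list once and reusing that list avoids this surprise when the lengths are needed more than one time.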

Example

def findWordLengths(input_string):
   words = input_string.split()
   wordLengths = list(map(len, words))
   return wordLengths


input_string = "Hello my name is Rahul"
result = findWordLengths(input_string)
print("The word lengths are:", result)

Output

The word lengths are: [5, 2, 4, 2, 5]

Using the re.split() method from the re module

Regular expressions are an effective tool for pattern matching and text manipulation. Here, re.split() tokenizes the sentence into words by treating whitespace as the separator: the regex pattern r'\s+' matches one or more consecutive whitespace characters (spaces, tabs, or newlines). When the sentence has no leading or trailing whitespace, this yields the same words as split(), and the pattern can be adjusted to handle other separators. The re module is part of Python's standard library, so no additional installation is needed.

Algorithm

Input the sentence containing words.
Use the re.split() method with the regular expression r'\s+' to split the sentence into words.
Initialise an empty list to store the word lengths.
Iterate through each word in the list of words:
a. Calculate the length of the current word.
b. Append the word length to the list of word lengths.
Output the list of word lengths.

Example

import re

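def word_lengths(sentence):
   # Split on runs of whitespace and drop the empty strings that
   # re.split() returns when the sentence starts or ends with whitespace
   words = [word for word in re.split(r'\s+', sentence) if word]

   # Use a name that does not shadow the function itself
   lengths = []
   for word in words:
      lengths.append(len(word))

   return lengths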

sentence = "This is a sample sentence"
result = word_lengths(sentence)
print(result)

Output

[4, 2, 1, 6, 8]
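One edge case is worth knowing when calling re.split() directly: if the input starts or ends with whitespace, the result includes an empty string at that end, which would be counted as a word of length 0. The sketch below uses a made-up padded string and illustrative variable names to show the effect and one way to avoid it:

```python
import re

padded = "  This is\ta sample\nsentence  "

# re.split() keeps an empty string at each end when the input starts or
# ends with whitespace, so those would be counted as zero-length words.
raw_words = re.split(r"\s+", padded)

# Stripping the input first avoids the empty strings for this input.
clean_words = re.split(r"\s+", padded.strip())

raw_lengths = [len(w) for w in raw_words]
clean_lengths = [len(w) for w in clean_words]

print(raw_lengths)    # [0, 4, 2, 1, 6, 8, 0]
print(clean_lengths)  # [4, 2, 1, 6, 8]
```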

Using a Dictionary to store word lengths

This method stores word lengths in a dictionary, one of Python's built-in data structures. We iterate over the words, compute the length of each one with len(), and add a key-value pair to the dictionary: the word is the key and its length is the value.

The resulting dictionary maps each distinct word to its length, which makes it simple to look up the length of a given word in later processing or analysis. Because dictionary keys are unique, a word that appears more than once in the sentence is stored only once.

Algorithm

Input the sentence containing words.
Split the sentence into words using the split() method, which will create a list of words for our input string.
Initialise an empty dictionary to store word lengths.
Iterate through each word in the list of words as below:
a. Calculate the length of the current word.
b. Add an entry to the dictionary with the word as the key and its length as the value.
Output the dictionary containing word lengths.

Example

def word_lengths(sentence):

   words = sentence.split()
   word_lengths_dict = {}
   
   for word in words:
      length = len(word)
      word_lengths_dict[word] = length
   
   return word_lengths_dict

sentence = "This is how the length of words is calculated"
result = word_lengths(sentence)
print(result)

Output

{'This': 4, 'is': 2, 'how': 3, 'the': 3, 'length': 6, 'of': 2, 'words': 5, 'calculated': 10}
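Note that "is" occurs twice in the sentence, but a dictionary holds each key only once. As a minimal sketch (the names lengths_by_word and length_pairs are illustrative, not part of the example above), the following compares the dictionary with a list of (word, length) pairs, which keeps every occurrence in order:

```python
sentence = "This is how the length of words is calculated"
words = sentence.split()

# Dictionary keys are unique, so the second "is" overwrites the first.
lengths_by_word = {word: len(word) for word in words}

# A list of (word, length) pairs keeps every occurrence, in order.
length_pairs = [(word, len(word)) for word in words]

print(len(words))            # 9
print(len(lengths_by_word))  # 8
print(length_pairs[1], length_pairs[7])  # ('is', 2) ('is', 2)
```

The dictionary is convenient for lookups by word, while the list of pairs is the better fit when the position or count of repeated words matters.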

Conclusion

To conclude, we examined four Python methods for calculating word lengths in a string. The first two approaches (a loop with split(), and map() with len and split()) are simple and sufficient for straightforward tasks. The third, re.split() from the re module, uses regular expressions and suits text that needs more flexible separators. Finally, a dictionary of word lengths makes it convenient to look up the length of a particular word. Which one to choose depends on the requirements of the task and on coding preference.

Aayush Shukla

Updated on: 03-Aug-2023
