Beautiful Soup - strings Property



Method Description

For any PageElement with more than one child, the inner text of each can be fetched with the strings property. Unlike the string property, which returns None when a tag has more than one child, strings handles elements that contain multiple children. The strings property returns a generator object that yields, in document order, each NavigableString found among the element's descendants, including strings that consist only of whitespace.
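The difference between string and strings can be seen in a minimal sketch. The markup and the variable names below are made up for this illustration and are not part of the examples that follow.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only for this illustration
markup = '<div><p>Java</p><p>Python</p></div>'
soup = BeautifulSoup(markup, 'html.parser')

div = soup.div
first_p = soup.p

# <div> has two children, so .string cannot pick one and is None
div_string = div.string

# .strings yields every string found under <div>, in document order
div_strings = list(div.strings)

# <p> has a single string child, so .string returns it directly
p_string = first_p.string

print(div_string)
print(div_strings)
print(p_string)
```

Wrapping strings in list() is only a convenience for printing; in a loop the generator can be consumed directly.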

Syntax

Tag.strings

Example 1

You can retrieve the value of the strings property for the soup object as well as for a Tag object. In the following example, the soup object's strings property is checked.

from bs4 import BeautifulSoup

markup = '''
   <div id="Languages">
      <p>Java</p> <p>Python</p> <p>C++</p>
   </div>
'''
soup = BeautifulSoup(markup, 'html.parser')
print ([string for string in soup.strings])

Output

['\n   ', '\n      ', 'Java', ' ', 'Python', ' ', 'C++', '\n   ', '\n']

Note the line breaks and white space in the list. They can be removed with the stripped_strings property.
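As a short sketch of that alternative, the same markup can be read through stripped_strings, which skips whitespace-only strings and strips leading and trailing white space from the rest. The variable name cleaned is chosen here for illustration.

```python
from bs4 import BeautifulSoup

markup = '''
   <div id="Languages">
      <p>Java</p> <p>Python</p> <p>C++</p>
   </div>
'''
soup = BeautifulSoup(markup, 'html.parser')

# stripped_strings drops whitespace-only strings and strips the others
cleaned = list(soup.stripped_strings)
print(cleaned)  # ['Java', 'Python', 'C++']
```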

Example 2

We now obtain the generator object returned by the strings property of the <div> tag, reusing the soup object from Example 1. With a loop, we print the strings.

tag = soup.div

navstrs = tag.strings
for navstr in navstrs:
   print (navstr)

Output

Java
 
Python
 
C++

Note that the line breaks and white space appear in the output; they can be removed with the stripped_strings property.
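Because strings returns a generator, the object stored in a variable such as navstrs can be looped over only once. The sketch below, which uses made-up markup and variable names, shows that a second pass over the same generator yields nothing, while reading the property again gives a fresh generator.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only for this illustration
soup = BeautifulSoup('<div><p>Java</p><p>Python</p></div>', 'html.parser')

navstrs = soup.div.strings

first_pass = list(navstrs)   # consumes the generator
second_pass = list(navstrs)  # the generator is exhausted, so this is empty

# Accessing the property again creates a new generator
fresh_pass = list(soup.div.strings)

print(first_pass)
print(second_pass)
print(fresh_pass)
```

If the strings are needed more than once, storing them in a list first, as first_pass does here, avoids walking the tree again.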
