Beautiful Soup - string Property
Description
In Beautiful Soup, the BeautifulSoup and Tag objects have a convenience property named string. It returns the single string contained in an element. If the element has exactly one string child, the NavigableString corresponding to it is returned. If the element has exactly one child tag, the return value is the string property of that child tag, resolved recursively. If the element has no children, or has more than one child, the string property returns None.
Syntax
Tag.string
Example - String Property of First Tag
The following code parses an HTML string in which a <div> tag encloses three <p> elements, and reads the string property of the first <p> tag.
from bs4 import BeautifulSoup
markup = '''
<div id="Languages">
<p>Java</p> <p>Python</p> <p>C++</p>
</div>
'''
soup = BeautifulSoup(markup, 'html.parser')
# soup.p is the first <p> tag in the document
tag = soup.p
navstr = tag.string
print(navstr, type(navstr))
# Convert the NavigableString to a plain Python str
nav_str = str(navstr)
print(nav_str, type(nav_str))
Output
Java <class 'bs4.element.NavigableString'>
Java <class 'str'>
The string property returns a NavigableString. It can be converted to a regular Python string with the str() function.
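As a minimal sketch of why the conversion can matter: a NavigableString is a subclass of str, so it supports the usual string methods, but it also keeps a reference to its position in the parse tree. The variable names below (is_str, upper, parent_name, plain) are illustrative only and are not part of the Beautiful Soup API.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<p>Java</p>', 'html.parser')
navstr = soup.p.string

# NavigableString subclasses str, so string methods work on it directly.
is_str = isinstance(navstr, str)
upper = navstr.upper()

# Unlike a plain str, it remembers the tag that contains it.
parent_name = navstr.parent.name

# str() produces a plain string with no link back to the parse tree.
plain = str(navstr)

print(is_str, upper, parent_name, type(plain))
```

Converting with str() is a reasonable habit when the text will be kept after the rest of the parsed document is no longer needed.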
Example - String Property of a Tag with Multiple Children
The string property of an element that has more than one child returns None. Here the <div> tag contains three <p> elements (and the whitespace around them), so its string property is None.
from bs4 import BeautifulSoup
markup = '''
<div id="Languages">
<p>Java</p> <p>Python</p> <p>C++</p>
</div>
'''
soup = BeautifulSoup(markup, 'html.parser')
# The <div> has several children, so there is no single string to return
tag = soup.div
navstr = tag.string
print(navstr)
Output
None
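The remaining cases from the description can be sketched as follows: a tag whose only child is another tag, a tag with no children, and one way to collect text when string is None. The markup and variable names here are made up for illustration. The stripped_strings generator is used as one possible alternative for a tag with several children; get_text() would be another.

```python
from bs4 import BeautifulSoup

markup = '<div><b><i>Python</i></b></div><span></span>'
soup = BeautifulSoup(markup, 'html.parser')

# <div> has exactly one child (<b>), which has exactly one child (<i>),
# so .string is resolved recursively down to the innermost string.
nested = soup.div.string
print(nested)

# <span> has no children, so .string is None.
empty = soup.span.string
print(empty)

# For a tag with several children, .string is None, but the individual
# strings can still be collected with the stripped_strings generator.
langs = BeautifulSoup(
    '<div id="Languages"><p>Java</p> <p>Python</p> <p>C++</p></div>',
    'html.parser',
)
texts = list(langs.div.stripped_strings)
print(texts)
```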