Beautiful Soup - clear() Method



Method Description

The clear() method in the Beautiful Soup library removes the inner content of a tag while keeping the tag itself intact. Any child elements are removed by calling the extract() method on them. If the decompose argument is set to True, the decompose() method is called instead of extract().
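
As a minimal sketch of the default behaviour described above (the markup and variable names here are illustrative assumptions, not part of the original examples), the following shows that a child removed by clear() is only detached from the tree. A reference taken beforehand still works and can be attached again.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<div><p>Hello</p><span>World</span></div>", "html.parser")
div = soup.div
p = div.p                      # keep a reference to a child before clearing

div.clear()                    # default: decompose=False, children are extracted

after_clear = str(soup)        # the <div> itself stays in the tree
print(after_clear)             # <div></div>
print(p)                       # <p>Hello</p>  (still a usable object)
print(p.parent)                # None          (no longer attached to the tree)

div.append(p)                  # an extracted element can be put back
print(soup)                    # <div><p>Hello</p></div>
```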

Syntax

clear(decompose=False)

Parameters

  • decompose − If True, decompose() (a more destructive method) is called on the child elements instead of extract(). The default is False.
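
The effect of decompose=True can be sketched as follows. This is an illustration under stated assumptions: the markup is made up, and the .decomposed property used for the check is assumed to be available (it was added in Beautiful Soup 4.9.0). A decomposed element is destroyed rather than detached, so it should not be reused afterwards.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<div><p>Hello</p></div>", "html.parser")
div = soup.div
p = div.p                  # reference to the child tag, taken before clearing

div.clear(decompose=True)  # child tags are decomposed instead of extracted

print(soup)                # <div></div>
print(p.decomposed)        # True (assumes Beautiful Soup 4.9.0 or later)
```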

Return Value

The clear() method returns None. It modifies the tag in place.
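
A short sketch of what this means in practice, using made-up markup: the call's result is None, and the change shows up on the tag itself.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<ul><li>one</li><li>two</li></ul>", "html.parser")
ul = soup.ul

result = ul.clear()       # modifies ul in place

print(result)             # None
print(len(ul.contents))   # 0
print(soup)               # <ul></ul>
```

Because nothing is returned, assigning the result of clear() to a variable is rarely useful. Keep a reference to the tag instead.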

Example - Clearing entire document

Because clear() is called on the soup object, which represents the entire document, all of the content is removed and the document is left blank.

html = '''
<html>
   <body>
      <p>The quick, brown fox jumps over a lazy dog.</p>
      <p>DJs flock by when MTV ax quiz prog.</p>
      <p>Junk MTV quiz graced by fox whelps.</p>
      <p>Bawds jog, flick quartz, vex nymphs.</p>
   </body>
</html>
'''
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html.parser")
soup.clear()
print(soup)

Output

Nothing is printed apart from an empty line, because the document no longer has any content.

Example - Clearing p tags

In the following example, we find all the <p> tags and call the clear() method on each of them.

html = '''
<html>
   <body>
      <p>The quick, brown fox jumps over a lazy dog.</p>
      <p>DJs flock by when MTV ax quiz prog.</p>
      <p>Junk MTV quiz graced by fox whelps.</p>
      <p>Bawds jog, flick quartz, vex nymphs.</p>
   </body>
</html>
'''
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html.parser")
tags = soup.find_all('p')
# Empty each <p> tag; the tags themselves stay in the document
for tag in tags:
    tag.clear()

print(soup)

Output

The contents of each <p> tag are removed, while the tags themselves are retained (the indentation between tags is omitted below).

<html>
<body>
<p></p>
<p></p>
<p></p>
<p></p>
</body>
</html>

Example - Clearing a tag with the decompose argument set to True

Here we clear the contents of the <body> tag with the decompose argument set to True.

html = '''
<html>
   <body>
      <p>The quick, brown fox jumps over a lazy dog.</p>
      <p>DJs flock by when MTV ax quiz prog.</p>
      <p>Junk MTV quiz graced by fox whelps.</p>
      <p>Bawds jog, flick quartz, vex nymphs.</p>
   </body>
</html>
'''
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html.parser")
body = soup.find('body')
# clear() returns None, so there is no result to store
body.clear(decompose=True)

print(soup)

Output

<html>
<body></body>
</html>