Beautiful Soup - decode() Method



Method Description

The decode() method in Beautiful Soup returns a Unicode string representation of the parse tree, or of a single tag, as an HTML or XML document. It is the counterpart of the encode() method: you call encode() to get a bytestring, and decode() to get a Unicode string. It should not be confused with Python's built-in bytes.decode(), which converts an existing bytestring back to text. Let us study the decode() method with some examples.

Syntax

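decode(pretty_print, eventual_encoding, formatter)

Parameters

  • pretty_print − If this is True, indentation will be used to make the document more readable.

  • eventual_encoding − The encoding the document is expected to be converted to later. decode() does not perform this conversion itself; the value is only used to fill in encoding declarations such as an XML declaration or a <meta> charset. The default is 'utf-8'.

  • formatter − A Formatter object, or a string naming one of the standard formatters ('minimal', 'html', 'html5'), which controls how special characters are escaped in the output. The default is 'minimal'.

Unlike encode(), decode() takes no errors parameter, because no conversion to bytes takes place.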
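The effect of the formatter parameter is easiest to see side by side. The following is a minimal sketch, assuming the bs4 package is installed; the sample markup and the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Illustrative markup containing a non-ASCII character and an ampersand
soup = BeautifulSoup("<p>Caf\u00e9 &amp; bar</p>", "html.parser")

# Default formatter ("minimal"): only escapes what is needed for valid markup
minimal = soup.decode()

# "html" formatter: uses named HTML entities where one is available
as_html = soup.decode(formatter="html")

# formatter=None: no entity substitution at all (may produce invalid markup)
raw = soup.decode(formatter=None)

print(minimal)  # <p>Café &amp; bar</p>
print(as_html)
print(raw)      # <p>Café & bar</p>
```

Passing formatter=None is the fastest option, but because the ampersand is left unescaped, the result is not guaranteed to be valid HTML or XML.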

Return Value

The decode() method returns a Unicode string (a Python str object).
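A quick way to confirm the return type is to compare it with the result of encode(). This is a small sketch, assuming the bs4 package is installed; the markup and variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<b>caf\u00e9</b>", "html.parser")

text = soup.decode()         # Unicode string rendering of the tree
data = soup.encode("utf-8")  # the same rendering as UTF-8 bytes

print(type(text).__name__)   # str
print(type(data).__name__)   # bytes

# For this simple document, decoding the bytes with Python's own
# bytes.decode() gives back the same text that soup.decode() produced.
print(data.decode("utf-8") == text)
```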

Example

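from bs4 import BeautifulSoup

soup = BeautifulSoup("Hello “World!”", 'html.parser')
# encode() renders the parse tree as a UTF-8 bytestring
enc = soup.encode('utf-8')
print(enc)
# decode() renders the same parse tree as a Unicode string
dec = soup.decode()
print(dec)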

Output

b'Hello \xe2\x80\x9cWorld!\xe2\x80\x9d'
Hello “World!”
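decode() can also be called on an individual tag rather than on the whole document. The sketch below, which assumes the bs4 package is installed and uses made-up markup, compares it with the related decode_contents() method, which leaves out the tag's own start and end markup.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<div><p>Hello <b>World</b></p></div>", "html.parser")
p = soup.find("p")

outer = p.decode()           # markup including the <p> tag itself
inner = p.decode_contents()  # only what is inside the <p> tag

print(outer)  # <p>Hello <b>World</b></p>
print(inner)  # Hello <b>World</b>
```

Calling str() on a tag gives the same result as calling decode() with no arguments, so decode() is mainly useful when you want to pass a non-default formatter.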