Get tag name using BeautifulSoup in Python


BeautifulSoup is one of the most widely used Python packages for web scraping. It parses HTML and XML documents, making it simpler and quicker to extract data from websites. Getting the tag name of a given element is one of the most frequent tasks when working with these documents.

BeautifulSoup can be installed with the following command:

pip install beautifulsoup4

Approach

Using the name attribute

Method 1: Using the name attribute

This method reads the name attribute of a BeautifulSoup Tag object, which returns the name of the tag as a string. Below is the syntax of the name attribute:

Syntax

tag.name

Return type: a string containing the name of the tag.
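
As a quick illustration of the attribute, the sketch below collects the name of every tag in a small document. The markup and the variable names (sample_html, tag_names) are made up for this illustration; find_all(True) is used here because it matches every tag in the document.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only for this illustration
sample_html = "<html><head><title>Demo</title></head><body><p>Hello</p></body></html>"

soup = BeautifulSoup(sample_html, "html.parser")

# find_all(True) matches every tag; .name gives each tag's name as a string
tag_names = [tag.name for tag in soup.find_all(True)]

print(tag_names)  # ['html', 'head', 'title', 'body', 'p']
```

The names come back in document order, and each one is a plain Python string.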

Algorithm

Import the BeautifulSoup module.
Define a multi-line string containing the HTML document from which the tag will be retrieved.
Create a BeautifulSoup object by supplying the HTML document and a parser as inputs to the BeautifulSoup constructor. The html.parser is being used as the parser in this case.
Find the first occurrence of the <p> tag in the document using the soup.find() method.
Use the name attribute to get the name of the p Tag object.
Print the tag name using the print() function.

Example 1

The following example demonstrates this approach:

from bs4 import BeautifulSoup

# HTML document to be parsed
html_doc = """
<html>
<head>
   <title>TutorialsPoint</title>
</head>
<body>
   <p>TutorialsPoint</p>
</body>
</html>
"""

# Parse the HTML document using BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')

# Get the first <p> tag in the HTML document
p_tag = soup.find('p')

# Get the tag name using the name attribute
tag_name = p_tag.name

# Print the tag name
print("Tag name is:", tag_name)

Output

Tag name is: p
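
Note that soup.find() returns None when no matching tag exists, so reading name on the result would raise an AttributeError. A minimal sketch of one way to guard against this is shown below; the helper get_tag_name is a hypothetical name introduced for this illustration and is not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

def get_tag_name(soup, tag):
    """Return the name of the first matching tag, or None if nothing matches.

    Hypothetical helper for illustration; not a BeautifulSoup function.
    """
    element = soup.find(tag)
    if element is None:
        # find() found nothing, so there is no name to read
        return None
    return element.name

soup = BeautifulSoup("<body><p>TutorialsPoint</p></body>", "html.parser")

print(get_tag_name(soup, "p"))   # p
print(get_tag_name(soup, "h1"))  # None
```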

Example 2

In this example, we parse an XML document and get the tag name of a custom tag. Note that the 'xml' parser requires the lxml package to be installed (for example, with pip install lxml).

from bs4 import BeautifulSoup

xml_doc = '''
<book>
    <title>Harry Potter</title>
    <author>J.K. Rowling</author>
    <publisher>Bloomsbury</publisher>
</book>
'''

# Parse the XML document using BeautifulSoup
soup = BeautifulSoup(xml_doc, 'xml')

# Get the first <author> tag in the XML document
tag = soup.find('author')

# Get the tag name using the name attribute
tag_name = tag.name

# Print the tag name
print("Tag name is:", tag_name)

Output

Tag name is: author

Example 3

In this example, we find a tag by its class and then use the name attribute to get the name of the tag.

from bs4 import BeautifulSoup

# HTML document to be parsed
html_doc = """
<html>
<head>
   <title class="tut">TutorialsPoint</title>
</head>
<body>
   <p>TutorialsPoint</p>
</body>
</html>
"""

# Parse the HTML document using BeautifulSoup constructor
soup = BeautifulSoup(html_doc, 'html.parser')

# Get the tag using its class (this matches the <title> tag)
tag = soup.find(class_='tut')

# Get the tag name using the name attribute
tag_name = tag.name

# Print the tag name
print("Tag name is:", tag_name)

Output

Tag name is: title
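
When several different tags share a class, the name attribute tells them apart. The sketch below is a variation on the document above, with a second class="tut" added to a <p> tag purely for illustration; it uses find_all() instead of find() to collect every match.

```python
from bs4 import BeautifulSoup

# Variation of the example document: two different tags share the class "tut"
html_doc = """
<html>
<head>
   <title class="tut">TutorialsPoint</title>
</head>
<body>
   <p class="tut">TutorialsPoint</p>
   <p>Other text</p>
</body>
</html>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# find_all() returns every tag with the class, in document order
matches = soup.find_all(class_="tut")

# Read the name of each matching tag
names = [tag.name for tag in matches]

print(names)  # ['title', 'p']
```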

Example 4

In this example, we find a tag by its id and then use the name attribute to get the name of the tag.

from bs4 import BeautifulSoup

# HTML document to be parsed
html_doc = """
<html>
<head>
   <title id="tut">TutorialsPoint</title>
</head>
<body>
   <p>TutorialsPoint</p>
</body>
</html>
"""

# Parse the HTML document using BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')

# Get the tag using its id (this matches the <title> tag)
tag = soup.find(id='tut')

# Get the tag name using the name attribute
tag_name = tag.name

# Print the tag name
print("Tag name is:", tag_name)

Output

Tag name is: title

Conclusion

BeautifulSoup is a robust Python module that makes parsing HTML and XML documents simple. It offers a variety of tools and options for searching, navigating, and modifying the document tree.

Every example above follows the same pattern: locate a Tag object, whether by tag name, class, or id, and read its name attribute. You can choose the lookup that best fits the structure of your document and your personal preference for writing the code.

Tarandeep Singh

Updated on: 2023-05-29T12:29:04+05:30
