Get Tag Name Using BeautifulSoup in Python
BeautifulSoup is one of the most widely used Python packages for web scraping and parsing HTML and XML documents. One of the most common tasks when working with HTML and XML documents is extracting the tag name of specific elements.
BeautifulSoup can be installed with the following command:
pip install beautifulsoup4
Using the name Attribute
The most straightforward method to get tag names in BeautifulSoup is using the name attribute of the Tag object. This attribute returns a string value containing the name of the tag.
Syntax
tag.name
Return Type: String value containing the name of the tag.
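As a small sketch of the attribute in use, the snippet below collects the name of every tag in a document. It assumes find_all(True), which matches all tags, and uses an illustrative HTML string that is not one of this article's examples.

```python
from bs4 import BeautifulSoup

# Illustrative document (an assumption for this sketch)
html_doc = "<html><head><title>Demo</title></head><body><p>Hello</p></body></html>"

soup = BeautifulSoup(html_doc, "html.parser")

# find_all(True) matches every tag; .name gives each tag's name as a string
tag_names = [tag.name for tag in soup.find_all(True)]

print(tag_names)  # ['html', 'head', 'title', 'body', 'p']
```

The names come back in document order, so the list mirrors the nesting of the markup from the outermost tag inward.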
Example 1: Basic Tag Name Extraction
This example demonstrates extracting the tag name from an HTML document:
from bs4 import BeautifulSoup
# HTML document to be parsed
html_doc = """
<html>
<head>
<title>TutorialsPoint</title>
</head>
<body>
<p>TutorialsPoint</p>
</body>
</html>
"""
# Parse the HTML document using BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')
# Get the first <p> tag in the HTML document
p_tag = soup.find('p')
# Get the tag name using the name attribute
tag_name = p_tag.name
# Print the tag name
print("Tag name is:", tag_name)
Output:
Tag name is: p
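One detail worth keeping in mind when comparing names: with the html.parser backend, tag names are reported in lowercase even if the markup writes them in uppercase. The sketch below illustrates this with a made-up snippet; other parsers may behave differently, which is not shown here.

```python
from bs4 import BeautifulSoup

# Markup with uppercase tags (illustrative input, not from the article)
html_doc = "<DIV><P>Hello</P></DIV>"

soup = BeautifulSoup(html_doc, "html.parser")

# html.parser normalizes tag names to lowercase while parsing
div_name = soup.find("div").name
p_name = soup.find("p").name

print(div_name, p_name)  # div p
```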
Example 2: XML Document Parsing
In this example, we parse an XML document and extract the tag name from a custom tag. The 'xml' parser requires the lxml package to be installed (pip install lxml):
from bs4 import BeautifulSoup
xml_doc = '''
<book>
<title>Harry Potter</title>
<author>J.K. Rowling</author>
<publisher>Bloomsbury</publisher>
</book>
'''
# Parse the XML document using BeautifulSoup
soup = BeautifulSoup(xml_doc, 'xml')
# Get the first <author> tag in the XML document
tag = soup.find('author')
# Get the tag name using the name attribute
tag_name = tag.name
# Print the tag name
print("Tag name is:", tag_name)
Output:
Tag name is: author
Example 3: Finding Tag by Class
This example shows how to get the tag name after finding an element by its class attribute:
from bs4 import BeautifulSoup
# HTML document to be parsed
html_doc = """
<html>
<head>
<title class="tut">TutorialsPoint</title>
</head>
<body>
<p>TutorialsPoint</p>
</body>
</html>
"""
# Parse the HTML document using BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')
# Get the tag using its class
tag_element = soup.find(class_='tut')
# Get the tag name using the name attribute
tag_name = tag_element.name
# Print the tag name
print("Tag name is:", tag_name)
Output:
Tag name is: title
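Searching by class is where the name attribute tends to be most useful, because the same class can sit on different kinds of tags. The following sketch, using a hypothetical class called "note" and made-up markup, lists which tag types matched.

```python
from bs4 import BeautifulSoup

# Illustrative markup: the same class appears on two different tags
html_doc = """
<div class="note">First</div>
<p class="note">Second</p>
<span class="other">Third</span>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# find_all(class_=...) returns every tag carrying that class, whatever its type
matched_names = [tag.name for tag in soup.find_all(class_="note")]

print(matched_names)  # ['div', 'p']
```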
Example 4: Finding Tag by ID
This example demonstrates getting the tag name after finding an element by its ID attribute:
from bs4 import BeautifulSoup
# HTML document to be parsed
html_doc = """
<html>
<head>
<title id="tut">TutorialsPoint</title>
</head>
<body>
<p>TutorialsPoint</p>
</body>
</html>
"""
# Parse the HTML document using BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')
# Get the tag using its id
tag_element = soup.find(id='tut')
# Get the tag name using the name attribute
tag_name = tag_element.name
# Print the tag name
print("Tag name is:", tag_name)
Output:
Tag name is: title
Key Points
- The name attribute is the primary method to get tag names in BeautifulSoup
- It works with both HTML and XML documents
- You can find tags using various methods (find(), class, ID) and then access their names
- Always ensure the tag exists before accessing its name attribute to avoid errors
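The last point matters because find() returns None when nothing matches, and reading .name from None raises an AttributeError. A minimal sketch of a guard follows; get_tag_name is a hypothetical helper written for this illustration, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup


def get_tag_name(soup, **filters):
    """Return the name of the first matching tag, or None if nothing matches.

    Hypothetical helper: it forwards keyword filters such as id=... to find().
    """
    tag = soup.find(**filters)
    return tag.name if tag is not None else None


soup = BeautifulSoup('<p id="intro">Hello</p>', "html.parser")

found_name = get_tag_name(soup, id="intro")      # a matching tag exists
missing_name = get_tag_name(soup, id="missing")  # no tag has this id

print(found_name)    # p
print(missing_name)  # None
```

Returning None keeps the check in one place; callers that would rather fail loudly could raise an exception instead.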
Conclusion
BeautifulSoup's name attribute provides a simple and reliable way to extract tag names from HTML and XML documents. Whether you're finding elements by tag name, class, or ID, the name attribute consistently returns the tag name as a string.
