How to create Python objects based on an XML file?
XML (Extensible Markup Language) is a markup language used to structure, store, and transfer data between systems. Python developers often need to read and process XML data in their applications.
The untangle library provides a simple way to create Python objects based on an XML file. This small Python library converts an XML document into a Python object with an intuitive API.
Syntax
untangle.parse(filename, **parser_features)
Parameters
- filename: Can be an XML string, XML filename, or URL
- parser_features: Extra arguments passed to parser.setFeature()
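To make the parser_features parameter more concrete, here is a minimal sketch using only the standard library's xml.sax module. It assumes each keyword name is looked up as a constant in xml.sax.handler and passed to parser.setFeature(), which is what the description above suggests. The helper configure_parser is hypothetical and is not part of untangle.

```python
from xml.sax import handler, make_parser


def configure_parser(**parser_features):
    """Hypothetical helper: build a SAX parser and apply each keyword as a feature."""
    parser = make_parser()
    for name, value in parser_features.items():
        # e.g. "feature_external_ges" resolves to xml.sax.handler.feature_external_ges
        parser.setFeature(getattr(handler, name), value)
    return parser


# Ask the parser not to load external general entities.
parser = configure_parser(feature_external_ges=False)
print(parser.getFeature(handler.feature_external_ges))
```

Under that assumption, a call such as untangle.parse(xml, feature_external_ges=False) would configure the underlying parser in a similar way.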
Return Value
Returns a Python object representing the parsed XML document.
Installation
Install the untangle library using pip or conda:
Using pip:
pip install untangle
Using conda:
conda install -c conda-forge untangle
Example 1: Parsing XML String
Let's parse an XML string and create Python objects:
import untangle
xml = """<main>
<object1 attr="name">content</object1>
<object1 attr="foo">contenbar</object1>
<test>me</test>
</main>"""
doc = untangle.parse(xml)
obj1 = doc.main.object1
print("Object1 elements:")
print(obj1)
print('-' * 20)
obj2 = doc.main.test
print("Test element:")
print(obj2)
Output:
Object1 elements:
[Element(name = object1, attributes = {'attr': 'name'}, cdata = content), Element(name = object1, attributes = {'attr': 'foo'}, cdata = contenbar)]
--------------------
Test element:
Element <test> with attributes {}, children [] and cdata me
Example 2: Parsing XML File
For this example, assume we have an XML file with the following content:
<?xml version="1.0"?>
<root>
<child name="child1"/>
</root>
import untangle
# Note: to keep this example self-contained, the XML is passed as a string.
# In practice, pass the path to your XML file to untangle.parse() instead.
xml_content = """<?xml version="1.0"?>
<root>
<child name="child1"/>
</root>"""
obj = untangle.parse(xml_content)
child_name = obj.root.child['name']
print("Child name attribute:", child_name)
Output:
Child name attribute: child1
Example 3: Real-World RSS Feed
Let's parse an RSS feed to extract post titles and URLs:
import untangle
# Parse RSS feed from URL
xml_url = "https://planetpython.org/rss20.xml"
obj = untangle.parse(xml_url)
# Extract first 3 items for demonstration
for i, item in enumerate(obj.rss.channel.item[:3]):
    title = item.title.cdata
    link = item.link.cdata
    print(f"{i+1}. {title}")
    print(f" {link}")
    print()
Note: The above example requires an internet connection to fetch the RSS feed.
Key Features
| Feature | Description |
|---|---|
| Simple API | Call a single function, untangle.parse() |
| Multiple Input Types | Supports strings, files, and URLs |
| Object Access | Access XML elements as Python attributes |
| Attribute Access | Access XML attributes using dictionary syntax |
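The object-access and attribute-access rows above can be illustrated with a small standard-library sketch. It is not untangle's actual implementation. The Node class and parse_string function are hypothetical names, built on xml.etree.ElementTree, that show how child elements might be exposed as Python attributes and XML attributes through dictionary syntax.

```python
import xml.etree.ElementTree as ET


class Node:
    """Hypothetical wrapper: child tags become attributes, XML attributes use []."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, tag):
        # Only called when normal lookup fails; refuse private names
        # so a missing _element can never cause infinite recursion.
        if tag.startswith("_"):
            raise AttributeError(tag)
        matches = [Node(child) for child in self._element.findall(tag)]
        if not matches:
            raise AttributeError(tag)
        # One match yields a single node, several matches yield a list.
        return matches[0] if len(matches) == 1 else matches

    def __getitem__(self, name):
        # Dictionary syntax reads an XML attribute.
        return self._element.attrib[name]

    @property
    def cdata(self):
        return self._element.text or ""


def parse_string(xml_text):
    """Wrap the document so the root tag is reachable as an attribute."""
    holder = ET.Element("document")
    holder.append(ET.fromstring(xml_text))
    return Node(holder)


doc = parse_string(
    '<main><object1 attr="name">content</object1>'
    '<object1 attr="foo">contenbar</object1><test>me</test></main>'
)
print(doc.main.test.cdata)
print([obj["attr"] for obj in doc.main.object1])
```

The sketch mirrors the behavior seen in Example 1: a tag that appears once is returned as a single object, while a repeated tag is returned as a list.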
Conclusion
The untangle library provides a simple and intuitive way to convert XML data into Python objects. It supports parsing from strings, files, and URLs, making it versatile for various XML processing tasks.
