How to create Python objects based on an XML file?


XML (Extensible Markup Language) is a markup language used to structure, store, and transfer data between systems. Python programs often need to read or write XML data.

The untangle library lets us create Python objects from an XML file. It is a small Python library that converts an XML document into a Python object.

untangle has a very simple API: we just call the parse() function to get a Python object.

Syntax

untangle.parse(filename, **parser_features)

Parameters

  • filename: an XML string, the path to an XML file, or a URL.

  • **parser_features: Extra arguments which are treated as feature values to pass to parser.setFeature().

Return Value

The function parses the input and returns a Python object that represents the given XML document.

Installation of Untangle Library

To use the untangle.parse() function, we need to install the library first, using one of the commands below.

Installing using pip

pip install untangle

Installing using anaconda

conda install -c conda-forge untangle

Example

Let’s take an XML string and create Python objects.

xml = """<main>
<object1 attr="name">content</object1>
<object1 attr="foo">contenbar</object1>
<test>me</test>
</main>"""

import untangle
doc = untangle.parse(xml) # reading XML string data
obj1 = doc.main.object1
print(obj1)
print('-------------')
obj2 = doc.main.test
print(obj2)

Output

[Element(name = object1, attributes = {'attr': 'name'}, cdata = content), Element(name = object1, attributes = {'attr': 'foo'}, cdata = contenbar)]
-------------
Element <test> with attributes {}, children [] and cdata me

Using the untangle library, we have converted the XML data into Python objects. Next, let's take an XML file (file.xml) and convert it to Python objects in the same way. The data in the XML file looks like this:

<?xml version="1.0"?>
<root>
    <child name="child1"/>
</root>

Now, read the above XML file to create a Python object.

import untangle
obj = untangle.parse('path/to/file.xml')
obj.root.child['name'] 

After creating the Python object for the XML data, we can reach child elements as attributes of the object and read an element's XML attributes with subscript syntax, as above.

Output

'child1'

Example

Let’s take a real-world example: the RSS feed of Planet Python. We extract the post titles and their URLs.

import untangle

xml = "https://planetpython.org/rss20.xml" 
obj = untangle.parse(xml) 

for item in obj.rss.channel.item:
    title = item.title.cdata
    link = item.link.cdata
    print(title)
    print('   ', link) 

Output

IslandT: Python Tutorial -- Chapter 4
    https://islandtropicaman.com/wp/2022/09/15/python-tutorial-chapter-4/
Tryton News: Debian integration packages for Tryton
    https://discuss.tryton.org/t/debian-integration-packages-for-tryton/5531
Python Does What?!: Mock Everything
    https://www.pythondoeswhat.com/2022/09/mock-everything.html
The Python Coding Blog: Functions in Python are Like a Coffee Machine
    https://thepythoncodingbook.com/2022/09/14/functions-in-python-are-like-coffee-machines/
Real Python: How to Replace a String in Python
    https://realpython.com/replace-string-python/
Python for Beginners: Select Row From a Dataframe in Python
    https://www.pythonforbeginners.com/basics/select-row-from-a-dataframe-in-python
PyCoder's Weekly: Issue #542 (Sept. 13, 2022)
    https://pycoders.com/issues/542
...

In this example, we pass the URL of the XML data to the parse() function and then iterate over the item elements using a for loop.

Updated on: 24-Aug-2023
