- Trending Categories
Data Structure
Networking
RDBMS
Operating System
Java
MS Excel
iOS
HTML
CSS
Android
Python
C Programming
C++
C#
MongoDB
MySQL
Javascript
PHP
Physics
Chemistry
Biology
Mathematics
English
Economics
Psychology
Social Studies
Fashion Studies
Legal Studies
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
Pretty Printing XML in Python
When dealing with XML data in Python, ensuring its readability and structure can greatly enhance code comprehension and maintainability. Pretty printing XML, or formatting it with proper indentation and line breaks, is a valuable technique for achieving these goals.
In this article, we explore two different methods to pretty print XML using Python: xml.dom.minidom and xml.etree.ElementTree. By understanding these approaches, developers can effectively present XML data in an organized and visually appealing manner, facilitating easier analysis and manipulation.
How to Pretty print XML in Python?
Below are the two methods using which we can perform pretty printing in Python −
Method 1: Using xml.dom.minidom
Below are the steps that we will follow to perform pretty printing using xml.dom.minidom −
Import the required module: We start by importing the `xml.dom.minidom` module, which provides a lightweight implementation of the Document Object Model (DOM) API for XML parsing.
Define the `pretty_print_xml_minidom` function: This function takes an XML string as input and is responsible for parsing and pretty printing the XML using `xml.dom.minidom`.
Parse the XML string: Inside the `pretty_print_xml_minidom` function, we use the `xml.dom.minidom.parseString()` method to parse the XML string and create a DOM object.
Pretty print the XML: Next, we use the `toprettyxml()` method on the DOM object to generate a pretty-printed XML string. We pass the `indent` parameter with a value of `" "` to specify the desired indentation level (two spaces in this case).
Remove empty lines: By default, `toprettyxml()` adds empty lines in the output. To remove these empty lines, we split the pretty XML string by newlines (`\n`), remove any leading or trailing whitespace from each line, and then join the non-empty lines back together.
Print the pretty XML: Finally, we print the resulting pretty XML string.
Program 2: Using xml.etree.ElementTree
Below are the steps that we will be following to perform pretty printing using xml.etree.ElementTree −
Import the required module: We start by importing the `xml.etree.ElementTree` module, which provides a fast and efficient API for parsing and manipulating XML.
Define the `indent` function: This is a custom function that recursively adds indentation to the XML elements. It takes an `elem` parameter (XML element) and an optional `level` parameter to specify the current indentation level (defaulted to 0).
Indent the XML: Inside the `indent` function, we add indentation by modifying the `text` and `tail` attributes of the XML elements. The `text` attribute represents the text immediately after the opening tag, and the `tail` attribute represents the text immediately before the closing tag. By adding indentation to these attributes, we achieve pretty printing.
Define the `pretty_print_xml_elementtree` function: This function takes an XML string as input and is responsible for parsing and pretty printing the XML using `xml.etree.ElementTree`.
Parse the XML string: Inside the `pretty_print_xml_elementtree` function, we use the `ET.fromstring()` method to parse the XML string and create an ElementTree object.
Indent the XML: We call the `indent()` function on the root element of the XML to add the indentation recursively to all elements.
Convert the XML element back to a string: We use the `ET.tostring()` method to convert the XML element back to a string representation. We pass the `encoding` parameter with a value of `"unicode"` to ensure proper encoding of the resulting string.
Print the pretty XML: Finally, we print the resulting pretty XML string.
Both programs provide different approaches to achieve pretty printing of XML. The first program utilizes the DOM API provided by `xml.dom.minidom` to parse and pretty print XML, while the second program uses the `xml.etree.ElementTree` module and defines a custom function to add indentation recursively to XML elements.
Below are the program examples using the above two methods −
Program 1: Using xml.dom.minidom
import xml.dom.minidom def pretty_print_xml_minidom(xml_string): # Parse the XML string dom = xml.dom.minidom.parseString(xml_string) # Pretty print the XML pretty_xml = dom.toprettyxml(indent=" ") # Remove empty lines pretty_xml = "\n".join(line for line in pretty_xml.split("\n") if line.strip()) # Print the pretty XML print(pretty_xml) # Example usage xml_string = ''' <root> <element attribute="value"> <subelement>Text</subelement> </element> </root> ''' pretty_print_xml_minidom(xml_string)
Output
<?xml version="1.0" ?> <root> <element attribute="value"> <subelement>Text</subelement> </element> </root>
Program 2: Using xml.etree.ElementTree
import xml.etree.ElementTree as ET def indent(elem, level=0): # Add indentation indent_size = " " i = "\n" + level * indent_size if len(elem): if not elem.text or not elem.text.strip(): elem.text = i + indent_size if not elem.tail or not elem.tail.strip(): elem.tail = i for elem in elem: indent(elem, level + 1) if not elem.tail or not elem.tail.strip(): elem.tail = i else: if level and (not elem.tail or not elem.tail.strip()): elem.tail = i def pretty_print_xml_elementtree(xml_string): # Parse the XML string root = ET.fromstring(xml_string) # Indent the XML indent(root) # Convert the XML element back to a string pretty_xml = ET.tostring(root, encoding="unicode") # Print the pretty XML print(pretty_xml) # Example usage xml_string = ''' <root> <element attribute="value"> <subelement>Text</subelement> </element> </root> ''' pretty_print_xml_elementtree(xml_string)
Output
<root> <element attribute="value"> <subelement>Text</subelement> </element> </root>
Conclusion
In cocnclusion, Pretty printing XML in Python is essential for improving the readability and structure of XML data. Whether using the xml.dom.minidom or xml.etree.ElementTree library, developers can easily format XML with proper indentation. By adopting these techniques, programmers can enhance code comprehension, simplify debugging, and promote better collaboration when working with XML data in Python projects.