Pretty Printing XML in Python

XML produced by programs is often written on a single line, which makes it hard to read and debug. Pretty printing reformats the XML with consistent indentation and line breaks so that the nesting of elements is visible at a glance.

This article covers two standard-library approaches to pretty printing XML in Python: xml.dom.minidom and xml.etree.ElementTree. Both parse the XML into a tree and serialize it again with indentation, so neither needs a third-party package.

Method 1: Using xml.dom.minidom

The xml.dom.minidom module provides a lightweight DOM implementation that makes pretty printing straightforward with its built-in toprettyxml() method.

import xml.dom.minidom

def pretty_print_xml_minidom(xml_string):
    # Parse the XML string
    dom = xml.dom.minidom.parseString(xml_string)
    
    # Pretty print the XML with 2-space indentation
    pretty_xml = dom.toprettyxml(indent="  ")
    
    # Drop whitespace-only lines; toprettyxml() emits them when the input is already indented
    pretty_xml = "\n".join(line for line in pretty_xml.split("\n") if line.strip())
    
    return pretty_xml

# Example XML string
xml_string = '<root><person id="1"><name>John</name><age>30</age></person><person id="2"><name>Jane</name><age>25</age></person></root>'

result = pretty_print_xml_minidom(xml_string)
print(result)

Output:

<?xml version="1.0" ?>
<root>
  <person id="1">
    <name>John</name>
    <age>30</age>
  </person>
  <person id="2">
    <name>Jane</name>
    <age>25</age>
  </person>
</root>
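
The line filter in the function above matters mainly when the input is already indented. The following minimal sketch, which uses a made-up input string (indented_xml is a hypothetical name, not part of the original example), shows toprettyxml() emitting whitespace-only lines for such input and the same filter removing them:

```python
import xml.dom.minidom

# Hypothetical input that already contains indentation and line breaks
indented_xml = "<root>\n  <item>1</item>\n</root>"

# minidom keeps the existing whitespace as text nodes, and toprettyxml()
# writes each of them on a line of its own
raw = xml.dom.minidom.parseString(indented_xml).toprettyxml(indent="  ")
has_blank_lines = any(not line.strip() for line in raw.splitlines())

# Same filter as in pretty_print_xml_minidom(): keep only non-blank lines
cleaned = "\n".join(line for line in raw.splitlines() if line.strip())

print(has_blank_lines)
print(cleaned)
```

For compact single-line input, such as the example string used in this article, there are no whitespace text nodes, so the filter only removes the trailing empty line.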

Method 2: Using xml.etree.ElementTree

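Before Python 3.9, the xml.etree.ElementTree module had no built-in pretty-printing helper, so a small recursive indentation function is needed. Python 3.9 added ET.indent(), which does the same job. Writing the function yourself works on every version and gives full control over the formatting.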

import xml.etree.ElementTree as ET

def indent(elem, level=0):
    """Recursively add indentation to XML elements"""
    indent_size = "  "
    i = "\n" + level * indent_size
    
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = i + indent_size
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
        for child in elem:
            indent(child, level + 1)
        # After the loop, child is the last child: dedent its tail so the parent's closing tag lines up
        if not child.tail or not child.tail.strip():
            child.tail = i
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i

def pretty_print_xml_elementtree(xml_string):
    # Parse the XML string
    root = ET.fromstring(xml_string)
    
    # Add indentation
    indent(root)
    
    # Convert back to string
    pretty_xml = ET.tostring(root, encoding="unicode")
    
    return pretty_xml

# Example XML string
xml_string = '<root><person id="1"><name>John</name><age>30</age></person><person id="2"><name>Jane</name><age>25</age></person></root>'

result = pretty_print_xml_elementtree(xml_string)
print(result)

Output:

<root>
  <person id="1">
    <name>John</name>
    <age>30</age>
  </person>
  <person id="2">
    <name>Jane</name>
    <age>25</age>
  </person>
</root>
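
On Python 3.9 and later, the standard library's ET.indent() can replace the custom function. The sketch below assumes Python 3.9+; the function name pretty_print_xml_indent and its space parameter are illustrative choices, not part of the original example:

```python
import xml.etree.ElementTree as ET

def pretty_print_xml_indent(xml_string, space="  "):
    # Parse the XML string into an Element
    root = ET.fromstring(xml_string)

    # ET.indent() (Python 3.9+) rewrites text/tail whitespace in place
    ET.indent(root, space=space)

    # Serialize back to a str rather than bytes
    return ET.tostring(root, encoding="unicode")

xml_string = '<root><person id="1"><name>John</name><age>30</age></person></root>'
pretty = pretty_print_xml_indent(xml_string)
print(pretty)
```

Passing a different space value, for example four spaces or a tab, changes the indentation unit without touching the rest of the code.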

Comparison

| Method | Pros | Cons | Best For |
|---|---|---|---|
| xml.dom.minidom | Built-in pretty printing with toprettyxml(), XML declaration included | Emits blank lines when the input is already indented, higher memory usage | Quick formatting with minimal code |
| xml.etree.ElementTree | More control, memory efficient | Needs a custom indentation function on Python versions before 3.9 | Large XML files, custom formatting needs |
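
The XML declaration is listed as a minidom advantage, but ElementTree can emit one as well. The sketch below assumes Python 3.9+ (for ET.indent(); the xml_declaration argument of ET.tostring() is available from Python 3.8). The helper name pretty_print_with_declaration and the sample input are hypothetical:

```python
import xml.etree.ElementTree as ET

def pretty_print_with_declaration(xml_string):
    root = ET.fromstring(xml_string)

    # Indent in place (Python 3.9+)
    ET.indent(root, space="  ")

    # With a byte encoding and xml_declaration=True, tostring() returns bytes
    # that begin with an XML declaration; decode to get a str
    data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
    return data.decode("utf-8")

with_decl = pretty_print_with_declaration("<root><item>1</item></root>")
print(with_decl)
```

Note that the declaration written by ElementTree uses single quotes and names the encoding, so it is not character-for-character identical to the one produced by minidom.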

Conclusion

Pretty printing makes XML easier to read and debug. Use xml.dom.minidom and its built-in toprettyxml() method for simple cases, or choose xml.etree.ElementTree, with a custom indentation function or ET.indent() on Python 3.9 and later, when you need more control over the formatting.

Updated on: 2026-03-27T09:49:03+05:30
