Fast XML parsing using Expat in Python


Python allows XML data to be read and processed through its inbuilt module called expat. It is a non-validating XML parser. it creates an XML parser object and captures the attributes of its objects into various handler functions. In the below example we will see how the various handler functions can help us read the XML file as well as give the attribute values as the output data. This generated data can be used for the processing.

Example

import xml.parsers.expat
# Capture the first element
def first_element(tag, attrs):
   print ('first element:', tag, attrs)
# Capture the last element
def last_element(tag):
   print ('last element:', tag)
# Capture the character Data
def character_value(value):
   print ('Character value:', repr(value))
parser_expat = xml.parsers.expat.ParserCreate()
parser_expat.StartElementHandler = first_element
parser_expat.EndElementHandler = last_element
parser_expat.CharacterDataHandler = character_value
parser_expat.Parse(""" <?xml version="1.0"?>
<parent student_rollno="15">
<child1 Student_name="Krishna"> Strive for progress, not perfection</child1>
<child2 student_name="vamsi"> There are no shortcuts to any place worth going</child2>
</parent>""", 1)

Output

Running the above code gives us the following result −

first element: parent {'student_rollno': '15'}
Character value: '\n'
first element: child1 {'Student_name': 'Krishna'}
Character value: 'Strive for progress, not perfection'
last element: child1
Character value: '\n'
first element: child2 {'student_name': 'vamsi'}
Character value: ' There are no shortcuts to any place worth going'
last element: child2
Character value: '\n'
last element: parent

Updated on: 04-Feb-2020

782 Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements