I am parsing an XML file in Python 3 using lxml.objectify:
<root>
    <object_header></object_header>
    <object_details></object_details>
    <object_details></object_details>
    <object_header></object_header>
    <object_details></object_details>
    <object_header></object_header>
</root>
Note that an object sometimes has no details: no <object_details> elements follow its <object_header>, as with the last object above.
My current approach, which works but is inelegant, is the following:
from lxml import objectify, etree
root = objectify.parse(xmlFile).getroot()
elems = [el for el in root.iterchildren()]
# data is a list of objects.
data = []
# Have to instantiate outside of the for loop in case the last object has no details.
objectDetails = ''
# Don't store first object right away.
firstObject = True
# Iterate through each XML element.
for elem in elems:
    if elem.tag == 'object_header':
        # Remember object header info.
        object = storeHeaderInfo(objectDetails)
        # Skip saving if first object, need to grab object details.
        if firstObject:
            # Don't skip again, in case object has no details.
            firstObject = False
            continue
        # Save object, already grabbed object details.
        data.append(object)
    else:
        # Process object details in <object_details> tag.
        # encoding='unicode' makes tostring() return str instead of bytes,
        # so it can be appended to the str accumulator in Python 3.
        objectDetails += etree.tostring(elem, encoding='unicode')
# Save last object.
object = storeHeaderInfo(objectDetails)
data.append(object)
What I don't like is that I have to write the code that stores an object twice: once inside the for loop, and again after it for the last object.
Is there a more Pythonic or elegant way of doing this?
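One direction that might avoid the duplicated save is to store each object as soon as its header is seen and attach later details to the most recent one. The sketch below is only an illustration under assumptions: it uses the standard library's xml.etree.ElementTree instead of lxml.objectify so that it runs on its own, `SAMPLE` and `group_objects` are made-up names, and the real storeHeaderInfo step is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mirroring the structure shown in the question.
SAMPLE = """<root>
  <object_header></object_header>
  <object_details></object_details>
  <object_details></object_details>
  <object_header></object_header>
  <object_details></object_details>
  <object_header></object_header>
</root>"""


def group_objects(root):
    """Return one (header, details_list) pair per <object_header>, in document order."""
    groups = []
    for elem in root:
        if elem.tag == 'object_header':
            # A header starts a new object, which is stored immediately,
            # so neither the first nor the last object needs special handling.
            groups.append((elem, []))
        elif groups:
            # Any other element belongs to the most recent header.
            groups[-1][1].append(elem)
    return groups


root = ET.fromstring(SAMPLE)
groups = group_objects(root)
detail_counts = [len(details) for _header, details in groups]
print(detail_counts)  # [2, 1, 0]
```

With lxml.objectify the loop would presumably iterate over root.iterchildren(), as in the code above, since iterating an objectified element directly walks its same-tag siblings rather than its children. Each (header, details) pair could then be turned into an object in a single place.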