用Python打印XML的最佳方法(或各种方法)是什么?


当前回答

import xml.dom.minidom

dom = xml.dom.minidom.parse(xml_fname) # or xml.dom.minidom.parseString(xml_string)
pretty_xml_as_string = dom.toprettyxml()

其他回答

我发现了一个快速简单的方法来格式化和打印一个xml文件:

import xml.etree.ElementTree as ET

xmlTree = ET.parse('your XML file')
xmlRoot = xmlTree.getroot()
xmlDoc =  ET.tostring(xmlRoot, encoding="unicode")

print(xmlDoc)

Outuput:

<root>
  <child>
    <subchild>.....</subchild>
  </child>
  <child>
    <subchild>.....</subchild>
  </child>
  ...
  ...
  ...
  <child>
    <subchild>.....</subchild>
  </child>
</root>

我编写了一个解决方案来遍历现有的ElementTree,并使用text/tail将其缩进。

def prettify(element, indent='  '):
    queue = [(0, element)]  # (level, element)
    while queue:
        level, element = queue.pop(0)
        children = [(level + 1, child) for child in list(element)]
        if children:
            element.text = '\n' + indent * (level+1)  # for child open
        if queue:
            element.tail = '\n' + indent * queue[0][0]  # for sibling open
        else:
            element.tail = '\n' + indent * (level-1)  # for parent close
        queue[0:0] = children  # prepend so children come before siblings

我用几行代码解决了这个问题,打开文件,通过它并添加缩进,然后再次保存它。我使用的是小的xml文件,不想添加依赖项,也不想为用户安装更多的库。总之,这是我最后得出的结论:

    f = open(file_name,'r')
    xml = f.read()
    f.close()

    #Removing old indendations
    raw_xml = ''        
    for line in xml:
        raw_xml += line

    xml = raw_xml

    new_xml = ''
    indent = '    '
    deepness = 0

    for i in range((len(xml))):

        new_xml += xml[i]   
        if(i<len(xml)-3):

            simpleSplit = xml[i:(i+2)] == '><'
            advancSplit = xml[i:(i+3)] == '></'        
            end = xml[i:(i+2)] == '/>'    
            start = xml[i] == '<'

            if(advancSplit):
                deepness += -1
                new_xml += '\n' + indent*deepness
                simpleSplit = False
                deepness += -1
            if(simpleSplit):
                new_xml += '\n' + indent*deepness
            if(start):
                deepness += 1
            if(end):
                deepness += -1

    f = open(file_name,'w')
    f.write(new_xml)
    f.close()

这对我来说很有用,也许有人会用到它:)

如果不想重新解析,还有一个备选方法,即带有get_pprint()函数的xmlpp.py库。它在我的用例中工作得很好,很顺利,不需要重新解析为lxml ElementTree对象。

我遇到了这个问题,我是这样解决的:

def write_xml_file (self, file, xml_root_element, xml_declaration=False, pretty_print=False, encoding='unicode', indent='\t'):
    pretty_printed_xml = etree.tostring(xml_root_element, xml_declaration=xml_declaration, pretty_print=pretty_print, encoding=encoding)
    if pretty_print: pretty_printed_xml = pretty_printed_xml.replace('  ', indent)
    file.write(pretty_printed_xml)

在我的代码中,这个方法是这样调用的:

try:
    with open(file_path, 'w') as file:
        file.write('<?xml version="1.0" encoding="utf-8" ?>')

        # create some xml content using etree ...

        xml_parser = XMLParser()
        xml_parser.write_xml_file(file, xml_root, xml_declaration=False, pretty_print=True, encoding='unicode', indent='\t')

except IOError:
    print("Error while writing in log file!")

这只是因为etree默认使用两个空格来缩进,我发现这不是很强调缩进,因此不漂亮。我找不到任何树的设置或任何函数的参数来改变标准树缩进。我喜欢使用etree的简单性,但这真的让我很恼火。