我有一个包含XML的Java字符串,没有换行或缩进。我想把它变成一个字符串与格式良好的XML。我怎么做呢?

String unformattedXml = "<tag><nested>hello</nested></tag>";
String formattedXml = new [UnknownClass]().format(unformattedXml);

注意:我的输入是一个字符串。输出是一个字符串。

(基本)模拟结果:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <tag>
    <nested>hello</nested>
  </tag>
</root>

当前回答

以上所有的解决方案都不适合我,然后我找到了这个http://myshittycode.com/2014/02/10/java-properly-indenting-xml-string/

线索就是用XPath删除空格

    String xml = "<root>" +
             "\n   " +
             "\n<name>Coco Puff</name>" +
             "\n        <total>10</total>    </root>";

try {
    Document document = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new ByteArrayInputStream(xml.getBytes("utf-8"))));

    XPath xPath = XPathFactory.newInstance().newXPath();
    NodeList nodeList = (NodeList) xPath.evaluate("//text()[normalize-space()='']",
                                                  document,
                                                  XPathConstants.NODESET);

    for (int i = 0; i < nodeList.getLength(); ++i) {
        Node node = nodeList.item(i);
        node.getParentNode().removeChild(node);
    }

    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    transformer.setOutputProperty(OutputKeys.INDENT, "yes");
    transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

    StringWriter stringWriter = new StringWriter();
    StreamResult streamResult = new StreamResult(stringWriter);

    transformer.transform(new DOMSource(document), streamResult);

    System.out.println(stringWriter.toString());
}
catch (Exception e) {
    e.printStackTrace();
}

其他回答

如果你不需要缩进那么多,但一些换行,这可能是足够的简单regex…

String leastPrettifiedXml = uglyXml.replaceAll("><", ">\n<");

代码很好,而不是因为缺少缩进而导致的结果。


(对于有缩进的解,请参见其他答案。)

使用jdom2: http://www.jdom.org/

import java.io.StringReader;
import org.jdom2.input.SAXBuilder;
import org.jdom2.output.Format;
import org.jdom2.output.XMLOutputter;

String prettyXml = new XMLOutputter(Format.getPrettyFormat()).
                         outputString(new SAXBuilder().build(new StringReader(uglyXml)));

java有一个静态方法U.formatXml(string)。生活的例子

import com.github.underscore.U;

public class MyClass {
    public static void main(String args[]) {
        String xml = "<tag><nested>hello</nested></tag>";

        System.out.println(U.formatXml("<?xml version=\"1.0\" encoding=\"UTF-8\"?><root>" + xml + "</root>"));
    }
}

输出:

<?xml version="1.0" encoding="UTF-8"?>
<root>
   <tag>
      <nested>hello</nested>
   </tag>
</root>

有一个非常好的命令行XML实用程序叫做xmlstarlet(http://xmlstar.sourceforge.net/),它可以做很多事情,很多人都在使用它。

您可以使用Runtime以编程方式执行此程序。然后读入格式化的输出文件。它具有比几行Java代码所能提供的更多选项和更好的错误报告。

下载xmlstarlet: http://sourceforge.net/project/showfiles.php?group_id=66612&package_id=64589

Since you are starting with a String, you can convert to a DOM object (e.g. Node) before you use the Transformer. However, if you know your XML string is valid, and you don't want to incur the memory overhead of parsing a string into a DOM, then running a transform over the DOM to get a string back - you could just do some old fashioned character by character parsing. Insert a newline and spaces after every </...> characters, keep and indent counter (to determine the number of spaces) that you increment for every <...> and decrement for every </...> you see.

免责声明-我对下面的函数做了剪切/粘贴/文本编辑,所以它们可能不能按原样编译。

public static final Element createDOM(String strXML) 
    throws ParserConfigurationException, SAXException, IOException {

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setValidating(true);
    DocumentBuilder db = dbf.newDocumentBuilder();
    InputSource sourceXML = new InputSource(new StringReader(strXML));
    Document xmlDoc = db.parse(sourceXML);
    Element e = xmlDoc.getDocumentElement();
    e.normalize();
    return e;
}

public static final void prettyPrint(Node xml, OutputStream out)
    throws TransformerConfigurationException, TransformerFactoryConfigurationError, TransformerException {
    Transformer tf = TransformerFactory.newInstance().newTransformer();
    tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    tf.setOutputProperty(OutputKeys.INDENT, "yes");
    tf.transform(new DOMSource(xml), new StreamResult(out));
}