我有一个包含XML的Java字符串,没有换行或缩进。我想把它变成一个字符串与格式良好的XML。我怎么做呢?

String unformattedXml = "<tag><nested>hello</nested></tag>";
String formattedXml = new [UnknownClass]().format(unformattedXml);

注意:我的输入是一个字符串。输出是一个字符串。

(基本)模拟结果:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <tag>
    <nested>hello</nested>
  </tag>
</root>

当前回答

以上所有的解决方案都不适合我,然后我找到了这个http://myshittycode.com/2014/02/10/java-properly-indenting-xml-string/

线索就是用XPath删除空格

    String xml = "<root>" +
             "\n   " +
             "\n<name>Coco Puff</name>" +
             "\n        <total>10</total>    </root>";

try {
    Document document = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new ByteArrayInputStream(xml.getBytes("utf-8"))));

    XPath xPath = XPathFactory.newInstance().newXPath();
    NodeList nodeList = (NodeList) xPath.evaluate("//text()[normalize-space()='']",
                                                  document,
                                                  XPathConstants.NODESET);

    for (int i = 0; i < nodeList.getLength(); ++i) {
        Node node = nodeList.item(i);
        node.getParentNode().removeChild(node);
    }

    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    transformer.setOutputProperty(OutputKeys.INDENT, "yes");
    transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

    StringWriter stringWriter = new StringWriter();
    StreamResult streamResult = new StreamResult(stringWriter);

    transformer.transform(new DOMSource(document), streamResult);

    System.out.println(stringWriter.toString());
}
catch (Exception e) {
    e.printStackTrace();
}

其他回答

我用Scala看到了一个答案,所以这里有另一个用Groovy的答案,以防有人觉得有趣。默认缩进为2步,XmlNodePrinter构造函数也可以传递另一个值。

def xml = "<tag><nested>hello</nested></tag>"
def stringWriter = new StringWriter()
def node = new XmlParser().parseText(xml);
new XmlNodePrinter(new PrintWriter(stringWriter)).print(node)
println stringWriter.toString()

如果groovy jar在类路径中,则使用Java

  String xml = "<tag><nested>hello</nested></tag>";
  StringWriter stringWriter = new StringWriter();
  Node node = new XmlParser().parseText(xml);
  new XmlNodePrinter(new PrintWriter(stringWriter)).print(node);
  System.out.println(stringWriter.toString());

使用scala:

import xml._
val xml = XML.loadString("<tag><nested>hello</nested></tag>")
val formatted = new PrettyPrinter(150, 2).format(xml)
println(formatted)

如果你依赖scala-library.jar,你也可以在Java中这样做。它是这样的:

import scala.xml.*;

public class FormatXML {
    public static void main(String[] args) {
        String unformattedXml = "<tag><nested>hello</nested></tag>";
        PrettyPrinter pp = new PrettyPrinter(150, 3);
        String formatted = pp.format(XML.loadString(unformattedXml), TopScope$.MODULE$);
        System.out.println(formatted);
    }
}

PrettyPrinter对象是用两个整数构造的,第一个是最大行长,第二个是缩进步骤。

如果您确信您有一个有效的XML,那么这个很简单,并且避免了XML DOM树。可能有一些错误,如果你看到任何错误,请评论

public String prettyPrint(String xml) {
            if (xml == null || xml.trim().length() == 0) return "";

            int stack = 0;
            StringBuilder pretty = new StringBuilder();
            String[] rows = xml.trim().replaceAll(">", ">\n").replaceAll("<", "\n<").split("\n");

            for (int i = 0; i < rows.length; i++) {
                    if (rows[i] == null || rows[i].trim().length() == 0) continue;

                    String row = rows[i].trim();
                    if (row.startsWith("<?")) {
                            // xml version tag
                            pretty.append(row + "\n");
                    } else if (row.startsWith("</")) {
                            // closing tag
                            String indent = repeatString("    ", --stack);
                            pretty.append(indent + row + "\n");
                    } else if (row.startsWith("<")) {
                            // starting tag
                            String indent = repeatString("    ", stack++);
                            pretty.append(indent + row + "\n");
                    } else {
                            // tag data
                            String indent = repeatString("    ", stack);
                            pretty.append(indent + row + "\n");
                    }
            }

            return pretty.toString().trim();
    }
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
// initialize StreamResult with File object to save to file
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
String xmlString = result.getWriter().toString();
System.out.println(xmlString);

注意:根据Java版本的不同,结果可能有所不同。搜索特定于您的平台的解决方案。

Since you are starting with a String, you can convert to a DOM object (e.g. Node) before you use the Transformer. However, if you know your XML string is valid, and you don't want to incur the memory overhead of parsing a string into a DOM, then running a transform over the DOM to get a string back - you could just do some old fashioned character by character parsing. Insert a newline and spaces after every </...> characters, keep and indent counter (to determine the number of spaces) that you increment for every <...> and decrement for every </...> you see.

免责声明-我对下面的函数做了剪切/粘贴/文本编辑,所以它们可能不能按原样编译。

public static final Element createDOM(String strXML) 
    throws ParserConfigurationException, SAXException, IOException {

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setValidating(true);
    DocumentBuilder db = dbf.newDocumentBuilder();
    InputSource sourceXML = new InputSource(new StringReader(strXML));
    Document xmlDoc = db.parse(sourceXML);
    Element e = xmlDoc.getDocumentElement();
    e.normalize();
    return e;
}

public static final void prettyPrint(Node xml, OutputStream out)
    throws TransformerConfigurationException, TransformerFactoryConfigurationError, TransformerException {
    Transformer tf = TransformerFactory.newInstance().newTransformer();
    tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    tf.setOutputProperty(OutputKeys.INDENT, "yes");
    tf.transform(new DOMSource(xml), new StreamResult(out));
}