我有一个包含XML的Java字符串,没有换行或缩进。我想把它变成一个字符串与格式良好的XML。我怎么做呢?
String unformattedXml = "<tag><nested>hello</nested></tag>";
String formattedXml = new [UnknownClass]().format(unformattedXml);
注意:我的输入是一个字符串。输出是一个字符串。
(基本)模拟结果:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<tag>
<nested>hello</nested>
</tag>
</root>
以上所有的解决方案都不适合我,然后我找到了这个http://myshittycode.com/2014/02/10/java-properly-indenting-xml-string/
线索就是用XPath删除空格
String xml = "<root>" +
"\n " +
"\n<name>Coco Puff</name>" +
"\n <total>10</total> </root>";
try {
Document document = DocumentBuilderFactory.newInstance()
.newDocumentBuilder()
.parse(new InputSource(new ByteArrayInputStream(xml.getBytes("utf-8"))));
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodeList = (NodeList) xPath.evaluate("//text()[normalize-space()='']",
document,
XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); ++i) {
Node node = nodeList.item(i);
node.getParentNode().removeChild(node);
}
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
StringWriter stringWriter = new StringWriter();
StreamResult streamResult = new StreamResult(stringWriter);
transformer.transform(new DOMSource(document), streamResult);
System.out.println(stringWriter.toString());
}
catch (Exception e) {
e.printStackTrace();
}
除了max、codeskrap、David Easley和milosmns给出的答案外,还可以看看我的轻量级、高性能漂亮打印机库:xml-formatter
// construct lightweight, threadsafe, instance
PrettyPrinter prettyPrinter = PrettyPrinterBuilder.newPrettyPrinter().build();
StringBuilder buffer = new StringBuilder();
String xml = ..; // also works with char[] or Reader
if(prettyPrinter.process(xml, buffer)) {
// valid XML, print buffer
} else {
// invalid XML, print xml
}
有时,就像直接从文件运行模拟SOAP服务时,有一个漂亮的打印机也能处理已经打印好的XML是很好的:
PrettyPrinter prettyPrinter = PrettyPrinterBuilder.newPrettyPrinter().ignoreWhitespace().build();
正如一些人评论的那样,漂亮打印只是一种以更适合人类阅读的形式表示XML的方法——严格来说,XML数据中不应该有空格。
该库用于日志记录的漂亮打印,还包括用于过滤(子树移除/匿名化)和漂亮打印CDATA和Text节点中的XML的函数。
Since you are starting with a String, you can convert to a DOM object (e.g. Node) before you use the Transformer. However, if you know your XML string is valid, and you don't want to incur the memory overhead of parsing a string into a DOM, then running a transform over the DOM to get a string back - you could just do some old fashioned character by character parsing. Insert a newline and spaces after every </...> characters, keep and indent counter (to determine the number of spaces) that you increment for every <...> and decrement for every </...> you see.
免责声明-我对下面的函数做了剪切/粘贴/文本编辑,所以它们可能不能按原样编译。
public static final Element createDOM(String strXML)
throws ParserConfigurationException, SAXException, IOException {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setValidating(true);
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource sourceXML = new InputSource(new StringReader(strXML));
Document xmlDoc = db.parse(sourceXML);
Element e = xmlDoc.getDocumentElement();
e.normalize();
return e;
}
public static final void prettyPrint(Node xml, OutputStream out)
throws TransformerConfigurationException, TransformerFactoryConfigurationError, TransformerException {
Transformer tf = TransformerFactory.newInstance().newTransformer();
tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
tf.setOutputProperty(OutputKeys.INDENT, "yes");
tf.transform(new DOMSource(xml), new StreamResult(out));
}