我有一个包含XML的Java字符串,没有换行或缩进。我想把它变成一个字符串与格式良好的XML。我怎么做呢?

String unformattedXml = "<tag><nested>hello</nested></tag>";
String formattedXml = new [UnknownClass]().format(unformattedXml);

注意:我的输入是一个字符串。输出是一个字符串。

(基本)模拟结果:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <tag>
    <nested>hello</nested>
  </tag>
</root>

当前回答

下面是一种使用dom4j的方法:

进口:

import org.dom4j.Document;  
import org.dom4j.DocumentHelper;  
import org.dom4j.io.OutputFormat;  
import org.dom4j.io.XMLWriter;

代码:

String xml = "<your xml='here'/>";  
Document doc = DocumentHelper.parseText(xml);  
StringWriter sw = new StringWriter();  
OutputFormat format = OutputFormat.createPrettyPrint();  
XMLWriter xw = new XMLWriter(sw, format);  
xw.write(doc);  
String result = sw.toString();

其他回答

Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
// initialize StreamResult with File object to save to file
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
String xmlString = result.getWriter().toString();
System.out.println(xmlString);

注意:根据Java版本的不同,结果可能有所不同。搜索特定于您的平台的解决方案。

下面是一种使用dom4j的方法:

进口:

import org.dom4j.Document;  
import org.dom4j.DocumentHelper;  
import org.dom4j.io.OutputFormat;  
import org.dom4j.io.XMLWriter;

代码:

String xml = "<your xml='here'/>";  
Document doc = DocumentHelper.parseText(xml);  
StringWriter sw = new StringWriter();  
OutputFormat format = OutputFormat.createPrettyPrint();  
XMLWriter xw = new XMLWriter(sw, format);  
xw.write(doc);  
String result = sw.toString();

Since you are starting with a String, you can convert to a DOM object (e.g. Node) before you use the Transformer. However, if you know your XML string is valid, and you don't want to incur the memory overhead of parsing a string into a DOM, then running a transform over the DOM to get a string back - you could just do some old fashioned character by character parsing. Insert a newline and spaces after every </...> characters, keep and indent counter (to determine the number of spaces) that you increment for every <...> and decrement for every </...> you see.

免责声明-我对下面的函数做了剪切/粘贴/文本编辑,所以它们可能不能按原样编译。

public static final Element createDOM(String strXML) 
    throws ParserConfigurationException, SAXException, IOException {

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setValidating(true);
    DocumentBuilder db = dbf.newDocumentBuilder();
    InputSource sourceXML = new InputSource(new StringReader(strXML));
    Document xmlDoc = db.parse(sourceXML);
    Element e = xmlDoc.getDocumentElement();
    e.normalize();
    return e;
}

public static final void prettyPrint(Node xml, OutputStream out)
    throws TransformerConfigurationException, TransformerFactoryConfigurationError, TransformerException {
    Transformer tf = TransformerFactory.newInstance().newTransformer();
    tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    tf.setOutputProperty(OutputKeys.INDENT, "yes");
    tf.transform(new DOMSource(xml), new StreamResult(out));
}

为了将来的参考,这里有一个对我有用的解决方案(感谢@George Hawkins在其中一个答案中发表的评论):

DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
LSSerializer writer = impl.createLSSerializer();
writer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);
LSOutput output = impl.createLSOutput();
ByteArrayOutputStream out = new ByteArrayOutputStream();
output.setByteStream(out);
writer.write(document, output);
String xmlStr = new String(out.toByteArray());

我试图实现类似的东西,但没有任何外部依赖。应用程序已经在使用DOM来格式化xml了!

下面是我的示例片段

public void formatXML(final String unformattedXML) {
    final int length = unformattedXML.length();
    final int indentSpace = 3;
    final StringBuilder newString = new StringBuilder(length + length / 10);
    final char space = ' ';
    int i = 0;
    int indentCount = 0;
    char currentChar = unformattedXML.charAt(i++);
    char previousChar = currentChar;
    boolean nodeStarted = true;
    newString.append(currentChar);
    for (; i < length - 1;) {
        currentChar = unformattedXML.charAt(i++);
        if(((int) currentChar < 33) && !nodeStarted) {
            continue;
        }
        switch (currentChar) {
        case '<':
            if ('>' == previousChar && '/' != unformattedXML.charAt(i - 1) && '/' != unformattedXML.charAt(i) && '!' != unformattedXML.charAt(i)) {
                indentCount++;
            }
            newString.append(System.lineSeparator());
            for (int j = indentCount * indentSpace; j > 0; j--) {
                newString.append(space);
            }
            newString.append(currentChar);
            nodeStarted = true;
            break;
        case '>':
            newString.append(currentChar);
            nodeStarted = false;
            break;
        case '/':
            if ('<' == previousChar || '>' == unformattedXML.charAt(i)) {
                indentCount--;
            }
            newString.append(currentChar);
            break;
        default:
            newString.append(currentChar);
        }
        previousChar = currentChar;
    }
    newString.append(unformattedXML.charAt(length - 1));
    System.out.println(newString.toString());
}