我有一个包含XML的Java字符串,没有换行或缩进。我想把它变成一个字符串与格式良好的XML。我怎么做呢?

String unformattedXml = "<tag><nested>hello</nested></tag>";
String formattedXml = new [UnknownClass]().format(unformattedXml);

注意:我的输入是一个字符串。输出是一个字符串。

(基本)模拟结果:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <tag>
    <nested>hello</nested>
  </tag>
</root>

当前回答

如果您确信您有一个有效的XML,那么这个很简单,并且避免了XML DOM树。可能有一些错误,如果你看到任何错误,请评论

public String prettyPrint(String xml) {
            if (xml == null || xml.trim().length() == 0) return "";

            int stack = 0;
            StringBuilder pretty = new StringBuilder();
            String[] rows = xml.trim().replaceAll(">", ">\n").replaceAll("<", "\n<").split("\n");

            for (int i = 0; i < rows.length; i++) {
                    if (rows[i] == null || rows[i].trim().length() == 0) continue;

                    String row = rows[i].trim();
                    if (row.startsWith("<?")) {
                            // xml version tag
                            pretty.append(row + "\n");
                    } else if (row.startsWith("</")) {
                            // closing tag
                            String indent = repeatString("    ", --stack);
                            pretty.append(indent + row + "\n");
                    } else if (row.startsWith("<")) {
                            // starting tag
                            String indent = repeatString("    ", stack++);
                            pretty.append(indent + row + "\n");
                    } else {
                            // tag data
                            String indent = repeatString("    ", stack);
                            pretty.append(indent + row + "\n");
                    }
            }

            return pretty.toString().trim();
    }

其他回答

下面的代码工作得很好

import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

String formattedXml1 = prettyFormat("<root><child>aaa</child><child/></root>");

public static String prettyFormat(String input) {
    return prettyFormat(input, "2");
}

public static String prettyFormat(String input, String indent) {
    Source xmlInput = new StreamSource(new StringReader(input));
    StringWriter stringWriter = new StringWriter();
    try {
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", indent);
        transformer.transform(xmlInput, new StreamResult(stringWriter));

        String pretty = stringWriter.toString();
        pretty = pretty.replace("\r\n", "\n");
        return pretty;              
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

嗯…面对这样的事情,这是一个已知的bug… 只需添加这个OutputProperty ..

transformer.setOutputProperty(OutputPropertiesFactory.S_KEY_INDENT_AMOUNT, "8");

希望这对你有所帮助……

我在这里找到的针对Java 1.6+的解决方案不会重新格式化已经格式化的代码。对我有效的方法(重新格式化已经格式化的代码)如下。

import org.apache.xml.security.c14n.CanonicalizationException;
import org.apache.xml.security.c14n.Canonicalizer;
import org.apache.xml.security.c14n.InvalidCanonicalizerException;
import org.w3c.dom.Element;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.TransformerException;
import java.io.IOException;
import java.io.StringReader;

public class XmlUtils {
    public static String toCanonicalXml(String xml) throws InvalidCanonicalizerException, ParserConfigurationException, SAXException, CanonicalizationException, IOException {
        Canonicalizer canon = Canonicalizer.getInstance(Canonicalizer.ALGO_ID_C14N_OMIT_COMMENTS);
        byte canonXmlBytes[] = canon.canonicalize(xml.getBytes());
        return new String(canonXmlBytes);
    }

    public static String prettyFormat(String input) throws TransformerException, ParserConfigurationException, IOException, SAXException, InstantiationException, IllegalAccessException, ClassNotFoundException {
        InputSource src = new InputSource(new StringReader(input));
        Element document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(src).getDocumentElement();
        Boolean keepDeclaration = input.startsWith("<?xml");
        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
        LSSerializer writer = impl.createLSSerializer();
        writer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);
        writer.getDomConfig().setParameter("xml-declaration", keepDeclaration);
        return writer.writeToString(document);
    }
}

它是在单元测试中比较全字符串xml的好工具。

private void assertXMLEqual(String expected, String actual) throws ParserConfigurationException, IOException, SAXException, CanonicalizationException, InvalidCanonicalizerException, TransformerException, IllegalAccessException, ClassNotFoundException, InstantiationException {
    String canonicalExpected = prettyFormat(toCanonicalXml(expected));
    String canonicalActual = prettyFormat(toCanonicalXml(actual));
    assertEquals(canonicalExpected, canonicalActual);
}

下面是一种使用dom4j的方法:

进口:

import org.dom4j.Document;  
import org.dom4j.DocumentHelper;  
import org.dom4j.io.OutputFormat;  
import org.dom4j.io.XMLWriter;

代码:

String xml = "<your xml='here'/>";  
Document doc = DocumentHelper.parseText(xml);  
StringWriter sw = new StringWriter();  
OutputFormat format = OutputFormat.createPrettyPrint();  
XMLWriter xw = new XMLWriter(sw, format);  
xw.write(doc);  
String result = sw.toString();
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
// initialize StreamResult with File object to save to file
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
String xmlString = result.getWriter().toString();
System.out.println(xmlString);

注意:根据Java版本的不同,结果可能有所不同。搜索特定于您的平台的解决方案。