我有一个包含XML的Java字符串,没有换行或缩进。我想把它变成一个字符串与格式良好的XML。我怎么做呢?
String unformattedXml = "<tag><nested>hello</nested></tag>";
String formattedXml = new [UnknownClass]().format(unformattedXml);
注意:我的输入是一个字符串。输出是一个字符串。
(基本)模拟结果:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<tag>
<nested>hello</nested>
</tag>
</root>
I have found that in Java 1.6.0_32 the normal method to pretty print an XML string (using a Transformer with a null or identity xslt) does not behave as I would like if tags are merely separated by whitespace, as opposed to having no separating text. I tried using <xsl:strip-space elements="*"/> in my template to no avail. The simplest solution I found was to strip the space the way I wanted using a SAXSource and XML filter. Since my solution was for logging I also extended this to work with incomplete XML fragments. Note the normal method seems to work fine if you use a DOMSource but I did not want to use this because of the incompleteness and memory overhead.
public static class WhitespaceIgnoreFilter extends XMLFilterImpl
{
@Override
public void ignorableWhitespace(char[] arg0,
int arg1,
int arg2) throws SAXException
{
//Ignore it then...
}
@Override
public void characters( char[] ch,
int start,
int length) throws SAXException
{
if (!new String(ch, start, length).trim().equals(""))
super.characters(ch, start, length);
}
}
public static String prettyXML(String logMsg, boolean allowBadlyFormedFragments) throws SAXException, IOException, TransformerException
{
TransformerFactory transFactory = TransformerFactory.newInstance();
transFactory.setAttribute("indent-number", new Integer(2));
Transformer transformer = transFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
StringWriter out = new StringWriter();
XMLReader masterParser = SAXHelper.getSAXParser(true);
XMLFilter parser = new WhitespaceIgnoreFilter();
parser.setParent(masterParser);
if(allowBadlyFormedFragments)
{
transformer.setErrorListener(new ErrorListener()
{
@Override
public void warning(TransformerException exception) throws TransformerException
{
}
@Override
public void fatalError(TransformerException exception) throws TransformerException
{
}
@Override
public void error(TransformerException exception) throws TransformerException
{
}
});
}
try
{
transformer.transform(new SAXSource(parser, new InputSource(new StringReader(logMsg))), new StreamResult(out));
}
catch (TransformerException e)
{
if(e.getCause() != null && e.getCause() instanceof SAXParseException)
{
if(!allowBadlyFormedFragments || !"XML document structures must start and end within the same entity.".equals(e.getCause().getMessage()))
{
throw e;
}
}
else
{
throw e;
}
}
out.flush();
return out.toString();
}
如果您确信您有一个有效的XML,那么这个很简单,并且避免了XML DOM树。可能有一些错误,如果你看到任何错误,请评论
public String prettyPrint(String xml) {
if (xml == null || xml.trim().length() == 0) return "";
int stack = 0;
StringBuilder pretty = new StringBuilder();
String[] rows = xml.trim().replaceAll(">", ">\n").replaceAll("<", "\n<").split("\n");
for (int i = 0; i < rows.length; i++) {
if (rows[i] == null || rows[i].trim().length() == 0) continue;
String row = rows[i].trim();
if (row.startsWith("<?")) {
// xml version tag
pretty.append(row + "\n");
} else if (row.startsWith("</")) {
// closing tag
String indent = repeatString(" ", --stack);
pretty.append(indent + row + "\n");
} else if (row.startsWith("<")) {
// starting tag
String indent = repeatString(" ", stack++);
pretty.append(indent + row + "\n");
} else {
// tag data
String indent = repeatString(" ", stack);
pretty.append(indent + row + "\n");
}
}
return pretty.toString().trim();
}
我把它们混合在一起,写了一个小程序。它从xml文件中读取并打印出来。而不是xzy给出你的文件路径。
public static void main(String[] args) throws Exception {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setValidating(false);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new FileInputStream(new File("C:/Users/xyz.xml")));
prettyPrint(doc);
}
private static String prettyPrint(Document document)
throws TransformerException {
TransformerFactory transformerFactory = TransformerFactory
.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
DOMSource source = new DOMSource(document);
StringWriter strWriter = new StringWriter();
StreamResult result = new StreamResult(strWriter);transformer.transform(source, result);
System.out.println(strWriter.getBuffer().toString());
return strWriter.getBuffer().toString();
}
请注意,排名靠前的答案需要使用xerces。
如果您不想添加这个外部依赖,那么您可以简单地使用标准jdk库(实际上是在内部使用xerces构建的)。
注意:jdk 1.5版本有一个bug,请参阅http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6296446,但现在已经解决了。
(注意,如果发生错误,将返回原始文本)
package com.test;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
public class XmlTest {
public static void main(String[] args) {
XmlTest t = new XmlTest();
System.out.println(t.formatXml("<a><b><c/><d>text D</d><e value='0'/></b></a>"));
}
public String formatXml(String xml){
try{
Transformer serializer= SAXTransformerFactory.newInstance().newTransformer();
serializer.setOutputProperty(OutputKeys.INDENT, "yes");
//serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
serializer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
//serializer.setOutputProperty("{http://xml.customer.org/xslt}indent-amount", "2");
Source xmlSource=new SAXSource(new InputSource(new ByteArrayInputStream(xml.getBytes())));
StreamResult res = new StreamResult(new ByteArrayOutputStream());
serializer.transform(xmlSource, res);
return new String(((ByteArrayOutputStream)res.getOutputStream()).toByteArray());
}catch(Exception e){
//TODO log error
return xml;
}
}
}