我有一个包含XML的Java字符串,没有换行或缩进。我想把它变成一个字符串与格式良好的XML。我怎么做呢?

String unformattedXml = "<tag><nested>hello</nested></tag>";
String formattedXml = new [UnknownClass]().format(unformattedXml);

注意:我的输入是一个字符串。输出是一个字符串。

(基本)模拟结果:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <tag>
    <nested>hello</nested>
  </tag>
</root>

当前回答

使用scala:

import xml._
val xml = XML.loadString("<tag><nested>hello</nested></tag>")
val formatted = new PrettyPrinter(150, 2).format(xml)
println(formatted)

如果你依赖scala-library.jar,你也可以在Java中这样做。它是这样的:

import scala.xml.*;

public class FormatXML {
    public static void main(String[] args) {
        String unformattedXml = "<tag><nested>hello</nested></tag>";
        PrettyPrinter pp = new PrettyPrinter(150, 3);
        String formatted = pp.format(XML.loadString(unformattedXml), TopScope$.MODULE$);
        System.out.println(formatted);
    }
}

PrettyPrinter对象是用两个整数构造的,第一个是最大行长,第二个是缩进步骤。

其他回答

我试图实现类似的东西,但没有任何外部依赖。应用程序已经在使用DOM来格式化xml了!

下面是我的示例片段

public void formatXML(final String unformattedXML) {
    final int length = unformattedXML.length();
    final int indentSpace = 3;
    final StringBuilder newString = new StringBuilder(length + length / 10);
    final char space = ' ';
    int i = 0;
    int indentCount = 0;
    char currentChar = unformattedXML.charAt(i++);
    char previousChar = currentChar;
    boolean nodeStarted = true;
    newString.append(currentChar);
    for (; i < length - 1;) {
        currentChar = unformattedXML.charAt(i++);
        if(((int) currentChar < 33) && !nodeStarted) {
            continue;
        }
        switch (currentChar) {
        case '<':
            if ('>' == previousChar && '/' != unformattedXML.charAt(i - 1) && '/' != unformattedXML.charAt(i) && '!' != unformattedXML.charAt(i)) {
                indentCount++;
            }
            newString.append(System.lineSeparator());
            for (int j = indentCount * indentSpace; j > 0; j--) {
                newString.append(space);
            }
            newString.append(currentChar);
            nodeStarted = true;
            break;
        case '>':
            newString.append(currentChar);
            nodeStarted = false;
            break;
        case '/':
            if ('<' == previousChar || '>' == unformattedXML.charAt(i)) {
                indentCount--;
            }
            newString.append(currentChar);
            break;
        default:
            newString.append(currentChar);
        }
        previousChar = currentChar;
    }
    newString.append(unformattedXML.charAt(length - 1));
    System.out.println(newString.toString());
}

关于“您必须首先构建DOM树”的评论:不,您不需要也不应该这样做。

相反,创建一个StreamSource(new StreamSource(new StringReader(str)),并将其提供给前面提到的标识转换器。这将使用SAX解析器,结果将快得多。 在这种情况下,构建中间树纯粹是开销。 否则,排名第一的答案是好的。

凯文·哈肯森说: 但是,如果您知道您的XML字符串是有效的,并且您不想引起将字符串解析为DOM的内存开销,然后在DOM上运行转换以获得字符串—您可以通过字符解析进行一些老式的字符。在每个字符后插入换行符和空格,保持和缩进计数器(以确定空格的数量),为每个<…>和递减你看到的每一个。”

同意了。这种方法要快得多,依赖关系也少得多。

示例解决方案:

/**
 * XML utils, including formatting.
 */
public class XmlUtils
{
  private static XmlFormatter formatter = new XmlFormatter(2, 80);

  public static String formatXml(String s)
  {
    return formatter.format(s, 0);
  }

  public static String formatXml(String s, int initialIndent)
  {
    return formatter.format(s, initialIndent);
  }

  private static class XmlFormatter
  {
    private int indentNumChars;
    private int lineLength;
    private boolean singleLine;

    public XmlFormatter(int indentNumChars, int lineLength)
    {
      this.indentNumChars = indentNumChars;
      this.lineLength = lineLength;
    }

    public synchronized String format(String s, int initialIndent)
    {
      int indent = initialIndent;
      StringBuilder sb = new StringBuilder();
      for (int i = 0; i < s.length(); i++)
      {
        char currentChar = s.charAt(i);
        if (currentChar == '<')
        {
          char nextChar = s.charAt(i + 1);
          if (nextChar == '/')
            indent -= indentNumChars;
          if (!singleLine)   // Don't indent before closing element if we're creating opening and closing elements on a single line.
            sb.append(buildWhitespace(indent));
          if (nextChar != '?' && nextChar != '!' && nextChar != '/')
            indent += indentNumChars;
          singleLine = false;  // Reset flag.
        }
        sb.append(currentChar);
        if (currentChar == '>')
        {
          if (s.charAt(i - 1) == '/')
          {
            indent -= indentNumChars;
            sb.append("\n");
          }
          else
          {
            int nextStartElementPos = s.indexOf('<', i);
            if (nextStartElementPos > i + 1)
            {
              String textBetweenElements = s.substring(i + 1, nextStartElementPos);

              // If the space between elements is solely newlines, let them through to preserve additional newlines in source document.
              if (textBetweenElements.replaceAll("\n", "").length() == 0)
              {
                sb.append(textBetweenElements + "\n");
              }
              // Put tags and text on a single line if the text is short.
              else if (textBetweenElements.length() <= lineLength * 0.5)
              {
                sb.append(textBetweenElements);
                singleLine = true;
              }
              // For larger amounts of text, wrap lines to a maximum line length.
              else
              {
                sb.append("\n" + lineWrap(textBetweenElements, lineLength, indent, null) + "\n");
              }
              i = nextStartElementPos - 1;
            }
            else
            {
              sb.append("\n");
            }
          }
        }
      }
      return sb.toString();
    }
  }

  private static String buildWhitespace(int numChars)
  {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < numChars; i++)
      sb.append(" ");
    return sb.toString();
  }

  /**
   * Wraps the supplied text to the specified line length.
   * @lineLength the maximum length of each line in the returned string (not including indent if specified).
   * @indent optional number of whitespace characters to prepend to each line before the text.
   * @linePrefix optional string to append to the indent (before the text).
   * @returns the supplied text wrapped so that no line exceeds the specified line length + indent, optionally with
   * indent and prefix applied to each line.
   */
  private static String lineWrap(String s, int lineLength, Integer indent, String linePrefix)
  {
    if (s == null)
      return null;

    StringBuilder sb = new StringBuilder();
    int lineStartPos = 0;
    int lineEndPos;
    boolean firstLine = true;
    while(lineStartPos < s.length())
    {
      if (!firstLine)
        sb.append("\n");
      else
        firstLine = false;

      if (lineStartPos + lineLength > s.length())
        lineEndPos = s.length() - 1;
      else
      {
        lineEndPos = lineStartPos + lineLength - 1;
        while (lineEndPos > lineStartPos && (s.charAt(lineEndPos) != ' ' && s.charAt(lineEndPos) != '\t'))
          lineEndPos--;
      }
      sb.append(buildWhitespace(indent));
      if (linePrefix != null)
        sb.append(linePrefix);

      sb.append(s.substring(lineStartPos, lineEndPos + 1));
      lineStartPos = lineEndPos + 1;
    }
    return sb.toString();
  }

  // other utils removed for brevity
}
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
// initialize StreamResult with File object to save to file
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
String xmlString = result.getWriter().toString();
System.out.println(xmlString);

注意:根据Java版本的不同,结果可能有所不同。搜索特定于您的平台的解决方案。

稍微改进了milosmns的版本…

public static String getPrettyXml(String xml) {
    if (xml == null || xml.trim().length() == 0) return "";

    int stack = 0;
    StringBuilder pretty = new StringBuilder();
    String[] rows = xml.trim().replaceAll(">", ">\n").replaceAll("<", "\n<").split("\n");

    for (int i = 0; i < rows.length; i++) {
        if (rows[i] == null || rows[i].trim().length() == 0) continue;

        String row = rows[i].trim();
        if (row.startsWith("<?")) {
            pretty.append(row + "\n");
        } else if (row.startsWith("</")) {
            String indent = repeatString(--stack);
            pretty.append(indent + row + "\n");
        } else if (row.startsWith("<") && row.endsWith("/>") == false) {
            String indent = repeatString(stack++);
            pretty.append(indent + row + "\n");
            if (row.endsWith("]]>")) stack--;
        } else {
            String indent = repeatString(stack);
            pretty.append(indent + row + "\n");
        }
    }

    return pretty.toString().trim();
}

private static String repeatString(int stack) {
     StringBuilder indent = new StringBuilder();
     for (int i = 0; i < stack; i++) {
        indent.append(" ");
     }
     return indent.toString();
}