在纯Java代码中输出HTML时,是否有一种推荐的方法来转义<,>,"和&字符?(除了手动执行以下操作之外)。

String source = "The less than sign (<) and ampersand (&) must be escaped before using them in HTML";
String escaped = source.replace("<", "&lt;").replace("&", "&amp;"); // ...

当前回答

Apache Commons的替代方案:使用Spring的htmltils。htmlEscape(字符串输入)方法。

其他回答

stringescapeutils现在已弃用。您现在必须使用org.apache.commons.text.StringEscapeUtils by

    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-text</artifactId>
        <version>${commons.text.version}</version>
    </dependency>

StringEscapeUtils from Apache Commons Lang:

import static org.apache.commons.lang.StringEscapeUtils.escapeHtml;
// ...
String source = "The less than sign (<) and ampersand (&) must be escaped before using them in HTML";
String escaped = escapeHtml(source);

版本3:

import static org.apache.commons.lang3.StringEscapeUtils.escapeHtml4;
// ...
String escaped = escapeHtml4(source);

简单的方法:

public static String escapeHTML(String s) {
    StringBuilder out = new StringBuilder(Math.max(16, s.length()));
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        if (c > 127 || c == '"' || c == '\'' || c == '<' || c == '>' || c == '&') {
            out.append("&#");
            out.append((int) c);
            out.append(';');
        } else {
            out.append(c);
        }
    }
    return out.toString();
}

基于https://stackoverflow.com/a/8838023/1199155(放大器不在那里)。根据http://www.w3.org/TR/html4/sgml/entities.html的说法,if子句中选中的四个字符是唯一低于128的字符

有一个更新版本的Apache Commons Lang库,它使用了一个不同的包名(org.apache.commons.lang3)。StringEscapeUtils现在有不同的静态方法来转义不同类型的文档(http://commons.apache.org/proper/commons-lang/javadocs/api-3.0/index.html)。转义HTML 4.0版本的字符串:

import static org.apache.commons.lang3.StringEscapeUtils.escapeHtml4;

String output = escapeHtml4("The less than sign (<) and ampersand (&) must be escaped before using them in HTML");

Be careful with this. There are a number of different 'contexts' within an HTML document: Inside an element, quoted attribute value, unquoted attribute value, URL attribute, javascript, CSS, etc... You'll need to use a different encoding method for each of these to prevent Cross-Site Scripting (XSS). Check the OWASP XSS Prevention Cheat Sheet for details on each of these contexts. You can find escaping methods for each of these contexts in the OWASP ESAPI library -- https://github.com/ESAPI/esapi-java-legacy.