Remove HTML tags from a String

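Is there a good way to remove HTML from a Java string? A simple regex like

replaceAll("\\<.*?>", "")

will work, but entities such as &amp; won't be converted correctly, and any non-HTML text that happens to sit between two angle brackets will be removed as well (that is, whatever the .*? in the regex matches will disappear).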
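Both problems described in the question are easy to reproduce. The following is a minimal sketch; the class name `NaiveStripDemo` and the sample strings are made up for illustration, and only the regex comes from the question.

```java
class NaiveStripDemo {
    // The regex from the question: a lazy match of anything between '<' and '>'.
    static String naiveStrip(String input) {
        return input.replaceAll("\\<.*?>", "");
    }

    public static void main(String[] args) {
        // Entities are left as-is, because the regex only removes tags.
        System.out.println(naiveStrip("<p>Tom &amp; Jerry</p>"));
        // Plain text that happens to sit between '<' and '>' is swallowed.
        System.out.println(naiveStrip("if a < b and c > d then stop"));
    }
}
```

The first call leaves the `&amp;` entity undecoded, and the second loses the text `< b and c >` even though it was never HTML.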

Current answer

I think the simplest way to filter out HTML tags is:

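import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Compiled once and reused; matches "<", at least one character, then the nearest ">".
private static final Pattern REMOVE_TAGS = Pattern.compile("<.+?>");

public static String removeTags(String string) {
    // Nothing to strip from a null or empty input.
    if (string == null || string.length() == 0) {
        return string;
    }

    Matcher m = REMOVE_TAGS.matcher(string);
    return m.replaceAll("");
}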
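One caveat with `<.+?>`: by default `.` does not match line terminators in `java.util.regex`, so a tag that is wrapped across two lines is left in place. The sketch below compares the pattern from this answer with a negated character class; the class name `MultiLineTagDemo` and the second pattern are illustrative assumptions, not part of the original answer.

```java
import java.util.regex.Pattern;

class MultiLineTagDemo {
    // Pattern from the answer: '.' stops at line breaks by default.
    static final Pattern REMOVE_TAGS = Pattern.compile("<.+?>");
    // Variant: "anything except '>'" also covers newlines inside a tag.
    static final Pattern REMOVE_TAGS_MULTILINE = Pattern.compile("<[^>]+>");

    static String strip(Pattern pattern, String input) {
        return input == null ? null : pattern.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<a\nhref=\"x\">link</a>";
        // The opening tag spans two lines, so only "</a>" is removed here.
        System.out.println(strip(REMOVE_TAGS, html));
        // Both tags are removed here.
        System.out.println(strip(REMOVE_TAGS_MULTILINE, html));
    }
}
```

Compiling the original pattern with `Pattern.DOTALL` would be another way to get the same effect.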

2010-11-04 10:13:09

Other answers

You can simply use Android's default HTML filter:

    public String htmlToStringFilter(String textToFilter) {
        // android.text.Html parses the markup; toString() discards the styling spans.
        return Html.fromHtml(textToFilter).toString();
    }

The method above returns your input string with the HTML filtered out.

2019-03-29 08:37:20

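I often find that I only need to strip out comments and script elements. This has worked reliably for me for 15 years and can easily be extended to handle any element name in HTML or XML: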

// delete all comments; (?s) lets '.' span line breaks, and '>' inside a comment is allowed
response = response.replaceAll("(?s)<!--.*?-->", "");
// delete all script elements, whatever the case of the tag name
response = response.replaceAll("(?is)<script\\b[^>]*>.*?</script\\s*>", "");
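The two replacements can be wrapped in a small helper with precompiled patterns. This is only a sketch: the class name `CommentScriptStripper` and the exact patterns are assumptions chosen for illustration, written so that a comment containing `>` and an upper-case `</SCRIPT>` closing tag are both handled.

```java
import java.util.regex.Pattern;

class CommentScriptStripper {
    // (?s) lets '.' cross line breaks, so multi-line comments are matched.
    private static final Pattern COMMENTS = Pattern.compile("(?s)<!--.*?-->");
    // (?i) ignores case, so <script>, <SCRIPT> and <Script> are all matched.
    private static final Pattern SCRIPTS =
            Pattern.compile("(?is)<script\\b[^>]*>.*?</script\\s*>");

    static String strip(String html) {
        String withoutComments = COMMENTS.matcher(html).replaceAll("");
        return SCRIPTS.matcher(withoutComments).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("a<!-- x > y -->b"));
        System.out.println(strip(
                "<p>hi</p><SCRIPT type=\"text/javascript\">\nif (a > b) { run(); }\n</SCRIPT>"));
    }
}
```

To handle another element, for example `style`, a third pattern of the same shape could be added alongside `SCRIPTS`.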

2020-08-23 21:14:52

On Android, try this:

String result = Html.fromHtml(html).toString();

2015-05-04 04:29:30

Removing HTML tags from a string: sometimes we need to parse strings received from the server side, for example in an HttpResponse.

Here I'll show how to remove HTML tags from such a string in C#.

    // sample text with tags
    string str = "<html><head>sdfkashf sdf</head><body>sdfasdf</body></html>";

    // regex that matches tags
    System.Text.RegularExpressions.Regex rx = new System.Text.RegularExpressions.Regex("<[^>]*>");

    // replace all matches with the empty string
    str = rx.Replace(str, "");

    // now str contains the string without HTML tags

2014-09-03 16:02:18

You can use this method to remove HTML tags from a string:

public static String stripHtmlTags(String html) {
    return html.replaceAll("<.*?>", "");
}
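As the question points out, stripping tags this way leaves entities such as `&amp;` untouched. A rough sketch of a follow-up step is shown below; `decodeBasicEntities` is a hypothetical helper that covers only five common entities, so it is an illustration rather than a complete HTML entity decoder.

```java
class EntityDecodeDemo {
    // Same approach as the answer above.
    static String stripHtmlTags(String html) {
        return html.replaceAll("<.*?>", "");
    }

    // Hypothetical helper: decodes only a handful of common entities.
    static String decodeBasicEntities(String text) {
        return text.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&#39;", "'")
                   // "&amp;" goes last so that "&amp;lt;" becomes "&lt;" rather than "<".
                   .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        String stripped = stripHtmlTags("<b>Tom &amp; Jerry</b> &lt;3");
        System.out.println(decodeBasicEntities(stripped));
    }
}
```

Decoding after stripping matters: if entities were decoded first, an escaped `&lt;b&gt;` in the text would turn into a real tag and be removed by the regex.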

2021-03-01 15:44:46
