Strip HTML from text in JavaScript

Is there an easy way to take a string of HTML in JavaScript and strip out the HTML?

Current answer

Strip the HTML with two simple lines of jQuery.

 var content = "<p>checking the html source&nbsp;</p><p>&nbsp;" +
   "</p><p>with&nbsp;</p><p>all</p><p>the html&nbsp;</p><p>content</p>";

 var text = $(content).text(); // gets you the plain text
 console.log(text); // check the data in your console

 $("#text_area_id").val(text); // set your content to the text area using text_area_id

2013-07-05 09:18:26

Other answers

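Method 1:

function cleanHTML(str){
  // Rewrite each <...> as &lt;...&gt; so it is displayed as text, not parsed as a tag
  return str.replace(/<(.*?)>/g, '&lt;$1&gt;');
}

function uncleanHTML(str){
  // Reverse of cleanHTML: turn &lt;...&gt; back into <...>
  return str.replace(/&lt;(.*?)&gt;/g, '<$1>');
}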

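Method 2:

function cleanHTML(str){
  // Escape every angle bracket, whether or not it belongs to a tag
  return str.replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

function uncleanHTML(str){
  // Reverse of cleanHTML
  return str.replace(/&lt;/g, '<').replace(/&gt;/g, '>');
}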

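Also, don't forget that if a user happens to post a math comment (e.g. 1 < 2), you don't want to strip the whole comment. The browser (only tested in Chrome) does not run character entities as HTML tags. If you replace every < in the string with &lt;, the entity is displayed as a literal < without running any HTML. I recommend method 2. jQuery also works well: $('#element').text();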
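Method 2 only touches < and >. If the escaped text may later be unescaped, or may land inside an attribute value, it is usually worth escaping & and quotes as well. Below is a minimal sketch of that idea; the names escapeHtml and unescapeHtml are hypothetical, not from the answer above, and the set of five characters is an assumption about what needs covering.

```javascript
// Hypothetical helpers illustrating a fuller version of method 2.
const ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
const UNESCAPES = { '&amp;': '&', '&lt;': '<', '&gt;': '>', '&quot;': '"', '&#39;': "'" };

function escapeHtml(str) {
  // One pass over the input, so an "&" produced by a replacement is never escaped again.
  return String(str).replace(/[&<>"']/g, (ch) => ESCAPES[ch]);
}

function unescapeHtml(str) {
  // One pass as well: "&amp;lt;" becomes "&lt;", not "<".
  return String(str).replace(/&(?:amp|lt|gt|quot|#39);/g, (entity) => UNESCAPES[entity]);
}

console.log(escapeHtml('1 < 2 && 3 > 2')); // prints "1 &lt; 2 &amp;&amp; 3 &gt; 2"
```

Doing each direction in a single replace call avoids the ordering problems that chained replaces can run into once & is part of the set.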

2019-12-14 21:28:33

You can also use the excellent htmlparser2 pure-JS HTML parser. Here is a working demo:

var htmlparser = require('htmlparser2');

var body = '<p><div>This is </div>a <span>simple </span> <img src="test"></img>example.</p>';

var result = [];

var parser = new htmlparser.Parser({
    ontext: function(text){
        result.push(text);
    }
}, {decodeEntities: true});

parser.write(body);
parser.end();

result.join('');

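The output will be: This is a simple example.

See it in action here: https://tonicdev.com/jfahrenkrug/extract-text-from-html

This works in both Node and the browser if you bundle your web application with a tool like webpack.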

2015-12-29 19:11:59

An improvement to the accepted answer.

function strip(html)
{
   var tmp = document.implementation.createHTMLDocument("New").body;
   tmp.innerHTML = html;
   return tmp.textContent || tmp.innerText || "";
}

This way, something like the following will do no harm:

strip("<img onerror='alert(\"could run arbitrary JS here\")' src=bogus>")

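Firefox, Chromium and Internet Explorer 9+ are safe. Opera Presto is still vulnerable. In addition, images mentioned in the string are not downloaded in Chromium and Firefox, which saves HTTP requests.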
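A related approach, not part of the answer above, is to parse the string with the browser's DOMParser, which also builds a separate document rather than inserting the markup into the live page. The sketch below is only an illustration: the name stripWithDomParser and its optional parser argument are hypothetical, and it has not been checked against the browser list above, so test it in the browsers you care about before relying on it.

```javascript
// Hypothetical variant of strip() built on DOMParser.
// The optional `parser` argument exists only so the function can be
// exercised outside a browser; in a page, omit it.
function stripWithDomParser(html, parser = new DOMParser()) {
  // parseFromString returns a new Document that is separate from the page's own document.
  const doc = parser.parseFromString(String(html), 'text/html');
  return doc.body.textContent || '';
}

// Only runs where DOMParser exists (browsers); skipped silently elsewhere.
if (typeof DOMParser !== 'undefined') {
  console.log(stripWithDomParser('<p>Hello <b>world</b></p>'));
}
```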

2013-07-31 20:14:59

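A safer way to strip HTML with jQuery is to first use jQuery.parseHTML to create the DOM nodes, ignoring any scripts, then let jQuery wrap those elements, and finally retrieve only the text.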

function stripHtml(unsafe) {
    return $($.parseHTML(unsafe)).text();
}

This can safely strip the HTML from:

<img src="unknown.gif" onerror="console.log('running injections');">

And other exploits.

nJoy！

2019-03-25 20:44:36
