我如何使用jQuery解码字符串中的HTML实体?
当前回答
您不需要jQuery来解决这个问题,因为它会产生一些开销和依赖。
我知道这里有很多好的答案,但由于我实现了一个有点不同的方法,我想分享一下。
这段代码是一种非常安全的安全方法,因为转义处理程序依赖于浏览器,而不是函数。因此,如果将来会发现某些漏洞,则覆盖此解决方案。
const decodeHTMLEntities = text => {
// Create a new element or use one from cache, to save some element creation overhead
const el = decodeHTMLEntities.__cache_data_element
= decodeHTMLEntities.__cache_data_element
|| document.createElement('div');
const enc = text
// Prevent any mixup of existing pattern in text
.replace(/⪪/g, '⪪#')
// Encode entities in special format. This will prevent native element encoder to replace any amp characters
.replace(/&([a-z1-8]{2,31}|#x[0-9a-f]+|#\d+);/gi, '⪪$1⪫');
// Encode any HTML tags in the text to prevent script injection
el.textContent = enc;
// Decode entities from special format, back to their original HTML entities format
el.innerHTML = el.innerHTML
.replace(/⪪([a-z1-8]{2,31}|#x[0-9a-f]+|#\d+)⪫/gi, '&$1;')
.replace(/⪪#/g, '⪪');
// Get the decoded HTML entities
const dec = el.textContent;
// Clear the element content, in order to preserve a bit of memory (in case the text is big)
el.textContent = '';
return dec;
}
// Example
console.log(decodeHTMLEntities("<script>alert('∳∳∳∳⪪#x02233⪫');</script>"));
// Prints: <script>alert('∳∳∳∳⪪#x02233⪫');</script>
顺便说一下,我选择使用字符⪪和⪫,因为它们很少被使用,因此通过匹配它们影响性能的几率显著降低。
其他回答
对于ExtJS用户,如果你已经有了编码的字符串,例如当一个库函数的返回值是innerHTML内容时,考虑这个ExtJS函数:
Ext.util.Format.htmlDecode(innerHtmlContent)
您可以使用该库,从https://github.com/mathiasbynens/he获得
例子:
console.log(he.decode("Jörg & Jürgen rocked to & fro "));
// Logs "Jörg & Jürgen rocked to & fro"
我质疑这个库的作者,是否有任何理由在客户端代码中使用这个库来支持<textarea>黑客提供的其他答案在这里和其他地方。他提供了一些可能的理由:
If you're using node.js serverside, using a library for HTML encoding/decoding gives you a single solution that works both clientside and serverside. Some browsers' entity decoding algorithms have bugs or are missing support for some named character references. For example, Internet Explorer will both decode and render non-breaking spaces ( ) correctly but report them as ordinary spaces instead of non-breaking ones via a DOM element's innerText property, breaking the <textarea> hack (albeit only in a minor way). Additionally, IE 8 and 9 simply don't support any of the new named character references added in HTML 5. The author of he also hosts a test of named character reference support at http://mathias.html5.org/tests/html/named-character-references/. In IE 8, it reports over one thousand errors. If you want to be insulated from browser bugs related to entity decoding and/or be able to handle the full range of named character references, you can't get away with the <textarea> hack; you'll need a library like he. He just darn well feels like doing things this way is less hacky.
这里还有一个问题: 转义字符串分配给输入值时看起来不可读
var string = _.escape("<img src=fake onerror=alert('boo!')>");
$('input').val(string);
Exapmle: https://jsfiddle.net/kjpdwmqa/3/
使用jQuery解码HTML实体,只需使用这个函数:
function html_entity_decode(txt){
var randomID = Math.floor((Math.random()*100000)+1);
$('body').append('<div id="random'+randomID+'"></div>');
$('#random'+randomID).html(txt);
var entity_decoded = $('#random'+randomID).html();
$('#random'+randomID).remove();
return entity_decoded;
}
使用方法:
Javascript:
var txtEncoded = "á é í ó ú";
$('#some-id').val(html_entity_decode(txtEncoded));
HTML:
<input id="some-id" type="text" />
安全注意:使用这个答案(下面保留其原始形式)可能会在您的应用程序中引入XSS漏洞。你不应该用这个答案。阅读lucascaro对这个答案中漏洞的解释,并使用该答案或Mark Amery的答案中的方法。
实际上,试一试
var encodedStr = "This is fun &东西”; var解码= $ (" < div / > ") . html (encodedStr)。text (); console.log(解码); < script src = " https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js " > < /脚本> < div / >
推荐文章
- 使伸缩项目正确浮动
- Babel 6改变了它导出默认值的方式
- 如何配置历史记录?
- ES6模板文字可以在运行时被替换(或重用)吗?
- [Vue警告]:找不到元素
- 可以在setInterval()内部调用clearInterval()吗?
- AngularJS控制器的生命周期是什么?
- 无法读取未定义的属性“msie”- jQuery工具
- 形式内联内的形式水平在twitter bootstrap?
- 我的蛋蛋怎么不见了?
- JavaScript中的排列?
- 自定义元素在HTML5中有效吗?
- JavaScript中有睡眠/暂停/等待功能吗?
- 如何触发自动填充在谷歌Chrome?
- jQuery:执行同步AJAX请求