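How can I prevent XSS (cross-site scripting) using only HTML and PHP?

I have seen many other posts on this topic, but I have not found one that states clearly and concisely how to actually prevent XSS.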


Current answer

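Use htmlspecialchars in PHP. In HTML, try to avoid:

element.innerHTML = "…"; element.outerHTML = "…"; document.write(…); document.writeln(…);

wherever the value being written is controlled by the user.

Obviously, also avoid eval(var). If you have to use any of these, JS-escape the values first and then HTML-escape them. You may need to do more than that, but for the basics this should be enough.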
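
As a rough illustration of the "JS-escape" step, here is a minimal sketch that passes a PHP value into an inline script. The helper name js_literal and the sample value are made up for this example, and it assumes PHP 7.3 or later (for JSON_THROW_ON_ERROR). It relies on json_encode with the JSON_HEX_* flags, which is one common approach rather than the only one.

```php
<?php
// Hypothetical helper (not from the answer above): encode a PHP value as a
// JavaScript literal that can be placed inside an inline <script> block.
function js_literal($value)
{
    // The JSON_HEX_* flags replace <, >, &, ' and " with \uXXXX escapes, so the
    // value cannot close the <script> element or terminate a JS string early.
    return json_encode(
        $value,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
    );
}

$userName = '</script><script>alert(1)</script>'; // hostile sample input

echo "<p id=\"greeting\"></p>\n";
echo "<script>\n";
echo '  var name = ' . js_literal($userName) . ";\n";
// textContent treats the value as text, unlike innerHTML.
echo "  document.getElementById('greeting').textContent = 'Hello, ' + name;\n";
echo "</script>\n";
```

Writing the value through textContent instead of innerHTML keeps the browser from parsing it as markup, which matches the advice above.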

Other answers

<?php
function xss_clean($data)
{
// Fix &entity\n;
$data = str_replace(array('&amp;','&lt;','&gt;'), array('&amp;amp;','&amp;lt;','&amp;gt;'), $data);
$data = preg_replace('/(&#*\w+)[\x00-\x20]+;/u', '$1;', $data);
$data = preg_replace('/(&#x*[0-9A-F]+);*/iu', '$1;', $data);
$data = html_entity_decode($data, ENT_COMPAT, 'UTF-8');

// Remove any attribute starting with "on" or xmlns
$data = preg_replace('#(<[^>]+?[\x00-\x20"\'])(?:on|xmlns)[^>]*+>#iu', '$1>', $data);

// Remove javascript: and vbscript: protocols
$data = preg_replace('#([a-z]*)[\x00-\x20]*=[\x00-\x20]*([`\'"]*)[\x00-\x20]*j[\x00-\x20]*a[\x00-\x20]*v[\x00-\x20]*a[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2nojavascript...', $data);
$data = preg_replace('#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*v[\x00-\x20]*b[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2novbscript...', $data);
$data = preg_replace('#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*-moz-binding[\x00-\x20]*:#u', '$1=$2nomozbinding...', $data);

// Only works in IE: <span style="width: expression(alert('Ping!'));"></span>
$data = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?expression[\x00-\x20]*\([^>]*+>#i', '$1>', $data);
$data = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?behaviour[\x00-\x20]*\([^>]*+>#i', '$1>', $data);
$data = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:*[^>]*+>#iu', '$1>', $data);

// Remove namespaced elements (we do not need them)
$data = preg_replace('#</*\w+:\w[^>]*+>#i', '', $data);

do
{
    // Remove really unwanted tags
    $old_data = $data;
    $data = preg_replace('#</*(?:applet|b(?:ase|gsound|link)|embed|frame(?:set)?|i(?:frame|layer)|l(?:ayer|ink)|meta|object|s(?:cript|tyle)|title|xml)[^>]*+>#i', '', $data);
}
while ($old_data !== $data);

// we are done...
return $data;
}

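You can also set some XSS-related HTTP response headers via header(…):

X-XSS-Protection "1; mode=block"

to make sure the browser's XSS protection mode is enabled.

Content-Security-Policy "default-src 'self'; …"

to enable browser-side content security. For details on Content Security Policy (CSP), see http://content-security-policy.com/. In particular, setting up CSP to block inline scripts and external script sources helps against XSS.

For a general list of useful HTTP response headers concerning the security of your web app, see OWASP: https://www.owasp.org/index.php/List_of_useful_HTTP_headers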
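
A minimal sketch of sending these two headers from PHP might look like the following. The function names security_headers and send_security_headers are hypothetical. The policy shown in the answer is elided, so only its visible default-src 'self' directive is used here; a real policy would likely need more directives.

```php
<?php
// Hypothetical helper: the header lines this sketch sends.
function security_headers()
{
    return array(
        'X-XSS-Protection: 1; mode=block',
        // Only the directive visible in the answer; the rest of the policy is elided there.
        "Content-Security-Policy: default-src 'self'",
    );
}

// Hypothetical helper: header() only works before any output has been sent,
// so this reports failure instead of triggering a warning.
function send_security_headers()
{
    if (headers_sent()) {
        return false;
    }
    foreach (security_headers() as $line) {
        header($line);
    }
    return true;
}

// Call this at the very top of the script, before any echo or HTML.
send_security_headers();
```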

In order of preference:

1. If you are using a templating engine (e.g. Twig, Smarty, Blade), check that it offers context-sensitive escaping. I know from experience that Twig does: {{ var|e('html_attr') }}
2. If you want to allow HTML, use HTML Purifier. Even if you think you only accept Markdown or ReStructuredText, you still want to purify the HTML these markup languages output.
3. Otherwise, use htmlentities($var, ENT_QUOTES | ENT_HTML5, $charset) and make sure the rest of your document uses the same character set as $charset. In most cases, 'UTF-8' is the desired character set.

Also, make sure you escape on output, not on input.
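
To make the "escape on output" point concrete, here is a small sketch of the htmlentities call from the last option. The helper name html_out and the sample value are invented for illustration. The value is kept raw, as it would be in storage, and is only escaped at the moment it is written into HTML.

```php
<?php
// Hypothetical helper: escape a value for an HTML context at output time,
// using the flags suggested in the answer.
function html_out($value, $charset = 'UTF-8')
{
    return htmlentities((string) $value, ENT_QUOTES | ENT_HTML5, $charset);
}

// The raw value is what gets stored; nothing is escaped on input.
$stored = 'Tom & "Jerry" <b>';

// Escaping happens only where the value is written into the page.
echo '<p>' . html_out($stored) . "</p>\n";
```

Keeping the stored value raw means the same data can later be escaped differently for other contexts (JSON, a URL, a plain-text email) without first undoing an HTML encoding.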

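Basically, you need to use the htmlspecialchars() function whenever you want to output something to the browser that came from user input.

The correct way to use this function is something like this: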

echo htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
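
The ENT_QUOTES flag matters when the value ends up inside a single-quoted attribute. The sketch below, using an invented payload, compares it with ENT_COMPAT, which escapes double quotes only.

```php
<?php
$payload = "x' onmouseover='alert(1)"; // hostile sample input

// ENT_COMPAT escapes double quotes only, so the single quotes survive.
$compat = htmlspecialchars($payload, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES escapes both single and double quotes.
$quotes = htmlspecialchars($payload, ENT_QUOTES, 'UTF-8');

// With ENT_COMPAT the payload closes the attribute and adds an event handler.
echo "<input value='" . $compat . "'>\n";

// With ENT_QUOTES the payload stays inside the attribute value.
echo "<input value='" . $quotes . "'>\n";
```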

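Google Code University also has these very educational videos on web security:

How To Break Web Software - A Look at Security Vulnerabilities in Web Software

What Every Engineer Needs to Know About Security and Where to Learn It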

Many frameworks help handle XSS in various ways. When rolling your own, or if there is some XSS concern, you can use filter_input_array (available in PHP 5 >= 5.2.0 and PHP 7). I typically add this snippet to my SessionController, because all calls go through there before any other controller interacts with the data. That way, all user input is sanitized in one central location. If this is done at the beginning of a project, or before your database is poisoned, you shouldn't have any issues at output time: it stops garbage in, garbage out.

/* Prevent XSS input */
$_GET   = filter_input_array(INPUT_GET, FILTER_SANITIZE_STRING);
$_POST  = filter_input_array(INPUT_POST, FILTER_SANITIZE_STRING);
/* I prefer not to use $_REQUEST...but for those who do: */
$_REQUEST = (array)$_POST + (array)$_GET + (array)$_REQUEST;

The above will remove all HTML and script tags. If you need a solution that allows safe tags based on a whitelist, check out HTML Purifier.
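
Note that FILTER_SANITIZE_STRING is deprecated as of PHP 8.1. One possible substitute, shown here only as a sketch, is FILTER_SANITIZE_FULL_SPECIAL_CHARS, which encodes HTML special characters instead of stripping tags, so its behavior differs from the snippet above. The helper name sanitize_input_array is hypothetical, and the example uses filter_var_array on a sample array so it can run outside a web request.

```php
<?php
// Hypothetical helper: encode HTML special characters in every value of a
// flat array of scalars. This encodes tags rather than removing them.
function sanitize_input_array(array $input)
{
    $result = filter_var_array($input, FILTER_SANITIZE_FULL_SPECIAL_CHARS);

    // filter_var_array can return false or null on failure; fall back to an empty array.
    return is_array($result) ? $result : array();
}

// Sample data standing in for $_GET or $_POST.
$sample = array(
    'name'    => 'Alice',
    'comment' => '<script>alert(1)</script>',
);

$clean = sanitize_input_array($sample);
echo $clean['comment'], "\n";
```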


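If your database is already poisoned, or you want to deal with XSS at output time, OWASP recommends creating a custom wrapper function for echo and using it everywhere you output user-supplied values: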

//xss mitigation functions
function xssafe($data,$encoding='UTF-8')
{
   return htmlspecialchars($data,ENT_QUOTES | ENT_HTML401,$encoding);
}
function xecho($data)
{
   echo xssafe($data);
}
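
A short usage sketch follows. The two helpers are repeated (behind function_exists guards) only so the snippet runs on its own, and the sample comment value is invented.

```php
<?php
// The two helpers from above, repeated so this snippet is self-contained.
if (!function_exists('xssafe')) {
    function xssafe($data, $encoding = 'UTF-8')
    {
        return htmlspecialchars($data, ENT_QUOTES | ENT_HTML401, $encoding);
    }
}
if (!function_exists('xecho')) {
    function xecho($data)
    {
        echo xssafe($data);
    }
}

$comment = '<img src=x onerror="alert(1)">'; // hostile sample value

// Static markup is echoed as-is; the user-supplied value goes through xecho().
echo '<div class="comment">';
xecho($comment);
echo "</div>\n";
```

Routing every user-supplied value through one wrapper makes unescaped output easier to spot in review: any plain echo of a variable stands out.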