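Is there a catch-all function that works well for sanitizing user input against SQL injection and XSS attacks, while still allowing certain types of HTML tags?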


Current answer

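The PHP filter extension has many of the functions needed for checking external user input, and it is designed to make data sanitization easier and quicker.

PHP filters can easily sanitize and validate external input.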
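As a rough illustration of the difference between validating and sanitizing with the filter extension, here is a minimal sketch. The sample values and variable names are made up for the example.

```php
<?php
// Validation: returns the converted value on success, false on failure.
$age = filter_var('42', FILTER_VALIDATE_INT, [
    'options' => ['min_range' => 0, 'max_range' => 150],
]);
var_dump($age);     // int(42)

$bad = filter_var('42abc', FILTER_VALIDATE_INT);
var_dump($bad);     // bool(false)

// Sanitization: strips disallowed characters instead of rejecting the value.
$digits = filter_var('42abc', FILTER_SANITIZE_NUMBER_INT);
var_dump($digits);  // string(2) "42"
```

Validation tells you whether to reject the input; sanitization silently changes it, so the two are not interchangeable.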

Other answers

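What you are describing here is two separate issues:

Sanitizing / filtering user input data.
Escaping output.

1) User input should always be assumed to be bad.

Using prepared statements, and/or filtering with mysql_real_escape_string, is definitely a must (note that mysql_real_escape_string was removed in PHP 7.0.0; mysqli_real_escape_string is its replacement). PHP also has filter_input built in, which is a good place to start.

2) This is a large topic, and it depends on the context of the data being output. For HTML there are solutions such as htmlpurifier. As a rule of thumb, always escape anything you output.

Both issues are far too big to go into in a single post, but there are lots of posts which go into more detail:

PHP output

Safer PHP output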

Ways to sanitize user input with PHP:

1. Use modern versions of MySQL and PHP.

2. Set the charset explicitly, using one of:

   $mysqli->set_charset("utf8");

   $pdo = new PDO('mysql:host=localhost;dbname=testdb;charset=UTF8', $user, $password);

   $pdo->exec("set names utf8");

   $pdo = new PDO(
       "mysql:host=$host;dbname=$db", $user, $pass,
       array(
           PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
           PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"
       )
   );

   mysql_set_charset('utf8')   [deprecated in PHP 5.5.0, removed in PHP 7.0.0]

3. Use secure charsets: select utf8, latin1, ascii, etc.; don't use the vulnerable charsets big5, cp932, gb2312, gbk, sjis.

4. Use specialized functions:

   MySQLi prepared statements:

   $stmt = $mysqli->prepare('SELECT * FROM test WHERE name = ? LIMIT 1');
   $param = "' OR 1=1 /*";
   $stmt->bind_param('s', $param);
   $stmt->execute();

   PDO::quote() places quotes around the input string (if required) and escapes special characters within it, using a quoting style appropriate to the underlying driver:

   // explicitly set the character set
   $pdo = new PDO('mysql:host=localhost;dbname=testdb;charset=UTF8', $user, $password);
   // disable emulated prepared statements, to prevent fallback to emulation
   // for statements that MySQL can't prepare natively (to prevent injection)
   $pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
   // not only escapes the literal, but also quotes it (in single-quote ' characters)
   $var = $pdo->quote("' OR 1=1 /*");
   $stmt = $pdo->query("SELECT * FROM test WHERE name = $var LIMIT 1");

   PDO prepared statements: compared with MySQLi prepared statements, PDO supports more database drivers and named parameters:

   // explicitly set the character set
   $pdo = new PDO('mysql:host=localhost;dbname=testdb;charset=UTF8', $user, $password);
   // disable emulated prepared statements, as above
   $pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
   $stmt = $pdo->prepare('SELECT * FROM test WHERE name = ? LIMIT 1');
   $stmt->execute(["' OR 1=1 /*"]);

   mysql_real_escape_string [deprecated in PHP 5.5.0, removed in PHP 7.0.0].

   mysqli_real_escape_string escapes special characters in a string for use in an SQL statement, taking into account the current charset of the connection. Prepared statements are still recommended, because they are not simply escaped strings: a statement comes with a complete query execution plan, including which tables and indexes it would use, which is an optimized approach.

   Use single quotes (' ') around your variables inside your query.

5. Check that the variable contains what you expect:

   If you expect an integer, use:

   ctype_digit(): check for numeric character(s)
   $value = (int) $value;
   $value = intval($value);
   $var = filter_var('0755', FILTER_VALIDATE_INT, $options);

   For strings, use:

   is_string(): find whether the type of a variable is string

   Use the filter functions:

   filter_var(): filters a variable with a specified filter (see the manual for more predefined filters):
   $email = filter_var($email, FILTER_SANITIZE_EMAIL);
   $newstr = filter_var($str, FILTER_SANITIZE_STRING);   // FILTER_SANITIZE_STRING is deprecated as of PHP 8.1

   filter_input(): gets a specific external variable by name and optionally filters it:
   $search_html = filter_input(INPUT_GET, 'search', FILTER_SANITIZE_SPECIAL_CHARS);

   preg_match(): perform a regular expression match

   Write your own validation function.
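The "check that the variable contains what you expect" step could look roughly like the sketch below. Both helper names and the username rule (3 to 20 letters, digits or underscores) are assumptions made for illustration, not part of any library.

```php
<?php
// Hypothetical helper: accept only non-empty strings made of ASCII digits,
// e.g. a record ID taken from the query string.
function isDigitString($value): bool
{
    return is_string($value) && $value !== '' && ctype_digit($value);
}

// Hypothetical helper: 3 to 20 letters, digits or underscores.
// \A and \z anchor the whole string, so a trailing newline is rejected too.
function isValidUsername($value): bool
{
    return is_string($value)
        && preg_match('/\A[A-Za-z0-9_]{3,20}\z/', $value) === 1;
}
```

Checks like these complement prepared statements rather than replace them: they reject obviously wrong input early, while the prepared statement handles whatever gets through.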

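It is a common misconception that user input can be filtered. PHP even had a (since removed) "feature", called magic quotes, that builds on this idea. It's nonsense. Forget about filtering (or cleaning, or whatever people call it).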

What you should do to avoid problems is quite simple: whenever you embed a piece of data within foreign code, you must treat it according to the formatting rules of that code. You must also understand that such rules can be too complicated to follow manually. For example, in SQL the rules for strings, numbers and identifiers are all different. For your convenience, in most cases there is a dedicated tool for such embedding. For example, when you need to use a PHP variable in an SQL query, you have to use a prepared statement, which will take care of all the proper formatting and treatment.
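A minimal sketch of the prepared-statement case follows. It assumes the pdo_sqlite extension is available so that it can run against an in-memory database; the users table, its rows and the findUserByName helper are invented for the example.

```php
<?php
// Hypothetical helper: look up rows by name using a named placeholder.
function findUserByName(PDO $pdo, string $name): array
{
    // The placeholder keeps the data separate from the SQL text.
    $stmt = $pdo->prepare('SELECT id, name FROM users WHERE name = :name');
    $stmt->execute([':name' => $name]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Throwaway in-memory database, used only for this demonstration.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice'), ('bob')");

// Returns no rows: the payload is compared as a literal name.
$rows = findUserByName($pdo, "' OR 1=1 /*");
$alice = findUserByName($pdo, 'alice');
```

The injection payload never becomes part of the SQL text, so no escaping function has to be called by hand.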

Another example is HTML: if you embed strings within HTML markup, you must escape them with htmlspecialchars. This means that every single echo or print statement should use htmlspecialchars.
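One way to make this habit cheap is a tiny wrapper. The helper name e() is an assumption for this sketch; the flags and charset are passed explicitly so the behaviour does not depend on defaults.

```php
<?php
// Hypothetical helper: escape a string for use in HTML text or attribute values.
function e(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$comment = '<script>alert("xss")</script>';
echo '<p>' . e($comment) . '</p>', PHP_EOL;
```

The browser then shows the markup as text instead of running it.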

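A third example could be shell commands: if you are going to embed strings (such as arguments) into external commands and call them with exec, then you must use escapeshellcmd and escapeshellarg.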
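A small sketch of the argument case, assuming a Unix-like system with the wc tool. The helper name and file name are made up, and the exec call is left commented out so the example has no side effects.

```php
<?php
// Hypothetical helper: build a command line for the external `wc` tool.
function buildWordCountCommand(string $path): string
{
    // escapeshellarg() quotes the value so the shell sees exactly one argument.
    return 'wc -l ' . escapeshellarg($path);
}

$cmd = buildWordCountCommand('notes.txt; rm -rf /');
// exec($cmd, $outputLines, $exitCode);  // not executed in this sketch
```

escapeshellarg is meant for individual arguments, while escapeshellcmd is applied to a whole command string.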

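Yet another very compelling example is JSON. The rules are so numerous and complicated that you would never be able to follow them all manually. That is why you should never create a JSON string by hand, but always use a dedicated function, json_encode(), which will correctly format every bit of data.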
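For instance, quotes, backslashes and control characters all need different treatment in JSON, and json_encode() handles each of them. The sample data below is invented; JSON_THROW_ON_ERROR requires PHP 7.3 or later.

```php
<?php
$data = [
    'name'  => 'Say "hi" \ bye',
    'tags'  => ['<b>', "line\nbreak"],
    'price' => 9.5,
];

// JSON_THROW_ON_ERROR raises JsonException instead of returning false.
$json = json_encode($data, JSON_THROW_ON_ERROR);

// Round trip: decoding yields the original structure.
$decoded = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
```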

And so on and so forth...

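The only case where you need to actively filter data is when you accept preformatted input, for example if you let your users post HTML markup that you plan to display on the site. However, you would be wise to avoid this at all costs, since no matter how well you filter it, it will always be a potential security hole.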

PHP now has the nice filter_input functions, which for instance liberate you from hunting for "the ultimate e-mail regex" now that there is a built-in FILTER_VALIDATE_EMAIL type.
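A minimal sketch of e-mail validation with the built-in filter. The helper name is hypothetical, and filter_var is used here so the example runs outside a web request; with real form data, filter_input(INPUT_POST, 'email', FILTER_VALIDATE_EMAIL) applies the same filter directly to the request variable.

```php
<?php
// Hypothetical helper: return the address if it passes validation, otherwise null.
function validEmailOrNull(string $input): ?string
{
    $email = filter_var(trim($input), FILTER_VALIDATE_EMAIL);
    return $email === false ? null : $email;
}
```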


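My own filter class (which uses JavaScript to highlight faulty fields) can be initiated by either an ajax request or a normal form post (see the example below).

<?php
/**
 * Pork Formvalidator. Validates fields by regexes and can sanitize them.
 * Uses PHP filter_var built-in functions and extra regexes.
 * @package pork
 */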

/**
 *  Pork.FormValidator
 *  Validates arrays or properties by setting up simple arrays. 
 *  Note that some of the regexes are for dutch input!
 *  Example:
 * 
 *  $validations = array('name' => 'anything','email' => 'email','alias' => 'anything','pwd'=>'anything','gsm' => 'phone','birthdate' => 'date');
 *  $required = array('name', 'email', 'alias', 'pwd');
 *  $sanitize = array('alias');
 *
 *  $validator = new FormValidator($validations, $required, $sanitize);
 *                  
 *  if($validator->validate($_POST))
 *  {
 *      $_POST = $validator->sanitize($_POST);
 *      // now do your saving, $_POST has been sanitized.
 *      die($validator->getScript()."<script type='text/javascript'>alert('saved changes');</script>");
 *  }
 *  else
 *  {
 *      die($validator->getScript());
 *  }   
 *  
 * To validate just one element:
 * $validated = FormValidator::validateItem('blah@bla.', 'email');
 * 
 * To sanitize just one element:
 * $sanitized = FormValidator::sanitizeItem('<b>blah</b>', 'string');
 * 
 * @package pork
 * @author SchizoDuckie
 * @copyright SchizoDuckie 2008
 * @version 1.0
 * @access public
 */
class FormValidator
{
    public static $regexes = Array(
            'date' => "^[0-9]{1,2}[-/][0-9]{1,2}[-/][0-9]{4}\$",
            'amount' => "^[-]?[0-9]+\$",
            'number' => "^[-]?[0-9,]+\$",
            // The hyphen goes last so it is a literal, not a range from '.' to '_'.
            'alfanum' => "^[0-9a-zA-Z ,._\\s\?\!-]+\$",
            'not_empty' => "[a-z0-9A-Z]+",
            'words' => "^[A-Za-z]+[A-Za-z \\s]*\$",
            'phone' => "^[0-9]{10,11}\$",
            'zipcode' => "^[1-9][0-9]{3}[a-zA-Z]{2}\$",
            'plate' => "^([0-9a-zA-Z]{2}[-]){2}[0-9a-zA-Z]{2}\$",
            'price' => "^[0-9.,]*(([.,][-])|([.,][0-9]{2}))?\$",
            '2digitopt' => "^\d+(\,\d{2})?\$",
            '2digitforce' => "^\d+\,\d\d\$",
            'anything' => "^[\d\D]{1,}\$"
    );
    private $validations, $sanitations, $mandatories, $errors, $corrects, $fields;
    

    public function __construct($validations = array(), $mandatories = array(), $sanitations = array())
    {
        $this->validations = $validations;
        $this->sanitations = $sanitations;
        $this->mandatories = $mandatories;
        $this->errors = array();
        $this->corrects = array();
    }

    /**
     * Validates an array of items (if needed) and returns true or false
     *
     */
    public function validate($items)
    {
        $this->fields = $items;
        $havefailures = false;
        foreach($items as $key=>$val)
        {
            // Skip fields that are empty or have no validation rule, unless they are mandatory.
            if((strlen($val) == 0 || !array_key_exists($key, $this->validations)) && !in_array($key, $this->mandatories, true))
            {
                $this->corrects[] = $key;
                continue;
            }
            $result = self::validateItem($val, $this->validations[$key]);
            if($result === false) {
                $havefailures = true;
                $this->addError($key, $this->validations[$key]);
            }
            else
            {
                $this->corrects[] = $key;
            }
        }
    
        return(!$havefailures);
    }

    /**
     *
     *  Adds the unvalidated class to those elements that are not validated. Removes it from those that are.
     */
    public function getScript() {
        // Start empty so the appends below work even when there are no errors.
        $output = '';
        if(!empty($this->errors))
        {
            $errors = array();
            foreach($this->errors as $key=>$val) { $errors[] = "'INPUT[name={$key}]'"; }

            $output .= '$$('.implode(',', $errors).').addClass("unvalidated");'; 
            $output .= "new FormValidator().showMessage();";
        }
        if(!empty($this->corrects))
        {
            $corrects = array();
            foreach($this->corrects as $key) { $corrects[] = "'INPUT[name={$key}]'"; }
            $output .= '$$('.implode(',', $corrects).').removeClass("unvalidated");';   
        }
        $output = "<script type='text/javascript'>{$output} </script>";
        return($output);
    }


    /**
     *
     * Sanitizes an array of items according to the $this->sanitations
     * sanitations will be standard of type string, but can also be specified.
     * For ease of use, this syntax is accepted:
     * $sanitations = array('fieldname', 'otherfieldname'=>'float');
     */
    public function sanitize($items)
    {
        foreach($items as $key=>$val)
        {
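            $isListed = in_array($key, $this->sanitations, true);
            $isKeyed = array_key_exists($key, $this->sanitations);
            if(!$isListed && !$isKeyed) continue;
            // A keyed entry names the sanitation type; otherwise fall back to the
            // field's validation type, or 'string' if it has none.
            $type = $isKeyed ? $this->sanitations[$key] : ($this->validations[$key] ?? 'string');
            $items[$key] = self::sanitizeItem($val, $type);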
        }
        return($items);
    }


    /**
     *
     * Adds an error to the errors array.
     */ 
    private function addError($field, $type='string')
    {
        $this->errors[$field] = $type;
    }

    /**
     *
     * Sanitize a single var according to $type.
     * Allows for static calling to allow simple sanitization
     */
    public static function sanitizeItem($var, $type)
    {
        $flags = 0; // filter_var() expects an int or array here; null is deprecated in PHP 8.1+
        switch($type)
        {
            case 'url':
                $filter = FILTER_SANITIZE_URL;
            break;
            case 'int':
                $filter = FILTER_SANITIZE_NUMBER_INT;
            break;
            case 'float':
                $filter = FILTER_SANITIZE_NUMBER_FLOAT;
                $flags = FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_THOUSAND;
            break;
            case 'email':
                $var = substr($var, 0, 254);
                $filter = FILTER_SANITIZE_EMAIL;
            break;
            case 'string':
            default:
                // Note: FILTER_SANITIZE_STRING is deprecated as of PHP 8.1.
                $filter = FILTER_SANITIZE_STRING;
                $flags = FILTER_FLAG_NO_ENCODE_QUOTES;
            break;
             
        }
        $output = filter_var($var, $filter, $flags);        
        return($output);
    }
    
    /** 
     *
     * Validates a single var according to $type.
     * Allows for static calling to allow simple validation.
     *
     */
    public static function validateItem($var, $type)
    {
        if(array_key_exists($type, self::$regexes))
        {
            $returnval =  filter_var($var, FILTER_VALIDATE_REGEXP, array("options"=> array("regexp"=>'!'.self::$regexes[$type].'!i'))) !== false;
            return($returnval);
        }
        $filter = false;
        switch($type)
        {
            case 'email':
                $var = substr($var, 0, 254);
                $filter = FILTER_VALIDATE_EMAIL;    
            break;
            case 'int':
                $filter = FILTER_VALIDATE_INT;
            break;
            case 'boolean':
                $filter = FILTER_VALIDATE_BOOLEAN;
            break;
            case 'ip':
                $filter = FILTER_VALIDATE_IP;
            break;
            case 'url':
                $filter = FILTER_VALIDATE_URL;
            break;
        }
        // Unparenthesized nested ternaries are a parse error in PHP 8, so state the result directly.
        return ($filter === false) ? false : (filter_var($var, $filter) !== false);
    }       
    


}

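Now, keep in mind that you also need to do your SQL query escaping depending on which type of DB you are using (mysql_real_escape_string() is useless for an SQL server, for instance). You probably want to handle this automatically at the appropriate application layer, such as an ORM. Also, as mentioned above: for outputting to HTML, use the other PHP dedicated functions such as htmlspecialchars ;)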

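For really allowing HTML input with classes and/or tags, depend on one of the dedicated XSS validation packages. Do not write your own regexes to parse HTML!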