Parameterize an SQL IN clause

How do I parameterize a query containing an IN clause with a variable number of arguments, like this one?

SELECT * FROM Tags 
WHERE Name IN ('ruby','rails','scruffy','rubyonrails')
ORDER BY Count DESC

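In this query, the number of arguments can be anywhere from 1 to 5.

I would prefer not to use a dedicated stored procedure (or XML) for this, but if there is some elegant way specific to SQL Server 2008, I am open to that.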

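Current answer

For SQL Server 2008, you can use a table-valued parameter. It's a bit of work, but it is arguably cleaner than my other method.

First, you have to create a type: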

CREATE TYPE dbo.TagNamesTableType AS TABLE ( Name nvarchar(50) )

Then your ADO.NET code looks like this:

// Requires: using System.Collections.Generic; using System.Data; using System.Data.SqlClient;
//           using System.Linq; using Microsoft.SqlServer.Server;
string[] tags = new string[] { "ruby", "rails", "scruffy", "rubyonrails" };
cmd.CommandText = "SELECT Tags.* FROM Tags JOIN @tagNames AS P ON Tags.Name = P.Name";

// The value must be an IEnumerable<SqlDataRecord>
cmd.Parameters.AddWithValue("@tagNames", tags.AsSqlDataRecord("Name")).SqlDbType = SqlDbType.Structured;
cmd.Parameters["@tagNames"].TypeName = "dbo.TagNamesTableType";

// Extension method for converting IEnumerable<string> to IEnumerable<SqlDataRecord>.
// Extension methods must live in a static class; the class name here is arbitrary.
public static class SqlDataRecordExtensions
{
    public static IEnumerable<SqlDataRecord> AsSqlDataRecord(this IEnumerable<string> values, string columnName)
    {
        if (values == null || !values.Any()) return null; // Annoying, but SqlClient wants null instead of 0 rows
        var metadata = new SqlMetaData(columnName, SqlDbType.NVarChar, 50); // 50 as per the SQL type
        return values.Select(v =>
        {
            var r = new SqlDataRecord(metadata);
            r.SetValues(v);
            return r;
        });
    }
}

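Update, per @Doug

Please avoid var metadata = SqlMetaData.InferFromValue(firstRecord, columnName);

It takes the length from the first value, so if the first value is 3 characters long, the maximum length is set to 3 and any other record longer than 3 characters is truncated.

Instead, use: var metadata = new SqlMetaData(columnName, SqlDbType.NVarChar, maxLen);

Note: pass -1 for max length.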

2008-12-03 16:53:19

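Other answers

You can pass the parameter as a string.

So you have the string:

DECLARE @tags VARCHAR(8000)

SET @tags = 'ruby|rails|scruffy|rubyonrails'

select * from Tags 
where Name in (SELECT item from dbo.fnSplit(@tags, '|'))
order by Count desc

Then all you have to do is pass the string as a single parameter.

Here is the split function I use.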

CREATE FUNCTION [dbo].[fnSplit](
    @sInputList VARCHAR(8000) -- List of delimited items
  , @sDelimiter VARCHAR(8000) = ',' -- delimiter that separates items
) RETURNS @List TABLE (item VARCHAR(8000))

BEGIN
DECLARE @sItem VARCHAR(8000)
WHILE CHARINDEX(@sDelimiter,@sInputList,0) <> 0
 BEGIN
 SELECT
  @sItem=RTRIM(LTRIM(SUBSTRING(@sInputList,1,CHARINDEX(@sDelimiter,@sInputList,0)-1))),
  @sInputList=RTRIM(LTRIM(SUBSTRING(@sInputList,CHARINDEX(@sDelimiter,@sInputList,0)+LEN(@sDelimiter),LEN(@sInputList))))

 IF LEN(@sItem) > 0
  INSERT INTO @List SELECT @sItem
 END

IF LEN(@sInputList) > 0
 INSERT INTO @List SELECT @sInputList -- Put the last item in
RETURN
END
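To make the behavior of the split function easier to see, here is a rough Python equivalent. It is only a sketch: the name fn_split is made up for this illustration, and it assumes the T-SQL version's behavior of trimming spaces around each item and skipping empty items.

```python
from typing import List


def fn_split(input_list: str, delimiter: str = ",") -> List[str]:
    """Rough Python counterpart of the T-SQL fnSplit above (illustrative only)."""
    items = []
    for raw in input_list.split(delimiter):
        item = raw.strip(" ")  # LTRIM/RTRIM remove spaces only
        if item:               # empty items are skipped, as in the T-SQL version
            items.append(item)
    return items


tags = fn_split("ruby| rails ||scruffy|rubyonrails", "|")
print(tags)
```

Note that a tag containing the delimiter itself cannot be passed this way, so the delimiter has to be a character that never occurs in the values.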

2008-12-03 16:27:11

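Here is another alternative. Just pass a comma-delimited list as a string parameter to the stored procedure:

CREATE PROCEDURE [dbo].[sp_myproc]
    @UnitList varchar(MAX) = '1,2,3'
AS
-- SomeColumn and SomeTable are placeholders for your own column and table
select ph.SomeColumn from SomeTable ph
where ph.UnitID in (select IntValue from dbo.CsvToInt(@UnitList))

And the function: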

CREATE Function [dbo].[CsvToInt] ( @Array varchar(MAX))
returns @IntTable table
(IntValue int)
AS
begin
    declare @separator char(1)
    set @separator = ','
    declare @separator_position int
    declare @array_value varchar(MAX)

    set @array = @array + ','

    while patindex('%,%' , @array) <> 0
    begin

        select @separator_position = patindex('%,%' , @array)
        select @array_value = left(@array, @separator_position - 1)

        Insert @IntTable
        Values (Cast(@array_value as int))
        select @array = stuff(@array, 1, @separator_position, '')
    end
    return
end

2013-04-06 02:39:54

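This is possibly a slightly nasty way of doing it, but I used it once and it was rather effective.

Depending on your goals, it might be of use.

Create a temp table with one column. INSERT each lookup value into that column. Then, instead of using IN, just use your standard JOIN rules. (Flexibility++)

This gives you some added flexibility in what you can do, but it is better suited to situations where you have a large table to query, with good indexing, and you want to use the parameterized list more than once. It saves executing the query twice and doing all the sanitization by hand.

I never got around to profiling exactly how fast it was, but in my situation it was needed.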
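The temp-table idea can be sketched as follows. This is a minimal illustration in Python with the standard sqlite3 module rather than SQL Server; the TagLookup table, the find_tags helper and the sample rows are all assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tags (Name TEXT PRIMARY KEY, Count INTEGER)")
conn.executemany(
    "INSERT INTO Tags (Name, Count) VALUES (?, ?)",
    [("ruby", 120), ("rails", 95), ("python", 300), ("scruffy", 2)],
)


def find_tags(conn, names):
    """Load the lookup values into a one-column temp table, then JOIN on it."""
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS TagLookup (Name TEXT PRIMARY KEY)")
    conn.execute("DELETE FROM TagLookup")
    # Each value is bound as a parameter, so no manual quoting is needed.
    conn.executemany(
        "INSERT OR IGNORE INTO TagLookup (Name) VALUES (?)",
        [(n,) for n in names],
    )
    return conn.execute(
        "SELECT t.Name, t.Count FROM Tags AS t "
        "JOIN TagLookup AS l ON l.Name = t.Name "
        "ORDER BY t.Count DESC"
    ).fetchall()


rows = find_tags(conn, ["ruby", "rails", "scruffy", "rubyonrails"])
print(rows)
```

Once the lookup table is filled, it can be joined by several queries without re-sending the list, which is the reuse the answer has in mind.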

2008-12-03 17:04:00

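The original question was "How do I parameterize a query ..."

This is not an answer to that original question. There are already some good demonstrations of that in other answers.

See Mark Brackett's first answer (the one that begins "You can parameterize each value") and Mark Brackett's second answer, which is the preferred answer that I (and 231 others) upvoted. The approach given in his answer allows 1) effective use of bind variables and 2) predicates that are sargable.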
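For readers who have not seen it, the "parameterize each value" idea referred to here can be sketched roughly like this. The sketch uses Python's sqlite3 module, not ADO.NET, and the select_tags helper and sample rows are assumptions for illustration; it is not a copy of that answer's code.

```python
import sqlite3


def select_tags(conn, names):
    """Build one placeholder per value; only placeholders are put into the SQL text."""
    names = list(names)
    if not names:
        return []  # an empty IN list is not valid in most SQL dialects
    placeholders = ", ".join("?" for _ in names)
    sql = (
        "SELECT Name, Count FROM Tags "
        f"WHERE Name IN ({placeholders}) ORDER BY Count DESC"
    )
    return conn.execute(sql, names).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tags (Name TEXT PRIMARY KEY, Count INTEGER)")
conn.executemany(
    "INSERT INTO Tags VALUES (?, ?)",
    [("ruby", 120), ("rails", 95), ("o'reilly", 7)],
)
print(select_tags(conn, ["rails", "o'reilly", "missing"]))
```

The values never become part of the SQL text, so a name containing a quote needs no special handling, and the predicate remains a plain equality test against the column.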

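Selected answer

What I want to address here is the approach given in Joel Spolsky's answer, the answer "selected" as the right answer. (That approach tests each Name against a delimited list of search terms with a LIKE predicate, as in the queries shown here.)

Joel Spolsky's approach is clever. It works reasonably well, and it exhibits predictable behavior and predictable performance given "normal" values and the normative edge cases, such as NULL and the empty string. It may be sufficient for a particular application.

But in terms of generalizing this approach, let's also consider the more obscure corner cases, such as when the Name column contains a wildcard character (as recognized by the LIKE predicate). The wildcard character I see most commonly used is % (a percent sign). Let's deal with that first, and then go on to other cases.

Some problems with the % character

Consider a Name value of 'pe%ter'. (For the examples here, I use a literal string value in place of the column name.) A row with a Name value of 'pe%ter' would be returned by a query of the form: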

select ...
 where '|peanut|butter|' like '%|' + 'pe%ter' + '|%'

But that same row will not be returned if the order of the search terms is reversed:

select ...
 where '|butter|peanut|' like '%|' + 'pe%ter' + '|%'

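The behavior we observe is kind of odd. Changing the order of the search terms in the list changes the result set.

It almost goes without saying that we might not want pe%ter to match peanut butter, no matter how much he likes it.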
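The order dependence is easy to reproduce. The following sketch uses Python's sqlite3 module as a stand-in for SQL Server (SQLite spells string concatenation as || instead of +, but % behaves the same way in LIKE); the helper name name_in_list is an assumption for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def name_in_list(delimited_list, name):
    """Evaluate: delimited_list LIKE '%|' + name + '|%' (written with || for SQLite)."""
    row = conn.execute(
        "SELECT ? LIKE ('%|' || ? || '|%')", (delimited_list, name)
    ).fetchone()
    return bool(row[0])


print(name_in_list("|peanut|butter|", "pe%ter"))  # True: 'pe' ... 'ter|' spans two terms
print(name_in_list("|butter|peanut|", "pe%ter"))  # False: no 'ter|' follows '|pe'
```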

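Obscure corner case

(Yes, I agree that this is an obscure case, and probably one that is not likely to be tested. We wouldn't expect a wildcard in a column value. We may assume that the application prevents such a value from being stored. But in my experience, I've rarely seen a database constraint that specifically disallows characters or patterns that would be treated as wildcards on the right side of a LIKE comparison operator.)

Patching a hole

One approach to patching this hole is to escape the % wildcard character. (For anyone not familiar with the ESCAPE clause on the LIKE operator, see the SQL Server documentation.)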

select ...
 where '|peanut|butter|'
  like '%|' + 'pe\%ter' + '|%' escape '\'

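Now we can match the literal %. Of course, when we have a column name, we need to escape the wildcard dynamically. We can use the REPLACE function to find occurrences of the % character and insert a backslash character in front of each one, like this: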

select ...
 where '|pe%ter|'
  like '%|' + REPLACE( 'pe%ter' ,'%','\%') + '|%' escape '\'

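So that solves the problem with the % wildcard. Almost.

Escape the escape

We recognize that our solution has introduced another problem: the escape character. We also need to escape any occurrences of the escape character itself. This time, we use ! as the escape character: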

select ...
 where '|pe%t!r|'
  like '%|' + REPLACE(REPLACE( 'pe%t!r' ,'!','!!'),'%','!%') + '|%' escape '!'

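The underscore too

Now that we're on a roll, we can add another REPLACE to handle the underscore wildcard. And just for fun, this time we'll use $ as the escape character.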

select ...
 where '|p_%t!r|'
  like '%|' + REPLACE(REPLACE(REPLACE( 'p_%t!r' ,'$','$$'),'%','$%'),'_','$_') + '|%' escape '$'

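I prefer this approach to escaping because it works in Oracle and MySQL as well as SQL Server. (I usually use the \ backslash as the escape character, since that's the character used in regular expressions. But why be constrained by convention!)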
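As a quick check of the nested REPLACE expression, here is the same predicate applied to an actual column. It is a sketch in Python with sqlite3 standing in for SQL Server, so concatenation is written || and SQLite's LIKE gives no special meaning to brackets; the Tags rows and the matching_names helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tags (Name TEXT)")
conn.executemany(
    "INSERT INTO Tags VALUES (?)", [("pe%ter",), ("p_%t!r",), ("butter",)]
)

# Escape the escape character ($) first, then the % and _ wildcards.
SQL = """
SELECT Name FROM Tags
WHERE ? LIKE ('%|'
              || REPLACE(REPLACE(REPLACE(Name, '$', '$$'), '%', '$%'), '_', '$_')
              || '|%') ESCAPE '$'
ORDER BY Name
"""


def matching_names(delimited_list):
    """Return the Name values that occur as whole terms in the delimited list."""
    return [row[0] for row in conn.execute(SQL, (delimited_list,))]


print(matching_names("|peanut|butter|"))
print(matching_names("|p_%t!r|pe%ter|"))
```

With the escaping in place, 'pe%ter' no longer matches the peanut butter list, and names containing wildcards match only when they appear literally.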

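Those pesky brackets

SQL Server also allows wildcard characters to be treated as literals by enclosing them in brackets []. So we're not done fixing yet, at least for SQL Server. Since pairs of brackets have special meaning, we'll need to escape those as well. If we manage to properly escape the brackets, then at least we won't have to bother with the hyphen - and the caret ^ within the brackets. And we can leave any % and _ characters inside the brackets escaped, since we'll have basically disabled the special meaning of the brackets.

Finding matching pairs of brackets shouldn't be that hard. It's a little more difficult than handling the occurrences of singleton % and _. (Note that it's not sufficient to just escape all occurrences of brackets, because a singleton bracket is considered to be a literal and doesn't need to be escaped. The logic is getting a little fuzzier than I can handle without running more test cases.)

Inline expression gets messy

That inline expression in the SQL is getting longer and uglier. We can probably make it work, but heaven help the poor soul who comes along later and has to decipher it. As much of a fan as I am of inline expressions, I'm inclined not to use one here, mainly because I don't want to have to leave a comment explaining the reason for the mess and apologizing for it.

A function where?

If we don't handle this as an inline expression in the SQL, the closest alternative we have is a user-defined function. And we know that won't speed things up any (unless we can define an index on it, like we could with Oracle). If we've got to create a function, we might do better to do that in the code that calls the SQL statement.

And that function may have some differences in behavior, depending on the DBMS and version. (A shout-out to all you Java developers so keen on being able to use any database engine interchangeably.)

Domain knowledge

We may have specialized knowledge of the domain for the column (that is, the set of allowable values enforced for the column). We may know a priori that the values stored in the column will never contain a percent sign, an underscore, or bracket pairs. In that case, we just include a quick comment that those cases are covered.

The values stored in the column may allow % or _ characters, but a constraint may require those values to be escaped, perhaps using a defined character, so that the values are "safe" for a LIKE comparison. Again, add a quick comment about the allowed set of values, and in particular which character is used as the escape character, and go with Joel Spolsky's approach.

But, absent that specialized knowledge and a guarantee, it's important for us to at least consider handling those obscure corner cases, and to consider whether the behavior is reasonable and "per the specification".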

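Other issues recapitulated

I believe others have already sufficiently pointed out some of the other commonly considered areas of concern:

SQL injection: taking what appears to be user-supplied information and including it in the SQL text rather than supplying it through bind variables. Using bind variables isn't required; it's just one convenient approach to thwart SQL injection, and there are other ways to deal with it.

Optimizer plans that use an index scan rather than index seeks, and the possible need for an expression or function for escaping wildcards (possibly with an index on the expression or function).

Literal values used in place of bind variables, which impacts scalability.

Conclusion

I like Joel Spolsky's approach. It's clever. And it works.

But as soon as I saw it, I immediately saw a potential problem with it, and it's not in my nature to let it slide. I don't mean to be critical of the efforts of others. I know many developers take their work very personally, because they invest so much in it and care so much about it. So please understand, this is not a personal attack. What I'm identifying here is the type of problem that crops up in production rather than in testing.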

2009-05-29 23:18:15

I think this is a case where a static query is just not the way to go. Dynamically build the list for your IN clause, escape your single quotes, and dynamically build the SQL. In this case you probably won't see much of a difference with any method because the list is small, but the most efficient method really is to send the SQL exactly as it is written in your post. I think it is a good habit to write it the most efficient way, rather than to do what makes the prettiest code or to consider dynamically built SQL bad practice.

I have seen the split functions take longer to execute than the queries themselves in many cases where the parameters get large. A stored procedure with table-valued parameters in SQL 2008 is the only other option I would consider, although this will probably be slower in your case. A TVP will probably only be faster for large lists if you are searching on the primary key of the TVP, because SQL will build a temporary table for the list anyway (if the list is large). You won't know for sure unless you test it.

I have also seen stored procedures that had 500 parameters with default values of null, and a WHERE Column1 IN (@Param1, @Param2, @Param3, ..., @Param500). This caused SQL to build a temp table, do a sort/distinct, and then do a table scan instead of an index seek. That is essentially what you would be doing by parameterizing that query, although on a small enough scale that it won't make a noticeable difference. I highly recommend against having NULL in your IN lists, because if the predicate is ever changed to NOT IN it will not act as intended. You could dynamically build the parameter list, but the only obvious thing you would gain is that the objects would escape the single quotes for you. That approach is also slightly slower on the application end, since the objects have to parse the query to find the parameters. It may or may not be faster on SQL, as parameterized queries call sp_prepare, then sp_execute as many times as you execute the query, followed by sp_unprepare.
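The warning about NULL in an IN list can be demonstrated in a few lines. This sketch uses Python's sqlite3 module as a stand-in for SQL Server; the three-valued logic that produces this result is standard SQL behavior, and the table contents and the names helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tags (Name TEXT)")
conn.executemany("INSERT INTO Tags VALUES (?)", [("ruby",), ("rails",), ("python",)])


def names(sql, params):
    """Run the query and return the matching names in sorted order."""
    return sorted(row[0] for row in conn.execute(sql, params))


# The second slot is unused and bound as NULL (None in Python).
in_rows = names("SELECT Name FROM Tags WHERE Name IN (?, ?)", ("ruby", None))
not_in_rows = names("SELECT Name FROM Tags WHERE Name NOT IN (?, ?)", ("ruby", None))

print(in_rows)      # ['ruby']
print(not_in_rows)  # []: every comparison against NULL is unknown, so no row qualifies
```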

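Reusing execution plans for stored procedures or parameterized queries may give you a performance gain, but it locks you in to one execution plan, determined by the first query that is executed. In many cases that may be less than ideal for subsequent queries. In your case, reusing execution plans will probably be a plus, but it might not make any difference at all, since the example is a really simple query.

Cliff notes:

For your case, anything you do, be it parameterization with a fixed number of items in the list (null if not used), dynamically building the query with or without parameters, or using stored procedures with table-valued parameters, will not make much of a difference. However, my general recommendations are as follows:

Your case/simple queries with few parameters:

Dynamic SQL, maybe with parameters if testing shows better performance.

Queries with reusable execution plans, called multiple times by simply changing the parameters, or complicated queries:

SQL with dynamic parameters.

Queries with large lists:

A stored procedure with table-valued parameters. If the list can vary by a large amount, use WITH RECOMPILE on the stored procedure, or simply use dynamic SQL without parameters to generate a new execution plan for each query.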
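As an illustration of the first option mentioned in these notes, parameterization with a fixed number of items (null if not used), here is a minimal sketch. It uses Python's sqlite3 module instead of SQL Server, and select_tags, MAX_TAGS and the sample rows are assumptions for the example. It relies on the list holding between 1 and 5 values, as stated in the question, and it is only suitable for IN, not NOT IN, because of the NULL padding.

```python
import sqlite3

MAX_TAGS = 5  # the question states the list holds between 1 and 5 values

# The SQL text never changes, so the same statement is reused for every call.
SQL = "SELECT Name FROM Tags WHERE Name IN (?, ?, ?, ?, ?) ORDER BY Count DESC"


def select_tags(conn, names):
    names = list(names)
    if not 1 <= len(names) <= MAX_TAGS:
        raise ValueError("expected between 1 and 5 names")
    padded = names + [None] * (MAX_TAGS - len(names))  # unused slots are bound as NULL
    return [row[0] for row in conn.execute(SQL, padded)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tags (Name TEXT PRIMARY KEY, Count INTEGER)")
conn.executemany(
    "INSERT INTO Tags VALUES (?, ?)",
    [("ruby", 120), ("rails", 95), ("python", 300)],
)
print(select_tags(conn, ["rails", "ruby"]))
```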

2010-06-09 20:28:50
