A few examples, just in case:
Inline Table Valued
CREATE FUNCTION MyNS.GetUnshippedOrders()
RETURNS TABLE
AS
RETURN SELECT a.SaleId, a.CustomerID, b.Qty
FROM Sales.Sales a INNER JOIN Sales.SaleDetail b
ON a.SaleId = b.SaleId
INNER JOIN Production.Product c ON b.ProductID = c.ProductID
WHERE a.ShipDate IS NULL
GO
Multi Statement Table Valued
CREATE FUNCTION MyNS.GetLastShipped(@CustomerID INT)
RETURNS @CustomerOrder TABLE
(SaleOrderID INT NOT NULL,
CustomerID INT NOT NULL,
OrderDate DATETIME NOT NULL,
OrderQty INT NOT NULL)
AS
BEGIN
DECLARE @MaxDate DATETIME
SELECT @MaxDate = MAX(OrderDate)
FROM Sales.SalesOrderHeader
WHERE CustomerID = @CustomerID
INSERT @CustomerOrder
SELECT a.SalesOrderID, a.CustomerID, a.OrderDate, b.OrderQty
FROM Sales.SalesOrderHeader a INNER JOIN Sales.SalesOrderDetail b
ON a.SalesOrderID = b.SalesOrderID
INNER JOIN Production.Product c ON b.ProductID = c.ProductID
WHERE a.OrderDate = @MaxDate
AND a.CustomerID = @CustomerID
RETURN
END
GO
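Is there an advantage to using one type (inline or multi-statement) over the other? Are there scenarios where one is better than the other, or are the differences purely syntactical? I realize the two example queries are doing different things, but is there a reason I would write them that way?
I have read about them, but the advantages and differences haven't really been explained.
If you are going to write a query, you can join to your inline table valued function like this: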
SELECT
a.*,b.*
FROM AAAA a
INNER JOIN MyNS.GetUnshippedOrders() b ON a.z=b.z
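It will incur little overhead and run fine.
If you try to use your multi-statement table valued function in a similar query, you will have performance issues: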
SELECT
x.a,x.b,x.c,(SELECT OrderQty FROM MyNS.GetLastShipped(x.CustomerID)) AS Qty
FROM xxxx x
Because the function is executed once for each row returned, the query will run slower and slower as the result set gets larger.
In researching Matt's comment, I have revised my original statement. He is correct: there will be a difference in performance between an inline table valued function (ITVF) and a multi-statement table valued function (MSTVF) even if they both simply execute a SELECT statement. SQL Server will treat an ITVF somewhat like a VIEW in that it will calculate an execution plan using the latest statistics on the tables in question. An MSTVF is equivalent to stuffing the entire contents of your SELECT statement into a table variable and then joining to that. Thus, the compiler cannot use any table statistics on the tables in the MSTVF. So, all things being equal (which they rarely are), the ITVF will perform better than the MSTVF. In my tests, the performance difference in completion time was negligible; from a statistics standpoint, however, it was noticeable.
In your case, the two functions are not functionally equivalent. The MSTVF runs an extra query each time it is called and, most importantly, filters on the customer id. In a large query, the optimizer would not be able to take advantage of other types of joins, because it would need to call the function for each CustomerID passed. However, if you rewrote your MSTVF like so:
CREATE FUNCTION MyNS.GetLastShipped()
RETURNS @CustomerOrder TABLE
(
SaleOrderID INT NOT NULL,
CustomerID INT NOT NULL,
OrderDate DATETIME NOT NULL,
OrderQty INT NOT NULL
)
AS
BEGIN
INSERT @CustomerOrder
SELECT a.SalesOrderID, a.CustomerID, a.OrderDate, b.OrderQty
FROM Sales.SalesOrderHeader a
INNER JOIN Sales.SalesOrderDetail b
ON a.SalesOrderID = b.SalesOrderID
INNER JOIN Production.Product c
ON b.ProductID = c.ProductID
WHERE a.OrderDate = (
Select Max(SH1.OrderDate)
FROM Sales.SalesOrderHeader As SH1
WHERE SH1.CustomerID = A.CustomerId
)
RETURN
END
GO
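In a query, the optimizer would be able to call that function once and build a better execution plan, but it still would not be better than an equivalent, non-parameterized ITVF or a VIEW.
ITVFs should be preferred over MSTVFs when feasible, because the data types, nullability and collation come from the columns in the underlying tables, whereas you have to declare those properties yourself in a multi-statement table valued function and, importantly, you will get better execution plans from the ITVF. In my experience, I have not found many circumstances where an ITVF was a better option than a VIEW, but mileage may vary.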
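To make the preference for ITVFs concrete, here is a minimal sketch of how the parameterized GetLastShipped logic might look as an inline function. The name GetLastShippedInline is hypothetical, and the sketch assumes AdventureWorks-style tables where Sales.SalesOrderDetail holds OrderQty; the table xxxx and its columns are the placeholders from the earlier query.

```sql
-- Hypothetical inline rewrite of GetLastShipped (name and detail table are assumptions).
CREATE FUNCTION MyNS.GetLastShippedInline(@CustomerID INT)
RETURNS TABLE
AS
RETURN
    SELECT h.SalesOrderID, h.CustomerID, h.OrderDate, d.OrderQty
    FROM Sales.SalesOrderHeader AS h
    INNER JOIN Sales.SalesOrderDetail AS d
        ON h.SalesOrderID = d.SalesOrderID
    WHERE h.CustomerID = @CustomerID
      -- Only the orders placed on the customer's most recent order date
      AND h.OrderDate = (SELECT MAX(h2.OrderDate)
                         FROM Sales.SalesOrderHeader AS h2
                         WHERE h2.CustomerID = @CustomerID);
GO

-- Usage: CROSS APPLY passes each outer row's CustomerID to the function.
-- Because the function is inline, its definition can be expanded into the outer plan.
SELECT x.a, x.b, x.c, o.OrderQty
FROM xxxx AS x
CROSS APPLY MyNS.GetLastShippedInline(x.CustomerID) AS o;
```

CROSS APPLY is used here instead of a scalar subquery in the SELECT list because the function can return more than one row per customer, which a scalar subquery would reject.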
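Thanks to Matt.
Addition
Since I saw this come up recently, here is an excellent analysis by Wayne Sheffield comparing the performance difference between inline table valued functions and multi-statement functions.
His original blog post.
Copy on SQL Server Central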
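Internally, SQL Server treats an inline table valued function much like it would a view, and treats a multi-statement table valued function much like it would a stored procedure.
When an inline table valued function is used as part of an outer query, the query processor expands the UDF definition and generates an execution plan that accesses the underlying objects, using the indexes on those objects.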
For a multi-statement table valued function, an execution plan is created for the function itself and stored in the execution plan cache (once the function has been executed the first time). If multi-statement table valued functions are used as part of larger queries then the optimiser does not know what the function returns, and so makes some standard assumptions - in effect it assumes that the function will return a single row, and that the returns of the function will be accessed by using a table scan against a table with a single row.
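Where multi-statement table valued functions return a large number of rows and are joined against in outer queries, they can perform poorly. The performance issues are primarily down to the fact that the optimiser will produce a plan assuming that a single row is returned, which will not necessarily be the most appropriate plan.
As a general rule of thumb, we have found that where possible inline table valued functions should be used in preference to multi-statement ones (when the UDF will be used as part of an outer query) because of these potential performance issues.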
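One way to observe the estimate described above is to compare estimated and actual row counts for a query that joins against the function. This is only a sketch: it assumes the parameterless MyNS.GetLastShipped() from the earlier rewrite and an AdventureWorks-style Sales.Customer table, and the exact estimate you see may depend on the SQL Server version and database compatibility level.

```sql
-- Return the actual plan as an extra result set, including Rows and EstimateRows.
SET STATISTICS PROFILE ON;
GO

-- Assumed objects: parameterless MyNS.GetLastShipped() and Sales.Customer.
SELECT c.CustomerID, ls.SaleOrderID, ls.OrderQty
FROM Sales.Customer AS c
INNER JOIN MyNS.GetLastShipped() AS ls
    ON ls.CustomerID = c.CustomerID;
GO

SET STATISTICS PROFILE OFF;
GO
```

In the extra result set, compare the Rows and EstimateRows columns on the operator that reads the function's result. A large gap between the two is the symptom the answer describes.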
Another case for using a multi-statement function is to keep SQL Server from pushing down the WHERE clause.
For example, I have a table of table names. Some of the names are formatted like C05_2019 and C12_2018, and all tables named that way have the same schema. I wanted to merge all that data into one table, parsing out 05 and 12 into a CompNo column and 2018 and 2019 into a year column. However, there are other tables, like ACA_StupidTable, from which I cannot extract CompNo and CompYr, and I would get a conversion error if I tried. So my query was in two parts: an inner query that returned only the tables formatted like 'C_______', and an outer query that did the substring and int conversion, i.e. Cast(Substring(2, 2) as int) as CompNo. It all looked good, except that SQL Server decided to apply my Cast before the results were filtered, so I got a mind-scrambling conversion error. A multi-statement table function may prevent that from happening, since it is basically a "new" table.
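A minimal sketch of that idea follows, with several assumptions: the function name dbo.GetCompTables is hypothetical, sys.tables stands in for the author's own table of table names, and the LIKE pattern is stricter than the 'C_______' pattern mentioned above. The function body fills a table variable first, so the outer CAST should only ever see names that passed the filter.

```sql
-- Hypothetical MSTVF: materialize only the names that match the C##_#### format.
CREATE FUNCTION dbo.GetCompTables()
RETURNS @CompTables TABLE (TableName sysname NOT NULL)
AS
BEGIN
    INSERT @CompTables (TableName)
    SELECT t.name
    FROM sys.tables AS t          -- assumption: stand-in for the table of table names
    WHERE t.name LIKE 'C[0-9][0-9][_][0-9][0-9][0-9][0-9]';
    RETURN;
END
GO

-- The conversions run against the function's table variable, not the unfiltered source.
SELECT ct.TableName,
       CAST(SUBSTRING(ct.TableName, 2, 2) AS int) AS CompNo,  -- e.g. 'C05_2019' gives 5
       CAST(SUBSTRING(ct.TableName, 5, 4) AS int) AS CompYr   -- e.g. 'C05_2019' gives 2019
FROM dbo.GetCompTables() AS ct;
```

The cost is the one discussed in the other answers: the outer query gets no statistics on the function's result, so this trade-off makes most sense when the returned row count is small.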