Подія
Отримайте сертифікацію в Microsoft Fabric – безкоштовно!
19 лист., 23 - 10 груд., 23
Протягом обмеженого часу команда спільноти Microsoft Fabric пропонує безкоштовні ваучери на іспити DP-600.
Підготуйтеся заразЦей браузер більше не підтримується.
Замініть його на Microsoft Edge, щоб користуватися перевагами найновіших функцій, оновлень безпеки та технічної підтримки.
Applies to: SQL Server 2019 (15.x) Azure SQL Database Azure SQL Managed Instance
This article introduces scalar UDF inlining, a feature under the Intelligent query processing in SQL databases suite of features. This feature improves the performance of queries that invoke scalar UDFs in SQL Server 2019 (15.x) and later versions.
User-Defined Functions (UDFs) that are implemented in Transact-SQL and return a single data value are referred to as T-SQL Scalar User-Defined Functions. T-SQL UDFs are an elegant way to achieve code reuse and modularity across Transact-SQL queries. Some computations (such as complex business rules) are easier to express in imperative UDF form. UDFs help in building up complex logic without requiring expertise in writing complex SQL queries. For more information about UDFs, see Create user-defined functions (Database Engine).
Scalar UDFs typically end up performing poorly due to the following reasons:
Iterative invocation. UDFs are invoked in an iterative manner, once per qualifying tuple. This incurs an extra cost of repeated context switching due to function invocation. Especially, UDFs that execute Transact-SQL queries in their definition are severely affected.
Lack of costing. During optimization, only relational operators are costed, while scalar operators aren't. Before the introduction of scalar UDFs, other scalar operators were generally cheap and didn't require costing. A small CPU cost added for a scalar operation was enough. There are scenarios where the actual cost is significant, and yet still remains underrepresented.
Interpreted execution. UDFs are evaluated as a batch of statements, executed statement-by-statement. Each statement itself is compiled, and the compiled plan is cached. Although this caching strategy saves some time as it avoids recompilations, each statement executes in isolation. No cross-statement optimizations are carried out.
Serial execution. SQL Server doesn't allow intra-query parallelism in queries that invoke UDFs.
The goal of the scalar UDF inlining feature is to improve performance of queries that invoke T-SQL scalar UDFs, where UDF execution is the main bottleneck.
With this new feature, scalar UDFs are automatically transformed into scalar expressions or scalar subqueries that are substituted in the calling query in place of the UDF operator. These expressions and subqueries are then optimized. As a result, the query plan no longer has a user-defined function operator, but its effects are observed in the plan, like views or inline table-valued functions (TVFs).
The examples in this section use the TPC-H benchmark database. For more information, see the TPC-H Homepage.
Consider the following query.
SELECT L_SHIPDATE,
O_SHIPPRIORITY,
SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT))
FROM LINEITEM
INNER JOIN ORDERS
ON O_ORDERKEY = L_ORDERKEY
GROUP BY L_SHIPDATE, O_SHIPPRIORITY
ORDER BY L_SHIPDATE;
This query computes the sum of discounted prices for line items and presents the results grouped by the shipping date and shipping priority. The expression L_EXTENDEDPRICE *(1 - L_DISCOUNT)
is the formula for the discounted price for a given line item. Such formulas can be extracted into functions for the benefit of modularity and reuse.
CREATE FUNCTION dbo.discount_price
(
@price DECIMAL (12, 2),
@discount DECIMAL (12, 2)
)
RETURNS DECIMAL (12, 2)
AS
BEGIN
RETURN @price * (1 - @discount);
END
Now the query can be modified to invoke this UDF.
SELECT L_SHIPDATE,
O_SHIPPRIORITY,
SUM(dbo.discount_price(L_EXTENDEDPRICE, L_DISCOUNT))
FROM LINEITEM
INNER JOIN ORDERS
ON O_ORDERKEY = L_ORDERKEY
GROUP BY L_SHIPDATE, O_SHIPPRIORITY
ORDER BY L_SHIPDATE;
The query with the UDF performs poorly, due to the reasons outlined previously. With scalar UDF inlining, the scalar expression in the body of the UDF is substituted directly in the query. The results of running this query are shown in the following table:
Query: | Query without UDF | Query with UDF (without inlining) | Query with scalar UDF inlining |
---|---|---|---|
Execution time: | 1.6 seconds | 29 minutes 11 seconds | 1.6 seconds |
These numbers are based on a 10-GB CCI database (using the TPC-H schema), running on a machine with dual processor (12 core), 96-GB RAM, backed by SSD. The numbers include compilation and execution time with a cold procedure cache and buffer pool. The default configuration was used, and no other indexes were created.
Scalar UDFs that are implemented using multiple T-SQL statements such as variable assignments and conditional branching can also be inlined. Consider the following scalar UDF that, given a customer key, determines the service category for that customer. It arrives at the category by first computing the total price of all orders placed by the customer using a SQL query. Then, it uses an IF (...) ELSE
logic to decide the category based on the total price.
CREATE OR ALTER FUNCTION dbo.customer_category (@ckey INT)
RETURNS CHAR (10)
AS
BEGIN
DECLARE @total_price AS DECIMAL (18, 2);
DECLARE @category AS CHAR (10);
SELECT @total_price = SUM(O_TOTALPRICE)
FROM ORDERS
WHERE O_CUSTKEY = @ckey;
IF @total_price < 500000
SET @category = 'REGULAR';
ELSE
IF @total_price < 1000000
SET @category = 'GOLD';
ELSE
SET @category = 'PLATINUM';
RETURN @category;
END
Now, consider a query that invokes this UDF.
SELECT C_NAME,
dbo.customer_category(C_CUSTKEY)
FROM CUSTOMER;
The execution plan for this query in SQL Server 2017 (14.x) (compatibility level 140 and earlier) is as follows:
As the plan shows, SQL Server adopts a simple strategy here: for every tuple in the CUSTOMER
table, invoke the UDF and output the results. This strategy is naive and inefficient. With inlining, such UDFs are transformed into equivalent scalar subqueries, which are substituted in the calling query in place of the UDF.
For the same query, the plan with the UDF inlined looks as follows.
As mentioned earlier, the query plan no longer has a user-defined function operator, but its effects are now observable in the plan, like views or inline TVFs. Here are some key observations from the previous plan:
SQL Server infers the implicit join between CUSTOMER
and ORDERS
and makes it explicit via a join operator.
SQL Server also infers the implicit GROUP BY O_CUSTKEY on ORDERS
and uses the IndexSpool + StreamAggregate to implement it.
SQL Server is now using parallelism across all operators.
Depending upon the complexity of the logic in the UDF, the resulting query plan might also get bigger and more complex. As we can see, the operations inside the UDF are now no longer opaque, and so the query optimizer is able to cost and optimize those operations. Also, since the UDF is no longer in the plan, iterative UDF invocation is replaced by a plan that completely avoids function call overhead.
A scalar T-SQL UDF can be inlined if the function definition uses allowed constructs, and the function is used in a context that enables inlining:
All of the following conditions of the UDF definition must be true:
DECLARE
, SET
: Variable declaration and assignments.SELECT
: SQL query with single/multiple variable assignments 1.IF
/ELSE
: Branching with arbitrary levels of nesting.RETURN
: Single or multiple return statements. Starting with SQL Server 2019 (15.x) CU5, the UDF can only contain a single RETURN statement to be considered for inlining 6.UDF
: Nested/recursive function calls 2.EXISTS
, IS NULL
.GETDATE()
) or has side effects 3 (such as NEWSEQUENTIALID()
).EXECUTE AS CALLER
clause (default behavior if the EXECUTE AS
clause isn't specified).@@ROWCOUNT
) 4.OBJECT_ID
) 4.ORDER BY
without a TOP 1
clause 5.ORDER BY
clause (such as SELECT @x = @x + 1 FROM table1 ORDER BY col1
) 5.STRING_AGG
function 6.WITH XMLNAMESPACES
8.1 SELECT
with variable accumulation/aggregation isn't supported for inlining (such as SELECT @val += col1 FROM table1
).
2 Recursive UDFs are inlined to a certain depth only.
3 Intrinsic functions whose results depend upon the current system time are time-dependent. An intrinsic function that might update some internal global state is an example of a function with side effects. Such functions return different results each time they're called, based on the internal state.
4 Restriction added in SQL Server 2019 (15.x) CU 2
5 Restriction added in SQL Server 2019 (15.x) CU 4
6 Restriction added in SQL Server 2019 (15.x) CU 5
7 Restriction added in SQL Server 2019 (15.x) CU 6
8 Restriction added in SQL Server 2019 (15.x) CU 11
9 Because signatures could be added and dropped after a UDF is created, the decision whether to inline is done when the query referencing a scalar UDF is compiled. For example, system functions are typically signed with a certificate. You can use sys.crypt_properties to find which objects are signed.
All of the following requirement of the execution context must be true:
ORDER BY
clause.GROUP BY
clause.DISTINCT
clause doesn't have an ORDER BY
clause.GROUPING SETS
, CUBE
, or ROLLUP
2.SELECT @y = 2
, @x = UDF(@y)
) 2.1 Restriction added in SQL Server 2019 (15.x) CU 5
2 Restriction added in SQL Server 2019 (15.x) CU 6
3 Restriction added in SQL Server 2019 (15.x) CU 11
For information on the latest T-SQL scalar UDF inlining fixes and changes to inlining eligibility scenarios, see the Knowledge Base article: FIX: scalar UDF inlining issues in SQL Server 2019.
For every T-SQL scalar UDF, the sys.sql_modules catalog view includes a property called is_inlineable
, which indicates whether a UDF is inlineable.
The is_inlineable
property is derived from the constructs found inside the UDF definition. It doesn't check whether the UDF is in fact inlineable at compile time. For more information, see the conditions for inlining.
A value of 1
indicates that the UDF is inlineable, and 0
indicates otherwise. This property has a value of 1
for all inline TVFs as well. For all other modules, the value is 0
.
If a scalar UDF is inlineable, it doesn't imply that it's always inlined. SQL Server decides (on a per-query, per-UDF basis) whether to inline a UDF. Refer to the lists of requirements earlier in this article.
SELECT *
FROM sys.crypt_properties AS cp
INNER JOIN sys.objects AS o
ON cp.major_id = o.object_id;
If all the preconditions are satisfied and SQL Server decides to perform inlining, it transforms the UDF into a relational expression. From the query plan, you can figure out whether inlining occurred:
<UserDefinedFunction>
XML node for a UDF that is inlined successfully.You can make workloads automatically eligible for scalar UDF inlining by enabling compatibility level 150 for the database. You can set this using Transact-SQL. For example:
ALTER DATABASE [WideWorldImportersDW]
SET COMPATIBILITY_LEVEL = 150;
Apart from this step, there are no other changes required to be made to UDFs or queries to take advantage of this feature.
Scalar UDF inlining can be disabled at the database, statement, or UDF scope while still maintaining database compatibility level 150 and higher. To disable scalar UDF inlining at the database scope, execute the following statement within the context of the applicable database:
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = OFF;
To re-enable scalar UDF inlining for the database, execute the following statement within the context of the applicable database:
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = ON;
When ON
, this setting appears as enabled in sys.database_scoped_configurations.
You can also disable scalar UDF inlining for a specific query by designating DISABLE_TSQL_SCALAR_UDF_INLINING
as a USE HINT
query hint.
A USE HINT
query hint takes precedence over the database scoped configuration or compatibility level setting.
For example:
SELECT L_SHIPDATE,
O_SHIPPRIORITY,
SUM(dbo.discount_price(L_EXTENDEDPRICE, L_DISCOUNT))
FROM LINEITEM
INNER JOIN ORDERS
ON O_ORDERKEY = L_ORDERKEY
GROUP BY L_SHIPDATE, O_SHIPPRIORITY
ORDER BY L_SHIPDATE
OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'));
Scalar UDF inlining can also be disabled for a specific UDF using the INLINE clause in the CREATE FUNCTION
or ALTER FUNCTION
statement.
For example:
CREATE OR ALTER FUNCTION dbo.discount_price
(
@price DECIMAL (12, 2),
@discount DECIMAL (12, 2)
)
RETURNS DECIMAL (12, 2)
WITH INLINE = OFF
AS
BEGIN
RETURN @price * (1 - @discount);
END
Once the previous statement is executed, this UDF is never inlined into any query that invokes it. To re-enable inlining for this UDF, execute the following statement:
CREATE OR ALTER FUNCTION dbo.discount_price
(
@price DECIMAL (12, 2),
@discount DECIMAL (12, 2)
)
RETURNS DECIMAL (12, 2)
WITH INLINE = ON
AS
BEGIN
RETURN @price * (1 - @discount);
END
The INLINE
clause isn't mandatory. If the INLINE
clause isn't specified, it's automatically set to ON
/OFF
based on whether the UDF can be inlined. If INLINE = ON
is specified but the UDF is found ineligible for inlining, an error is thrown.
As described in this article, scalar UDF inlining transforms a query with scalar UDFs into a query with an equivalent scalar subquery. Because of this transformation, you might notice some differences in behavior in the following scenarios:
Inlining results in a different query hash for the same query text.
Certain warnings in statements inside the UDF (such as divide by zero, etc.) that might be previously hidden, can show up due to inlining.
Query level join hints might not be valid anymore, as inlining can introduce new joins. Local join hints must be used instead.
Views that reference inline scalar UDFs can't be indexed. If you need to create an index on such views, disable inlining for the referenced UDFs.
There might be some differences in the behavior of Dynamic data masking with UDF inlining.
In certain situations (depending upon the logic in the UDF), inlining might be more conservative with respect to masking output columns. In scenarios where the columns referenced in a UDF aren't output columns, they aren't masked.
If a UDF references built-in functions such as SCOPE_IDENTITY()
, @@ROWCOUNT
, or @@ERROR
, the value returned by the built-in function changes with inlining. This change in behavior is because inlining changes the scope of statements inside the UDF. Starting with SQL Server 2019 (15.x) CU2, inlining is blocked if the UDF references certain intrinsic functions (for example @@ROWCOUNT
).
If a variable is assigned with the result of an inlined UDF and it also used as index_column_name
in FORCESEEK
Query hints, it results in error 8622, indicating that the query processor couldn't produce a query plan because of the hints defined in the query.
Подія
Отримайте сертифікацію в Microsoft Fabric – безкоштовно!
19 лист., 23 - 10 груд., 23
Протягом обмеженого часу команда спільноти Microsoft Fabric пропонує безкоштовні ваучери на іспити DP-600.
Підготуйтеся заразНавчання
Модуль
Introduction to SQL Server 2022 T-SQL enhancements - Training
SQL Server 2022 introduces a range of new features and enhancements, providing powerful tools and capabilities that optimize performance and offer better control on SQL objects manipulation and creation.