Subscribe to Windows IT Pro
January 26, 2012 11:18 AM

SQL Server 2012: How to Write T-SQL Windows Functions, Part 3

Optimization for improved performance
SQL Server Pro
InstantDoc ID #141036
Rating: (1)

This article is the last in a three-part series about SQL Server 2012's enhanced window functions. In the first two articles in the series, I focused on the logical aspects of window functions. I described window aggregate, offset, and distribution functions. In this final article, I focus on optimization of window functions. I provide indexing guidelines for optimal performance, and I describe cases that fall under the "fast-track" category, with especially optimal treatment; cases that compute the difference between two cumulative values; cases that expand all frame rows; and cases in which the optimizer can use an in-memory spool versus an on-disk spool.

As a reminder, you need to use SQL Server 2012 (CTP3 or later) to run the sample code from this series. Click the link to  download SQL Server 2012 from .

For the purposes of this article, I use bank account transactions as sample data. To create the sample data, you need a helper table called GetNums that accepts two integers as inputs and returns a sequence of integers in the requested range. Run the code in Listing 1 to create the GetNums function in the tempdb database (for test purposes). Run the code in Listing 2 to create the Accounts and Transactions tables and fill them with sample data (100 accounts ´ 20,000 transactions in each account).

Let's start our discussion of window function optimization with indexing guidelines.

Indexing Guidelines

To compute the enhanced aggregate window functions, the optimizer needs the rows sorted first by the window partitioning elements, then by the window ordering elements. Also, the index needs to include in its leaf level all the rest of the columns in the query for coverage purposes. If the rows are ordered as needed and covered by an index, the optimizer can scan the index in order, use a Segment operator to handle one partition at a time, a Window Spool operator to expand the applicable frame rows per source row, and a Stream Aggregate operator to perform the actual aggregate calculations.

A simple way to remember what kind of index you need to create is to use the acronym POC for Partitioning, Ordering, and Coverage. The P and O columns need to be defined as the index key elements. The C columns need to be included in the index leaf, using an INCLUDE clause in a nonclustered index. (If the index is clustered, everything is implicitly covered.) In the absence of a POC index, the optimizer must sort the rows based on the P and O elements -- and sorting a large set can be expensive.

As an example, supposed that you need to compute bank account balances at each transaction point. This task can be achieved with a running sum of the transaction values, partitioned by the account ID and ordered by the transaction ID. A POC index already exists; that's the clustered index defined by the primary key constraint PK_Transactions in Listing 2 as part of the definition of the Transactions table.

Here's the query computing the bank account balances:

SELECT actid, tranid, val,

SUM(val) OVER(PARTITION BY actid

ORDER BY tranid

ROWS UNBOUNDED PRECEDING) AS runval

FROM dbo.Transactions; 

Figure 1 shows the plan for this query. Observe in the plan that the clustered index PK_Transactions is scanned in an ordered fashion, and therefore no sorting is required.

Figure 1: Plan with POC index
Figure 1: Plan with POC index
 

To contrast, the following query computes a window function that doesn't have a POC index to support it:

SELECT actid, tranid, val,

SUM(val) OVER(PARTITION BY actid

ORDER BY val

ROWS UNBOUNDED PRECEDING) AS runval

FROM dbo.Transactions; 

Figure 2 shows the plan for this query. Observe that in this plan the optimizer added a Sort operator that sorts the data by the P (actid) and O (val) elements.

Figure 2: Plan without POC index
Figure 2: Plan without POC index
 

Fast Track

Conceptually, for each row in the underlying query, a window function defines a frame of rows within the current window partition and applies the calculation against this frame to produce a scalar outcome. For example, consider a frame such as ROWS BETWEEN 11 PRECEDING AND CURRENT ROW. For each row, the window frame will have up to 12 applicable rows. SQL Server 2012 introduces a Window Spool operator that's responsible for expanding for each row the applicable frame of rows; then, a Stream Aggregate operator aggregates those rows to produce the scalar result. In the worst-case scenario, the Window Spool operator expands all frame rows. I discuss such cases later in the article. However, the optimizer can identify special cases and avoid such complete expansion.

The best-case scenario is what I refer to as the fast-track case. This case is applicable when the top (i.e., first) delimiter of the frame is UNBOUNDED PRECEDING. With such a top delimiter, conceptually, the number of rows per frame should keep increasing by one for each underlying row; so the total number of rows in all frames in the partition increases in a quadratic manner with respect to the number of underlying rows. But that's only conceptually. In practice, the optimizer keeps just two rows in each frame -- one with the aggregated values so far (apply the calculation between the previously aggregated values and the current value) and the current row (needed to return the detail elements). That's a huge savings!

Here's an example of a query that falls under the fast-track category because it uses UNBOUNDED PRECEDING as the top delimiter of the frame:

SELECT actid, tranid, val,

SUM(val) OVER(PARTITION BY actid

ORDER BY tranid

ROWS UNBOUNDED PRECEDING) AS runval

FROM dbo.Transactions; 

This query ran for 7 seconds on my system. Figure 3 shows the execution plan for the query.

Figure 3: Plan for fast-track case
Figure 3: Plan for fast-track case
 

 

Related Content:

ARTICLE TOOLS

Comments
  • David
    1 month ago
    Apr 12, 2012

    So, if I have a bunch of existing queries that use the "old fashioned" sub-query approach to compute, say, a running total, can I expect to see an optimal query-plan using the "window spool" operator (assuming I have the correct indexes defined)?

You must log on before posting a comment.

Are you a new visitor? Register Here

advertisement

advertisement

Windows is a trademark of the Microsoft group of companies. Windows IT Pro is used by Penton Media Inc. under license from owner.