Blog: How Fast is FAST?

During an online presentation I listened to a few days ago, the speaker suggested using one of my favorite, underused query hints: FAST N. This replaces an older table hint called FASTFIRSTROW, which is one of the few deprecated features to actually be removed from SQL Server. The speaker suggested using the FAST 1 hint because 'sometimes it improves the query', but didn't say any more about how or why it would improve the query, or why you might NOT want to use it. So, I'm going to tell you what this hint does.

When I first heard about the FASTFIRSTROW hint, years ago, it was described as a way to force SQL Server to use a nonclustered index to avoid a sort. It, and its descendant FAST N, can certainly have that result, but that's not what they actually "do". I'm going to assume you're relatively familiar with the way SQL Server nonclustered indexes are organized. There is a leaf level that contains index rows for all the index keys, including all duplicate values. The rows are sorted by the index key. Each index row has a pointer of some sort to the actual row of data in the table. (The rest of the details are out of scope for this post.)

So suppose you have a nonclustered index on first name. The leaf level would contain all the first name values in order: Ann, Anne, Bjorn, Bob, Boris, Charlie, Chuck, Dan, Dmitry, Elena, etc. If there were multiple table rows with the first name Boris, there would be multiple index rows in the leaf level of the nonclustered index with Boris.
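To make the example concrete, such a table and index might look like this (the table and column names here are hypothetical, just for illustration):

```sql
-- Hypothetical table for the example; the names are illustrative,
-- not from any real database.
CREATE TABLE dbo.customers
(
    CustomerID int IDENTITY PRIMARY KEY,
    FirstName  nvarchar(50) NOT NULL,
    LastName   nvarchar(50) NOT NULL
);

-- Nonclustered index: the leaf level stores every FirstName value in
-- sorted order, and each index row carries a pointer back to the
-- corresponding table row.
CREATE NONCLUSTERED INDEX IX_customers_FirstName
    ON dbo.customers (FirstName);
```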

Now suppose someone runs a query like this:

SELECT * FROM customers
ORDER BY FirstName;

How could this query be satisfied? There are two ways. One way would be to read all the rows from the table and then sort them by the FirstName column. The second way would be to scan the leaf level of the nonclustered index, in which the names are already sorted, and for each index row, follow the pointer to the row in the table. The first solution is faster in general, because SQL Server has some very efficient internal sorting algorithms, and following a pointer for every single row in the table is usually much too expensive. But there is a case where the second solution might be advantageous.

The SORT operator is considered a 'blocking operator' in a query plan because it has to consume all of its input, and complete the sorting, before any output can be produced. Another name for a blocking operator is "stop-and-go". The alternative to a blocking operator is sometimes called a "flow-through" operator. Flow-through operators read rows, process them, and either send them on to the next operator or discard them if they don't meet the right conditions. So they are basically consuming rows and producing rows concurrently, and the rows are flowing through the operator.

For more information about blocking vs. non-blocking operators, see Craig Freedman's excellent post here: https://blogs.msdn.microsoft.com/craigfr/2006/06/19/properties-of-iterators/

The important fact here is that NO output can be produced until all the input is consumed and the SORT is completed.

If you have an application that can display values as soon as SQL Server returns them, it can potentially be useful to start returning something as soon as possible. How do you measure the performance of your queries? How does an end-user perceive the performance? Is it the total time to complete a query, or is it the time to get any output back (i.e. "time to first row")? If SQL Server could follow the leaf level of the nonclustered index, it would have the first value in sorted order immediately. And the next few values could be returned quickly, too. The total time to return ALL the rows might be slower than doing the sort, as SQL Server followed the pointer from every single index row, but there would be no waiting to start getting output from the query. This is where the hint FASTFIRSTROW came in handy. It told SQL Server to come up with a plan that would return the first row as quickly as possible, no matter how much time it took to return ALL the rows. And frequently, this meant using a nonclustered index to access all the rows in the table.
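For reference, FASTFIRSTROW was a table hint, so it went in the FROM clause. Using the hypothetical customers table from the earlier example (and note this syntax no longer runs on current versions of SQL Server, since the hint has been removed):

```sql
-- Old, now-removed syntax: FASTFIRSTROW was a table hint.
-- On current versions of SQL Server this raises a syntax error.
SELECT * FROM customers WITH (FASTFIRSTROW)
ORDER BY FirstName;
```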

The FASTFIRSTROW table hint was deprecated and replaced by the FAST N query hint. This hint is similar, but more flexible. With this new HINT, you are basically telling the optimizer to assume only N rows will be returned. In most cases this means the optimizer will assume a rowcount of N for one or more operators in the plan. If N is one, this is synonymous with FASTFIRSTROW. So is a query with the FAST hint faster than not having it? Should you use it?
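To make the new syntax concrete: FAST N is a query-level hint, so it goes in an OPTION clause at the end of the statement. (The customers table here is the hypothetical one from the earlier example.)

```sql
-- Equivalent to the old FASTFIRSTROW behavior: optimize the plan
-- for returning the first row as quickly as possible.
SELECT * FROM customers
ORDER BY FirstName
OPTION (FAST 1);

-- More flexible than the old hint: optimize the plan for returning
-- the first 50 rows quickly.
SELECT * FROM customers
ORDER BY FirstName
OPTION (FAST 50);
```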

The only way to tell is to test it, of course.

But there is a caveat I want to point out. If you want to see if using the FAST hint is faster, you can’t rely on your estimated execution plans. In the AdventureWorks2014 database, there is a nonclustered index on the ProductID column in the Sales.SalesOrderDetail table.  Look at the estimated plans for two queries that sort by ProductID. The second one uses the FAST 1 hint:

SELECT * FROM Sales.SalesOrderDetail
ORDER BY ProductID;
GO

SELECT * FROM Sales.SalesOrderDetail
ORDER BY ProductID
OPTION (FAST 1);
GO

I get the plan shown here:

[Figure: estimated execution plans for the two sorting queries]

If I look at the Query Cost relative to the batch, it looks like the query with the hint is infinitely faster than the query without it (0% compared to 100%). But this is obviously not the case. If I look at the details of the Index Scan operator in the second plan, I see that the way the optimizer determined the best plan for returning one row was to assume there is only one row in the table. The Estimated Number of Rows for the Index Scan in the second query is 1! So the cost comparison is saying that a query that only had to sort one row would be almost infinitely faster than a query that needed to sort 121,000 rows. But this is only the estimated plan. When the queries are executed, all 121,000 rows will be processed by both of them, and it will usually turn out that the second one is not as fast.
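If you want to compare the actual, rather than estimated, cost of the two queries, one simple approach is to run them with SET STATISTICS TIME ON, which reports the real CPU and elapsed time for each statement:

```sql
-- Compare actual execution times, not estimated costs.
SET STATISTICS TIME ON;
GO

SELECT * FROM Sales.SalesOrderDetail
ORDER BY ProductID;
GO

SELECT * FROM Sales.SalesOrderDetail
ORDER BY ProductID
OPTION (FAST 1);
GO

SET STATISTICS TIME OFF;
GO
```

You can also capture the actual execution plans, where the Actual Number of Rows will show that both queries really do process every row in the table.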

Even though using a hint called FAST does sound appealing, it is not a magic button. In some cases, it can cause the optimizer to come up with a better or faster plan, and in some cases, it will do just the opposite.