Words: 1344
Time to read: ~ 7 minutes
Welcome to the April edition of T-SQL Tuesday. This month we have Hamish Watson ( Blog | Twitter ) asking us about how we unit test our databases.
Performance Tuning
At one stage or another, I’m sure that we’ve all been asked to performance tune database code. Sometimes this tuning can involve changing the code. How do we know that the changes that we are introducing isn’t going to introduce bugs? You got it – Unit tests!
Set up
Let’s set up a contrived example – Contrived of course coming from the words “con” as in fake and “Tried V” as in I tried 5 times before I just gave up and used what I had.
USE [tempdb];
GO
/*
Let's build this thang...
*/
CREATE TABLE [dbo].[Objects]
(
[id] int NOT NULL CONSTRAINT [PK dbo.Objects(id)] PRIMARY KEY,
[level] int NOT NULL,
[object_type] int NOT NULL,
[tag] nvarchar(256) NOT NULL,
[description] nvarchar(max) NOT NULL,
[parent_object_id] int NULL
);
GO
INSERT INTO [dbo].[Objects]
(
[id],
[level],
[object_type],
[tag],
[description],
[parent_object_id]
)
SELECT
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS [id],
NTILE(10000) OVER (ORDER BY [S01].[alias]) AS [level],
[M01].[severity] AS [object_type],
[S01].[alias] AS [tag],
[M01].[text] AS [description],
NULLIF(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1, 0) AS [parent_object_id]
FROM [sys].[messages] AS [M01]
INNER JOIN [sys].[syslanguages] AS [S01] ON [M01].[language_id] = [S01].[lcid]
WHERE [S01].[alias] = N'English';
GO
SELECT *
FROM [dbo].[Objects];
GO
This will insert 11,466 records into the dbo.Objects
table. The table has a column called parent_object_id
that points to its … parent object id, wow you have been paying attention! Luckily, or unluckily, tuning the schema of the table is out of scope for this post
Next we have the piece of code that we’ve been asked to performance tune. It’s …uhm … well it works.
/*
RBAR SQL
*/
CREATE FUNCTION [dbo].[GetChildObjects]
(
@ObjectId int
)
RETURNS @ChildObjects TABLE
(
[object_id] int NOT NULL,
[level] int NOT NULL,
[object_type] int NOT NULL,
[tag] nvarchar(256) NOT NULL,
[description] nvarchar(max) NOT NULL
)
AS
BEGIN
--
-- Insert first object.
--
INSERT INTO @ChildObjects
SELECT
id,
level,
object_type,
tag,
description
FROM dbo.Objects
WHERE id = @ObjectId;
--
-- Stack
--
DECLARE @Stack TABLE
(
ObjectId int NOT NULL,
lvl int NOT NULL
);
--
-- Initialise level for stack
--
DECLARE @lvl int = 1,
@CurrentObjectId INT = NULL;
--
-- Insert current object
--
INSERT INTO @Stack
(
ObjectId,
lvl
)
VALUES
(@ObjectId, @lvl);
--
-- Loop through stack and get child objects
--
WHILE (@lvl > 0)
BEGIN
IF EXISTS (SELECT * FROM @Stack WHERE lvl = @lvl)
BEGIN
--
-- Get object
--
SELECT @CurrentObjectId = ObjectId
FROM @Stack
WHERE lvl = @lvl;
--
-- Insert child objects
--
INSERT INTO @ChildObjects
SELECT
id,
level,
object_type,
tag,
description
FROM dbo.Objects
WHERE parent_object_id = @CurrentObjectId;
--
-- Delete object from stack
--
DELETE FROM @Stack
WHERE lvl = @lvl
AND ObjectId = @CurrentObjectId;
--
-- Insert the childnodes of the current node into the stack
--
INSERT INTO @Stack
(
ObjectId,
lvl
)
SELECT
id,
@lvl + 1
FROM dbo.Objects
WHERE parent_object_id = @CurrentObjectId;
IF (@@ROWCOUNT > 0) -- If the previous statement added one or more nodes, go down for its first child.
SET @lvl = @lvl + 1; -- If no nodes are added, check its sibling nodes
END;
ELSE
SET @lvl = @lvl - 1; -- Back to the level immediately above
END;
RETURN;
END;
GO
Timing is important
Let’s get some times for this function. There’s nothing fancy here, we’re just going to call it for different values, returning more and more results and see how long it takes to finish.
/*
Simple tezt...
*/
SELECT * FROM [dbo].[GetChildObjects](11466) OPTION (RECOMPILE); -- 0 seconds.
SELECT * FROM [dbo].[GetChildObjects](11000) OPTION (RECOMPILE); -- 3 seconds.
SELECT * FROM [dbo].[GetChildObjects](10000) OPTION (RECOMPILE); -- 10 seconds.
SELECT * FROM [dbo].[GetChildObjects](9000) OPTION (RECOMPILE); -- 17 seconds.
SELECT * FROM [dbo].[GetChildObjects](8000) OPTION (RECOMPILE); -- 23 seconds.
SELECT * FROM [dbo].[GetChildObjects](7000) OPTION (RECOMPILE); -- 29 seconds.
SELECT * FROM [dbo].[GetChildObjects](6000) OPTION (RECOMPILE); -- 36 seconds.
SELECT * FROM [dbo].[GetChildObjects](5000) OPTION (RECOMPILE); -- 41 seconds.
SELECT * FROM [dbo].[GetChildObjects](4000) OPTION (RECOMPILE); -- 48 seconds.
SELECT * FROM [dbo].[GetChildObjects](3000) OPTION (RECOMPILE); -- 54 seconds.
SELECT * FROM [dbo].[GetChildObjects](2000) OPTION (RECOMPILE); -- 61 seconds.
SELECT * FROM [dbo].[GetChildObjects](1000) OPTION (RECOMPILE); -- 68 seconds.
SELECT * FROM [dbo].[GetChildObjects](10) OPTION (RECOMPILE); -- 76 seconds.
Taking from 0 seconds for 1 row to 76 seconds for 11,456 rows, we now our timings.
As you can see there has been some performance tuning efforts by the code creators already. So impressed was I by the use of OPTION (RECOMPILE)
that I asked why they asked me to take a look.
The execution plan seems fine; doesn’t look like there’s anything we can do with it…
Well, let’s take a look at the execution plan, shall we?

No surprises
Listen, we both know that this needs to be rewritten. Here is the crux of the post – how do we know that our re-write isn’t going to introduce unexpected side-effects? Little happy “unspecified-features“? You got it – Unit tests!
Creating Unit Tests
To keep this brief, I’ve installed tSQLt on my tempdb database. The documentation for installing it is very easy to follow on the tSQLt website so head there for instructions. We’re going to create some tests for this function.
/*
tSQLt time!
*/
EXECUTE [tSQLt].[NewTestClass] @ClassName = N'GetChildObjects';
GO
CREATE PROCEDURE [GetChildObjects].[test returns nothing for nothing]
AS
BEGIN
-- Arrange (new "fake" table).
EXECUTE [tSQLt].[FakeTable] @TableName = N'[dbo].[Objects]';
-- Act
SELECT
*
INTO #Actual
FROM [dbo].[GetChildObjects](1);
-- Assert (empty is empty).
EXECUTE [tSQLt].[AssertEmptyTable] @TableName = N'#Actual'
END;
GO
CREATE PROCEDURE [GetChildObjects].[test returns 1 row if no child objects]
AS
BEGIN
-- Arrange (new "fake" table).
EXECUTE [tSQLt].[FakeTable] @TableName = N'[dbo].[Objects]';
INSERT INTO [dbo].[Objects]
(
[id],
[level],
[object_type],
[tag],
[description],
[parent_object_id]
)
VALUES
(
1,
1,
1,
N'Don''t care',
N'Absolutely don''t care',
NULL
);
-- Act
SELECT
*
INTO #Actual
FROM [dbo].[GetChildObjects](1);
SELECT
1 AS [object_id],
1 AS [level],
1 AS [object_type],
N'Don''t care' AS [tag],
N'Absolutely don''t care' AS [description]
INTO #Expected
EXECUTE [tSQLt].[AssertEqualsTable] @Expected = N'#Expected', @Actual = N'#Actual';
END;
GO
CREATE PROCEDURE [GetChildObjects].[test returns multiple rows...say 4]
AS
BEGIN
-- Arrange (new "fake" table).
EXECUTE [tSQLt].[FakeTable] @TableName = N'[dbo].[Objects]';
INSERT INTO [dbo].[Objects]
(
[id],
[level],
[object_type],
[tag],
[description],
[parent_object_id]
)
SELECT
[V].[number],
[V].[number],
[V].[number],
N'Don''t care '+ CONVERT(varchar(2), [V].[number]),
N'Absolutely don''t care '+ CONVERT(varchar(2), [V].[number]),
NULLIF([V].[number] -1, 0)
FROM [master].[dbo].[spt_values] AS [V]
WHERE [V].[type] = 'P'
AND [V].[number] > 0
AND [V].[number] <= 10
ORDER BY [V].[number]
-- Act
SELECT
*
INTO #Actual
FROM [dbo].[GetChildObjects](7);
SELECT
[id] AS [object_id],
[level],
[object_type],
[tag],
[description]
INTO #Expected
FROM [dbo].[Objects]
WHERE [id] >= 7
ORDER BY [id];
EXECUTE [tSQLt].[AssertEqualsTable] @Expected = N'#Expected', @Actual = N'#Actual';
END;
GO
Now we have tests to see what happens when the table is empty, when only 1 row should be returned, and when multiple rows are supposed to be returned. Hey! 4 rows still counts as multiple!
Running these tests couldn’t be simpler:
EXECUTE [tSQLt].[Run] @TestName = N'GetChildObjects';
GO

Re-Write time
The best piece of advice I heard for performance tuning, especially with a hectic schedule, is to
- Have a “good enough” goal and have a “time limit”
If the code has to run under a minute and you get it to 51 seconds after 30 minutes, great! Stop – I’m sure you’ve a to-do list that is only getting bigger.
/*
All your bases are belonging to SET!!!
*/
DROP FUNCTION [dbo].[GetChildObjects]; --Don't worry, it's in source control.
GO
CREATE FUNCTION [dbo].[GetChildObjects]
(
@ObjectId int
)
RETURNS TABLE
AS
RETURN
(
WITH [Stack] AS
(
SELECT
[O].[id] AS [object_id],
[O].[parent_object_id]
FROM [dbo].[Objects] AS [O]
WHERE [O].[id] = @ObjectId
UNION ALL
SELECT
[Obj].[id],
[Obj].[parent_object_id]
FROM [Stack]
INNER JOIN [dbo].[Objects] AS [Obj] ON [Stack].[object_id] = [Obj].[parent_object_id]
)
SELECT
[Det].[id] AS [object_id],
[Det].[level],
[Det].[object_type],
[Det].[tag],
[Det].[description]
FROM [Stack] AS [S]
INNER JOIN [dbo].[Objects] AS [Det] ON [S].[object_id] = [Det].[id]
);
GO
Wait! How do you… oh yeah. Unit Tests.
EXECUTE [tSQLt].[Run] @TestName = N'GetChildObjects';
GO

Perfect, now we can check out the timings!
SELECT * FROM [dbo].[GetChildObjects](11466) OPTION (RECOMPILE, MAXRECURSION 0); -- 0 seconds.
SELECT * FROM [dbo].[GetChildObjects](11000) OPTION (RECOMPILE, MAXRECURSION 0); -- 1 second.
SELECT * FROM [dbo].[GetChildObjects](10000) OPTION (RECOMPILE, MAXRECURSION 0); -- 3 seconds.
SELECT * FROM [dbo].[GetChildObjects](9000) OPTION (RECOMPILE, MAXRECURSION 0); -- 6 seconds.
SELECT * FROM [dbo].[GetChildObjects](8000) OPTION (RECOMPILE, MAXRECURSION 0); -- 8 seconds.
SELECT * FROM [dbo].[GetChildObjects](7000) OPTION (RECOMPILE, MAXRECURSION 0); -- 13 seconds.
SELECT * FROM [dbo].[GetChildObjects](6000) OPTION (RECOMPILE, MAXRECURSION 0); -- 15 seconds.
SELECT * FROM [dbo].[GetChildObjects](5000) OPTION (RECOMPILE, MAXRECURSION 0); -- 16 seconds.
SELECT * FROM [dbo].[GetChildObjects](4000) OPTION (RECOMPILE, MAXRECURSION 0); -- 18 seconds.
SELECT * FROM [dbo].[GetChildObjects](3000) OPTION (RECOMPILE, MAXRECURSION 0); -- 21 seconds.
SELECT * FROM [dbo].[GetChildObjects](2000) OPTION (RECOMPILE, MAXRECURSION 0); -- 23 seconds.
SELECT * FROM [dbo].[GetChildObjects](1000) OPTION (RECOMPILE, MAXRECURSION 0); -- 26 seconds.
SELECT * FROM [dbo].[GetChildObjects](10) OPTION (RECOMPILE, MAXRECURSION 0); -- 31 seconds.
Taking from 0 seconds for 1 row to 31 seconds for 11,456 rows, we now our new timings. And while I’m not a fan of the MAXRECURSION 0
I don’t have a major problem with adding it to an existing RECOMPILE
statement.
A great side effect is that we have more information in the execution plan!

There we have it!
Thanks to our unit tests we have a more performant code to send back with confidence that it works the same as the old code…just faster.
Now if you’ll excuse me, my 30 minutes are up and my to-do list is growing.