Welcome to T-SQL Tuesday, the monthly blogging party where we talk about a topic given to us by the host. This month, we have Steve Jones (blog | twitter) asking us about Dynamic SQL.
There are myriad uses for Dynamic SQL – I’ve already read some of the published posts and I’m impressed with the range. (Yeah, I’m writing this late; I’m aware.)
I’m aiming for something different. I want to talk about the things I believe Dynamic SQL should have. You can disagree with me if you’d like; I’d welcome it (as long as you can justify it) but here are my thoughts on writing Dynamic SQL.
There should be an addendum on that quote for DBAs:
DBAs troubleshoot Dynamic SQL more often than they write it
Probably others, but definitely me
That said, Dynamic SQL should have these items to help with that effort.
Proper Formatting
It’s extremely easy to write Dynamic SQL so that it comes out as one long string: a hodge-podge of plus signs, variable assignments, and red text that sometimes looks like a foreign coding language.
The above code sample is apparently a working “Hello World” program in one of those languages.
Don’t do this; properly format your Dynamic SQL. It will help when the code shows up in your monitoring toolkits. You have them (and know how to use them), right?
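As a contrived sketch of what I mean (the variable names and query here are invented for illustration, not from the original gist), compare the hodge-podge build with a formatted, parameterised one:

```sql
/* Hypothetical example: the same query built two ways */
DECLARE @bad_sql  nvarchar(max),
        @good_sql nvarchar(max),
        @enabled  bit = 1;

/* The hodge-podge: one long chain of concatenation */
SET @bad_sql = N'SELECT [name] FROM msdb.dbo.sysjobs WHERE [enabled] = '
    + CAST(@enabled AS nvarchar(1)) + N' ORDER BY [name];';

/* Formatted: the string reads like the query it will become,
   and the value goes in as a parameter rather than concatenation */
SET @good_sql = N'
SELECT   [name]
FROM     msdb.dbo.sysjobs
WHERE    [enabled] = @is_enabled
ORDER BY [name];';

EXECUTE sys.sp_executesql
    @stmt = @good_sql,
    @params = N'@is_enabled bit',
    @is_enabled = @enabled;
```

Both return the same rows; only one of them is readable when it turns up in sp_WhoIsActive output at 02:00.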
To help check your formatting, Dynamic SQL should include…
A Debug Method
Troubleshooting is so much easier when you know what you are going to run. The amount of Dynamic SQL where you have to build that in your head while reading the code is ridiculous!
If you give a procedure filled with Dynamic SQL to a junior DBA, then you’re going to see how fast fear and resignation fill someone’s face.
It’s straightforward to create a debug method that shoots out the code that is going to be run. A secondary benefit is that it ensures you format your code properly, because you can see how it is going to turn out.
IF @debug_mode_on = 1
BEGIN
    /* Print the statement we would run, rather than running it */
    RAISERROR(N'%s%s', 0, 1, @job_sql, @nl) WITH NOWAIT;
END;
ELSE
BEGIN
    EXECUTE [master].[sys].sp_executesql
        @stmt = @job_sql,
        @params = N'@ds_job_name AS nvarchar(128)',
        @ds_job_name = @actual_job_name;
END;
A.O.B
There are a few other things that I like to add to Dynamic SQL but I will grant are not necessary. I’ll leave them to you to make up your own minds about.
Sanitise inputs
If the user passes in an object, ensure it’s there
/* Job existence check */
IF @job_name IS NOT NULL
BEGIN
    SET @actual_job_name = (
        SELECT [name]
        FROM msdb.dbo.sysjobs
        WHERE [name] = @job_name
    );

    IF @actual_job_name IS NULL
    BEGIN
        DECLARE @err_msg AS nvarchar(max);

        SET @err_msg = FORMATMESSAGE(N'Cannot find any job labelled: %s', @job_name);
        RAISERROR(N'%s', 0, 1, @err_msg) WITH NOWAIT;
        RETURN;
    END;
END;
Escaping input
Same point really – if you want to raise a warning or error with what is passed in, use something that escapes user input.
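A minimal sketch of what I mean (the table name and injection string are invented): QUOTENAME escapes object names, and letting RAISERROR do the substitution keeps user input out of the format string.

```sql
/* Hypothetical: a user-supplied object name with an injection attempt */
DECLARE @table_name sysname = N'Users]; DROP TABLE dbo.Users; --';
DECLARE @sql nvarchar(max);

/* QUOTENAME wraps the name in brackets and doubles any ] inside it,
   so the attempt above just becomes a (non-existent) object name */
SET @sql = N'SELECT COUNT(*) FROM dbo.' + QUOTENAME(@table_name) + N';';

/* For messages, pass the input as an argument to the %s placeholder
   rather than concatenating it into the format string itself */
RAISERROR(N'About to run against table: %s', 0, 1, @table_name) WITH NOWAIT;
```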
These might appear to be overkill but I have a *cough* contrived *cough* example; the code I took from Andy’s GitHub and converted to Dynamic SQL!
Quick Test with Debug
If we pass in no job name and leave debug mode on, it spits out the code, sans any WHERE clause before the ORDER BY, to bring us back all jobs:
If we change up the job name to one that does not exist e.g. @job_name = N'No existy'; then we verify that the job doesn’t exist and error out:
Passing in a job that does exist, then adds that job name to the WHERE clause (parameterised, of course):
Let’s Run It!
Turn off debug mode and…
That’s throwing up an error message – but I’d be confident in saying that anyone glancing over the code in the gist would miss the three mistakes.
Yes, three of them – I wrote one on purpose but created two more by accident.
Luckily we can turn debug mode back on, grab the output, and parse it in a new window:
The AND on the JOIN clause was intentional – the terminating semi-colon before the WHERE clause was not, neither was the mistake on the alias. The joys of re-writing code.
Fix these mistakes up in our Dynamic SQL, turn debug mode back off, and ready to re-run? Probably took around 5 minutes going slow, but now when we re-run the code:
Result Set
That’s one example of why I believe Dynamic SQL should have proper formatting, debugging, and a few other small things.
I had more examples but they normally come to me around 02:00 in the morning, while I’m swearing and sweating on an incident call.
I’d prefer to fix and forget them, after I make sure they don’t happen again.
This is yet another time that a blog post has come about from a question by a developer. They’re good guys, I guess, they keep me on my toes.
This time it was with change logging. We didn’t have Change Data Capture (CDC), or Temporal Tables enabled (have you seen the YouTube videos by Bert Wagner ( blog | twitter ) on these?). What we did have was “manual logging” and no, I’m not even talking about Triggers.
What we had was INSERT statements, directly after a MERGE statement, that inserted into a table variable a hard-coded name of the column, the old value, and the new value.
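A minimal sketch of that “manual logging” pattern (the table and column names here are invented stand-ins; #Output represents whatever captured the MERGE’s OUTPUT):

```sql
/* Hypothetical stand-in for the table holding the MERGE's OUTPUT rows */
CREATE TABLE #Output
(
    GotAnyChangeId int,
    OldColumn1     nvarchar(20), NewColumn1 nvarchar(20),
    OldColumn2     nvarchar(20), NewColumn2 nvarchar(20)
);

DECLARE @ChangeLogTemp table
(
    GotAnyChangeId int,
    ColumnName     sysname,
    OldValue       nvarchar(20),
    NewValue       nvarchar(20)
);

/* One hard-coded INSERT per tracked column, straight after the MERGE... */
INSERT INTO @ChangeLogTemp (GotAnyChangeId, ColumnName, OldValue, NewValue)
SELECT GotAnyChangeId, N'Column1', OldColumn1, NewColumn1
FROM #Output
WHERE ISNULL(OldColumn1, N'') <> ISNULL(NewColumn1, N'');

/* ...and another, and another, for every column deemed important */
INSERT INTO @ChangeLogTemp (GotAnyChangeId, ColumnName, OldValue, NewValue)
SELECT GotAnyChangeId, N'Column2', OldColumn2, NewColumn2
FROM #Output
WHERE ISNULL(OldColumn2, N'') <> ISNULL(NewColumn2, N'');
```

Every new tracked column means another copy-pasted INSERT block, which is exactly the maintenance problem the developer was describing.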
Is that what I would do? Doesn’t matter, it was there before I got there, seems to work, and is low down on the list of priorities to change.
The problem was that every time they needed to add a column to a table, and change log it, they had to add multiple lines to the change tracking procedure, and the procedure was getting gross and hard to maintain.
You know the drill by now, I quite like to play along so let us facilitate that (from now on I’m going to use Gist, formatting with native WordPress is starting to annoy me).
This will create our table and, luckily, all of its columns are important enough to warrant capturing when they get changed.
Despite their looks, these values are “important”
Old Way WHERE old = way
Let’s take a look at the code that they were using, shall we?
Hey! It’s legacy code, let’s focus on just one problem at a time!
The main issue that I was asked about was every time a column was deemed important and needed to be added to the list, they had to insert another INSERT INTO @ChangeLogTemp... and they thought that it wasn’t sustainable in the long run.
Hmmm, it also comes across as very RBAR, doesn’t it? Every time we want to include another column in the change tracking, we have to add it row by agonizing row. The script is already big enough; if we keep adding more, it will get massive!
Set-based is the right way to go 90% of the time, but how do we do a set-based comparison on the same table?
New JOIN Way ON new = way
The first thing I do is to change that table variable into a temp table. Stats, indexes (if necessary), and I can query the results as we go along. Much better!
Temp > Variable?
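As a minimal sketch of that swap (column names invented), the difference is one keyword and a world of troubleshooting convenience:

```sql
/* Table variable: no statistics, and it vanishes with the batch,
   so you can't poke at the contents while debugging */
DECLARE @ChangeLog table
(
    ColumnName sysname,
    OldValue   nvarchar(20),
    NewValue   nvarchar(20)
);

/* Temp table: gets statistics, can be indexed, and sticks around
   for ad-hoc SELECTs while you step through the procedure */
CREATE TABLE #ChangeLog
(
    ColumnName sysname,
    OldValue   nvarchar(20),
    NewValue   nvarchar(20)
);

CREATE CLUSTERED INDEX cx_ChangeLog ON #ChangeLog (ColumnName);
```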
The second thing is that, whether by luck or by design, the legacy code has the same naming convention for the columns; new column values have the “New%” prefix in the column name and old column values have the “Old%” prefix.
This works for us because we can now split the columns into two derived tables, New and Old, and join them to get at the differences.
Potential problem here…
Have you ever tried to find the differences between two consecutive rows of data? It’s fiendishly difficult. WHERE Column1 on row1 != Column1 on row2 apparently just does not work, le sigh.
I’ve talked before about PIVOT but now I’m going to introduce you to its little brother, UNPIVOT, which rotates “columns of a table-valued expression into column values”.
I say “little brother” because the whole document talks about PIVOT, with only brief mentions of UNPIVOT in the notes.
If you’re writing documentation like this, please stop.
With UNPIVOT we can create a table of our rows around our ID and Column names…
Potential problem averted!
… and with this, we can join on our ID and Column names and get to our more intuitive WHERE OldValue != NewValue.
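Put together, the shape of that query looks roughly like this, hard-coded to two example columns (the #OutputTableDynamic table, GotAnyChangeId key, and Old%/New% names are borrowed from the dynamic script later in the post; the dynamic version builds the column lists instead of hard-coding them):

```sql
SELECT New.ColumnName, Old.OldValue, New.NewValue
FROM
(
    SELECT Unpvt.GotAnyChangeId, Unpvt.ColumnName, Unpvt.NewValue
    FROM
    (
        /* Alias the New% columns to the bare name so both sides
           produce matching ColumnName values after the UNPIVOT */
        SELECT GotAnyChangeId,
               CAST(ISNULL(NewColumn1, '') AS nvarchar(20)) AS Column1,
               CAST(ISNULL(NewColumn2, '') AS nvarchar(20)) AS Column2
        FROM #OutputTableDynamic
    ) AS DataSource
    UNPIVOT (NewValue FOR ColumnName IN (Column1, Column2)) AS Unpvt
) AS New
INNER JOIN
(
    SELECT Unpvt.GotAnyChangeId, Unpvt.ColumnName, Unpvt.OldValue
    FROM
    (
        SELECT GotAnyChangeId,
               CAST(ISNULL(OldColumn1, '') AS nvarchar(20)) AS Column1,
               CAST(ISNULL(OldColumn2, '') AS nvarchar(20)) AS Column2
        FROM #OutputTableDynamic
    ) AS DataSource
    UNPIVOT (OldValue FOR ColumnName IN (Column1, Column2)) AS Unpvt
) AS Old
    ON  Old.ColumnName     = New.ColumnName
    AND Old.GotAnyChangeId = New.GotAnyChangeId
WHERE New.NewValue != Old.OldValue;
```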
Bringing it all together!
The whole point was to reduce the number of changes required when they need to include or exclude columns. All in all though, it’s just 6 lines fewer. Not exactly the great return you’d expect.
Yeah, true: with the old way, every column we want to add costs an extra 6 lines, while the new way adds 2.
That means for 1,024 columns:
The old way could have at least 6,144 lines per table. (1024 * 6)
The new way could have at least 2,048 lines per table (not explaining this calculation >:( )
So, is there anything else that we can do?
Dynamically?
I’ve talked before about T-SQL automation with Dynamic SQL and this should be a good candidate for that.
What can we make dynamic here though? How about…
The new and old columns bit?
The FOR ColumnName IN([Column1], [Column2], [Column3], [Column4], [Column5], [Column6]) bit?
The CAST(ISNULL([Old/NewColumn], '') AS nvarchar(20)) bit?
This is basically the same as the above. Don’t be put off by needing to add CAST(ISNULL( before the column names, it’s not as complex as you’d think.
STUFF just doesn’t look as pretty… 🙁
Now that we have our dynamic bits, let’s create the full statements.
Full Dynamic Script
SET @NewColumns =
(
    SELECT STUFF(
    (
        SELECT TOP (1024)
            ', CAST(ISNULL(' + QUOTENAME(RIGHT(tc.name, LEN(tc.name))) + ', '''') AS nvarchar(20)) AS '
            + QUOTENAME(RIGHT(tc.name, LEN(tc.name) - 3))
        FROM tempdb.sys.tables AS tt
        INNER JOIN tempdb.sys.columns AS tc
            ON tt.object_id = tc.object_id
        WHERE tt.name LIKE '#OutputTableDynamic%'
        AND tc.name LIKE 'New%'
        ORDER BY tc.column_id
        FOR XML PATH('')
    ), 1, 2, '')
);
SET @OldColumns =
(
    SELECT STUFF(
    (
        SELECT TOP (1024)
            ', CAST(ISNULL(' + QUOTENAME(RIGHT(tc.name, LEN(tc.name))) + ', '''') AS nvarchar(20)) AS '
            + QUOTENAME(RIGHT(tc.name, LEN(tc.name) - 3))
        FROM tempdb.sys.tables AS tt
        INNER JOIN tempdb.sys.columns AS tc
            ON tt.object_id = tc.object_id
        WHERE tt.name LIKE '#OutputTableDynamic%'
        AND tc.name LIKE 'Old%'
        ORDER BY tc.column_id
        FOR XML PATH('')
    ), 1, 2, '')
);
SET @columns =
(
    SELECT STUFF(
    (
        SELECT TOP (1024)
            ', ' + QUOTENAME(RIGHT(tc.name, LEN(tc.name) - 3))
        FROM tempdb.sys.tables AS tt
        INNER JOIN tempdb.sys.columns AS tc
            ON tt.object_id = tc.object_id
        WHERE tt.name LIKE '#OutputTableDynamic%'
        AND tc.name LIKE 'New%'
        ORDER BY tc.column_id
        FOR XML PATH('')
    ), 1, 2, '')
);
SET @DynamicSql =
    N'SELECT New.ColumnName AS CN, New.NewValue AS NV, Old.OldValue AS OV
    FROM
    (
        SELECT Unpvt.GotAnyChangeId, Unpvt.ColumnName, Unpvt.NewValue
        FROM (SELECT [GotAnyChangeId], ' + @NewColumns + N' FROM #OutputTableDynamic) AS DataSource
        UNPIVOT (NewValue FOR ColumnName IN (' + @columns + N')) AS Unpvt
    ) AS New
    INNER JOIN
    (
        SELECT Unpvt.GotAnyChangeId, Unpvt.ColumnName, Unpvt.OldValue
        FROM (SELECT [GotAnyChangeId], ' + @OldColumns + N' FROM #OutputTableDynamic) AS DataSource
        UNPIVOT (OldValue FOR ColumnName IN (' + @columns + N')) AS Unpvt
    ) AS Old
        ON Old.ColumnName = New.ColumnName
        AND Old.GotAnyChangeId = New.GotAnyChangeId
    WHERE New.NewValue != Old.OldValue
    FOR XML PATH(''Change''), ROOT(''Changes'')';
Overall, the script is longer at nearly double the lines but where it shines is when adding new columns.
To include new columns, just add them to the table; to exclude them, just add in a filter clause.
So, potentially, if every column in this table is to be tracked and we add columns all the way up to 1,024 columns, this code will not increase. Old way: at least 6,144. New way: at least 2,048. Dynamic: no change
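As a sketch of what that filter clause could look like, here is the column-list query with one extra predicate (the excluded column name is invented for illustration):

```sql
/* The same list-building query as the dynamic script, with one extra
   filter; "NewBoringColumn" is a hypothetical column to stop tracking */
SELECT tc.name
FROM tempdb.sys.tables AS tt
INNER JOIN tempdb.sys.columns AS tc
    ON tt.object_id = tc.object_id
WHERE tt.name LIKE '#OutputTableDynamic%'
AND tc.name LIKE 'New%'
AND tc.name <> N'NewBoringColumn'   /* the exclusion */
ORDER BY tc.column_id;
```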
Summary
Like the script, this was a massive post. Back at the start, I said that a developer came to me because they wanted to get more DRY (Don’t Repeat Yourself) and stop needing to add more content to the stored procedure.
Do you think the developer used this?
Nope!
I can’t say that I blame them, it’s slightly ugly and unwieldy, and I wrote it so I should love it.
Yet if something were to go wrong and the need arose to open the procedure and troubleshoot it, the first person to open this up is going to let out a groan of despair!
So this request turned into a proof of concept and nothing more. No skin off my back, I have a growing list of tasks to accomplish by 5 minutes ago. Better get back to them.