T-SQL Tuesday 183 – Tracking Permissions

Words: 593

Time to read: ~3 minutes

PowerShell Scripts for Checking SQL Server Login SIDs

A picture of the T-SQL Tuesday logo; a blue database with a github style commit banner in a different blue wrapped around it.

Welcome back to T-SQL Tuesday, the monthly blogging party where we are given a topic and asked to write a blog post about it.

This month, we have Steve Jones [ blog ] asking us to talk about permissions.

Before


Normally, I would have skipped this month because I’ve talked about permissions before, and I feel like it’s one of the better posts I’ve written.
But I’m trying to write more, so that can’t be an excuse anymore.

So, let’s talk about AGs, DistAGs, and SQL Server authentication login SIDs.

AGs and DistAGs


In work, we have Availability Groups (AGs) and Distributed Availability Groups (DistAGs), depending on the environment level, e.g. Prod, Pre-Prod, QA, Dev, etc.

When we plan failovers for maintenance, we run through a checklist to ensure we can safely complete the failover.

One of these checks is to ensure that the correct logins with the correct SIDs are on each replica. Otherwise, we could get flooded with “HEY! My application doesn’t connect anymore! Why you break my application“ when apps try to log into the new primary and the user SIDs don’t match the login SIDs.

While I don’t have a problem with doing a query across registered servers, I threw together a quick script in PowerShell with dbatools that does the job of checking for me. And, like most things that happen in business, this temporary solution became part of our playbook.

Who knows! Maybe this quick, throwaway script could also become part of other people’s playbooks!

I’m not sure how I feel about that…

Scripts

We’re using dbatools for this because I think it’s the best tool for interacting with databases from a PowerShell session regardless of what Carbon Black complains about.

Getting the AG Replicas

Getting the AGs for each server is relatively simple, using a combination of Get-DbaRegServer and Get-DbaAvailabilityGroup.

Then, we can use the ever-respected Get-DbaLogin to get a list of logins and their SIDs per replica.
If we have a mismatch, we will have an issue after we failover. So best nip that in the bud now (also, I had this phrase SO wrong before I looked it up!).

$InstanceName = 'test' 
# $InstanceName = $null # Uncomment this line to get ALL
$Servers = Get-DbaRegServer

if ($InstanceName) {
	# Find the server that ends in our instance name
	$Servers = $Servers | Where-Object ServerName -match ('{0}$' -f $InstanceName)
}

# Get all replicas per Instance name that are part of AGs...
$DAGs = Get-DbaAvailabilityGroup -SqlInstance $Servers.ServerName |
	Where-Object AvailabilityGroup -notlike 'DAG*' | # Exclude DAGs which just show our underlying  AG names, per our naming conventions.
	ForEach-Object -Process {
		$instance = $null
		$instance = $_.InstanceName

		foreach ($r in $_.AvailabilityReplicas) { # I'd like 1 replica per line, please.
			[PSCustomObject] @{
				InstanceName = $instance
				Replica      = $r.Name
			}
		}
	} |
	Select-Object -Property InstanceName, Replica -Unique |
	Group-Object -Property InstanceName -AsHashTable

# Get a list of Logins/SIDs that don't match the count of replicas, i.e. someone forgot to create with SIDs...
foreach ($d in $DAGs.Keys) {
	Write-Verbose "Working on instance: $d" -Verbose
	Get-DbaLogin -SqlInstance $DAGs[$d].Replica |
		Group-Object -Property Name, Sid |
		Where-Object Count -ne $DAGs[$d].Count |
		Select-Object -Property @{ Name = 'Instance'; Expression = { $_.Group[0].InstanceName }},
			@{ Name = 'Occurrences'; Expression = { $_.Count }},
			@{ Name = 'Login/SID'; Expression = { $_.Name }}
}
I got 99 problems, and Login/SIDs are one! Or I have 5 Login/SID problems.

Cure


They say “prevention is better than cure”, and I’d love to get to a stage where we can “shift left” on these issues. Where we can catch them before someone creates a login with the same name but different SIDs on a replica.

But we’re not there yet. At least, we can find the issues just before they impact us.
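
And when we do find a mismatch, the usual cure is to recreate the login on the replica with the SID from the primary. A minimal T-SQL sketch of that fix; the login name, password, and SID here are all made up, and you’d pull the real SID from sys.sql_logins on the primary:

/* On the primary: grab the SID for the mismatched login */
SELECT [name], [sid]
FROM [sys].[sql_logins]
WHERE [name] = N'AppLogin'; -- hypothetical login name

/* On the replica: recreate the login with the primary's SID */
DROP LOGIN [AppLogin];
CREATE LOGIN [AppLogin]
	WITH PASSWORD = N'SuperSecretPassword!', -- made up, obviously
	SID = 0x1234567890ABCDEF1234567890ABCDEF; -- the 16-byte SID from the primary

If you’d rather not hand-roll it, dbatools’ Copy-DbaLogin does the same dance and keeps the SID for you.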

That’s something, at least!

T-SQL Tuesday #182 – Integrity

Time to read: ~ 3 minutes
Words: 531 words

Welcome to T-SQL Tuesday, the monthly blogging party where we are given a topic and have to talk about it. Today, we have Rob Farley ( blog | bluesky ), talking about integrity.

I’ll admit that it’s been a while since I’ve written a blog post. It’s been a combination of either burnout or busyness, but let’s see if I still have the old chops. Plus, I’m currently sick and resting in bed, so I have nothing better to do.

I’m one of the few who haven’t had experiences with corruption in our database. Apart from Steve Stedman’s ( blog ) Database Corruption Challenge, that is.

With that being said, what do I have to say about this topic then? Well, let’s talk about the first version of corruption checking automation that we introduced in work.

Now, this is just the bare bones, and there have been many different iterations since then, but the essence is here.

Overview

Like many shops out there, we can’t run corruption checking on our main production database instance. So, then, what do we do? We take the backups, restore them to a test instance, and then run corruption checking on those restored databases.

At least this way, we also get to test that the backups we take can actually be restored.

But, I don’t want to spend every day manually restoring and corruption checking these databases, so let’s automate this bit…

Limitations

A quick scour of the interwebs brought back something extremely close to what I want by Madeira Data Solutions. It had some extras that I didn’t want, though.

More importantly, it used some functions that our dreaded antivirus software still screams false positives about. So, they would stop our script from running if we even tried.

So, a script re-write was required.

Script

Here’s the link to the script in GitHub if you want to check it out.
Use at your own peril. Also, the default gist embedding on this platform is visibly off-putting.

General Rundown

Here’s a general rundown of the parts of the code, just to make it easier to grok. It’s not to bump the word count on this post.

Cave Canem

This is an old version of the script. The first version, if my flu-riddled brain can remember correctly, so there are small bugs here.

Off the top of my head, I think it still:

  • only works if the backup file is called FULL
  • only works if there is a single FULL backup file per database ( _that got fixed real fast!_ )
  • A.N.Other

Grab the Full Backup

Does it expect FULL in the file name? Yes. Should it? Arguably not.

Restore Everything on the Path

We restore on a database-by-database basis. “RBAR” is fine outside a DB, right?

Faire le Corruption Check

Sacre Bleu! No corruption for you!

Report on the results

Because if you don’t tell people, have you actually done it?
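
For the click-averse, here’s the rough shape of the restore-and-check part in T-SQL; the database name, backup path, and file locations are made-up placeholders, and the real script does considerably more:

USE [master];

-- Restore the FULL backup over a scratch database on the test instance
RESTORE DATABASE [CorruptionCheck_StageDB]
FROM DISK = N'\\BackupShare\StageDB\StageDB_FULL.bak' -- hypothetical path
WITH MOVE N'StageDB' TO N'D:\Data\CorruptionCheck_StageDB.mdf',
	MOVE N'StageDB_log' TO N'L:\Log\CorruptionCheck_StageDB.ldf',
	REPLACE,
	STATS = 10;

-- Run the corruption check against the restored copy
DBCC CHECKDB (N'CorruptionCheck_StageDB') WITH NO_INFOMSGS, ALL_ERRORMSGS;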

Ar aon nós

Like I said, that script is the first iteration.

Use at your own peril.

Here be dragons and krakens and DBCC WRITEPAGE.

My main point is to have the INTEGRITY to say “Yes, I have regular corruption checking of my databases”.

See what I did there?

T-SQL Tuesday #164: Code That Made You Feel A Way

Time to read: ~ 4 minutes

Words: 899

Welcome to T-SQL Tuesday, the monthly blogging party where we receive a topic to post.

This month we have Erik Darling ( blog ) asking us to post about “code that makes [us] feel a way“.

A while back, I was tasked to performance tune some code that brought me through all five stages of grief. It’s best to take you through the events from the developer’s viewpoint. There’s less cursing that way.

Denial

Hey, what’s up? You were asked to troubleshoot some code. Not a big deal; these requests come in from time to time.

You have code that keeps coming up as a high consumer of CPU on our systems? OK?

It had gotten to the stage where it kept appearing in sp_WhoIsActive? What’s that? And should it not be “whom is active”? Well, agree to disagree.

Let’s see the code so. Wow, that’s a small scroll bar! Yeah, that’s one of ours.

No, it is. I swear it is.

I’m not surprised you can’t find it in stored procedures; we dynamically create it. Here’s the procedure. Yeah, all 900 lines! Pretty cool, huh?

What do you mean, why? We had to! We had to react to the different parameters that get passed in.

Anger

Alright, alright! Calm down. I misspoke.

Yeah, that’s it, breathe. There aren’t different parameters. It’s a parameter. Singular.

No, we still need to generate the query dynamically. Cause the XML could be different. Huh? Yeah, the parameter is XML; it could be anything in there. So we pass the XML in as a nvarchar(max) string.

You spilt some of your coffee there when you banged the desk.

Then we use sp_xml_preparedocument at the start, but then we have to use sp_xml_removedocument at the end.

You’ve never heard of those before? I thought you were a DBA?

We use the “prepare” to get the XML into a state that’s easier to consume, and we use the “remove” so we don’t get memory leaks!
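
(An aside for you, dear reader: if you’ve been lucky enough to avoid this pattern, it looks roughly like the following. A minimal sketch with a made-up XML document, nothing like the actual 900-liner:)

DECLARE @hdoc int,
	@xml nvarchar(max) = N'<params><value id="1" /><value id="2" /></params>';

-- "Prepare" parses the XML text and hands back a document handle
EXECUTE sys.sp_xml_preparedocument @hdoc OUTPUT, @xml;

-- Shred the prepared document into rows with OPENXML
SELECT [id]
FROM OPENXML(@hdoc, N'/params/value', 1)
WITH ([id] int '@id');

-- "Remove" frees the parsed document; skip it and the memory stays allocated
EXECUTE sys.sp_xml_removedocument @hdoc;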

Your face is getting a bit red, did you know that?

It’s SQL Server; it uses enough memory it can share some!

Did we read the docs? No, why? It can use one-eighth of the total memory available for SQL Server. Ah, but that’s “can”, not “will”.

Yes, yes, cursors upon cursors upon cursors. Why? We don’t know how many values are in each XML node in the params, so we have to split them out.

We then join them into a comma-delimited string, which is then used in IN clauses… woah! Is that the sound of your teeth grinding? You know that’s not good for your health. What do you mean neither am I?

Anyway, then we parse all of that down to a massive, what do you call it, swiss-army knife, lego-block, dynamic query built based on what’s passed in. You don’t call it that? I thought you did. What do you call it so? Wow, that’s not a word I had heard of before.

It’s not too bad, though! We pass in everything as literal values, so it’s faster that way. We read up on that parameter sniffing issue you said can happen. That isn’t going to catch us out here!

Modern problems require modern solutions, as they say. What’s that? It’s not a modern problem? I’m a what? Are you allowed to call me that?

Bargaining

You want me to re-write it? You’ll help? It’ll take a lot of time. It’s not really a priority.

Plus, it’s not even worth it. How much CPU? Is that an “illion” with an “m” or an “illion” with a “b”? Per run? Is that a lot? Well, I really wouldn’t know, now would I?

Yeah, good point; you’ll also have to check how much memory it uses. Hey! Probably one-eighth of the total memory available for SQL Server. Alright, it wasn’t that bad a joke; everyone’s a critic!

You’ll give me credit? But it’s fine in Dev; there’s no point. I suppose we could remove some of the WHERE OR IS NULL branches since we know if they’re empty when we create the query.

Nah, that would make the procedure a bit too big. Anything over 1000 lines is too much. Well, 985 is still less than 1000!

Depression

Why are you crying? No, we care, we do, we really do. No, not all the code is like this; this is old. Yes, we stopped using that XML parameter pattern. Yes, and the cursors. Well, most of them, at least.

Your tears are mixing with the coffee you spilt, you know that?

Here, there’s really no need for the fetal position in the middle of the office. You have yet to see the code for the busier databases.

You know what, why don’t you go home and come back in tomorrow, and I’ll go over that way, OK?

Acceptance

Hey! There’s our favourite DBA. You seem a lot cheerier.

Oh, you don’t have to perf-tune that query anymore? That’s great; see, it wasn’t so bad. What’s that? The latest release failed? It ran fine in Dev.

You’re permitted to set up a QA environment to test performance before code gets released? What kind of code? Any stored procedure?

Ah, that’s OK. No, really, it’s fine. We started using ORMs and embedding SQL into the app a while ago. This won’t affect us.

You’ve started crying again.

T-SQL Tuesday 161: Having Fun with SQL

Words: 1,283

Time to read: ~ 7 minutes

Welcome to T-SQL Tuesday, the monthly blogging party where a host provides a topic, and we blog about it.

This month we have Reitse Eskens ( blog | twitter | mastodon ) asking us to talk about something fun we have done with T-SQL.

It has been quite a while since I have coded just for fun, so I’m thankful to Reitse for suggesting this. Unfortunately, I don’t have a pre-baked idea for this T-SQL Tuesday, so let’s see what we can come up with.

Echos

Around December 2021, Wordle hit the virtual scenes. Yeah, nearly two years ago. How do you feel about that?

I got swept up in that wave for a while in the same way I got swept up in the other trends of my time, like Pokemon, Sudoku, and Angry Birds.

Eventually, I stopped when I found a PowerShell script by Kieran Walsh ( github | twitter ) where you could narrow down to the correct answer by putting in the results of your guess each round.

This hack led to me realising how much time I was spending on Wordle and that I should stop, much like I did with Pokemon, Sudoku, and Angry Birds.

So, what better thing to do than to try and recreate that PowerShell script in T-SQL?

Rules

I must recreate as much of the script as possible in T-SQL in only one hour.
Yes, I’m aware that’s more of a rule than rules, but Wordle needs five letters dammit, and “rule” was crying out for that prosthetic “s”!

Total (code)

Don’t worry, you just have to fill in the variables on lines 19-26.

Split

A few things need to be taken care of right off the bat.

The potential answers have to be stored somewhere in the database. Thankfully, I had the answers in a text file, so creating a table and then inserting them was easy.

I could do the insert with flat files, but I already have PowerShell open so…

$WordleAnswers = Get-Content -Path $HOME\Documents\wordle-answers-alphabetical.txt |
    ForEach-Object -Process {
        [PSCustomObject] @{
            WordleAnswer = $_
        }
    }

$WriteTableParams = @{
    SqlInstance = 'localhost\SQL2019'
    Database = 'Wordle'
    Schema = 'dbo'
    Table = 'WordleAnswers'
    ColumnMap = @{
        "WordleAnswer" = "wordle_answers"
    }
}
$WordleAnswers |
    Write-DbaDataTable @WriteTableParams


Next, we need to create the variables. If I can finish this before the 1-hour mark, I’ll turn this into a stored procedure with parameters and everything! Until then, it’s script and variable time.

DECLARE
	@known_letters AS varchar(5),
	@excluded_letters AS varchar(26),
	@position1 AS char(1),
	@position2 AS char(1),
	@position3 AS char(1),
	@position4 AS char(1),
	@position5 AS char(1),
	@correct_letters AS xml,
	@all_answers_sql AS nvarchar(MAX);

/* region Enter Variables here */
SET @known_letters = '';
SET @excluded_letters = '%[]%';

SET @position1 = NULL;
SET @position2 = NULL;
SET @position3 = NULL;
SET @position4 = NULL;
SET @position5 = NULL;
/* endregion Enter Variables here */

The PowerShell code has known_letters, excluded_letters, positions, and wrong_positions.

I can do all these easily enough, except for wrong_positions. I can’t think of a way to do hashtables in SQL that doesn’t equal a secondary table or user-table type, etc. I’ll leave that to the end if I have time.

known_letters is an array of strings. I haven’t updated the SQL Server version on my laptop in a while, so there is no string_split for me. Let’s do the XML way so.

/* region KnownLetters */
SELECT @correct_letters = CONCAT(
	'<known_letters>',
	REPLACE(@known_letters, ',', '</known_letters><known_letters>'),
	'</known_letters>'
);

SELECT
	[known] = [l].[y].value('.', 'char(1)')
INTO #KnownLetters
FROM
(
	VALUES
	(@correct_letters)
) AS [x] ([kl])
CROSS APPLY [kl].nodes('/known_letters') AS [l] (y);
/* endregion KnownLetters */

excluded_letters I can get away with by using some LIKE jiggery-pokery, where it will search for any characters between the square brackets.
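
To make that concrete, a quick sketch with made-up excluded letters (q, x, and z):

DECLARE @excluded_letters varchar(26) = '%[qxz]%'; -- letters we know aren't in the word

SELECT [wordle_answers]
FROM [dbo].[WordleAnswers]
WHERE [wordle_answers] NOT LIKE @excluded_letters; -- drops any word containing q, x, or z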

positions I can split out into individual variables. I can more easily deal with them then, and it only ends up as an extra five variables this way.

Creating the table would have been handier if I had made a column for each character, but I didn’t, so it’s some SUBSTRING logic for me to get the individual characters out.

SELECT
	[wa].[wordle_answers],
	[g].[char1],
	[g].[char2],
	[g].[char3],
	[g].[char4],
	[g].[char5]
FROM [dbo].[WordleAnswers] AS [wa]
CROSS APPLY (
	VALUES (
		(SUBSTRING([wa].[wordle_answers], 1, 1)),
		(SUBSTRING([wa].[wordle_answers], 2, 1)),
		(SUBSTRING([wa].[wordle_answers], 3, 1)),
		(SUBSTRING([wa].[wordle_answers], 4, 1)),
		(SUBSTRING([wa].[wordle_answers], 5, 1))
	)
) AS [g] ([char1], [char2], [char3], [char4], [char5])

If we do know the positions of some of the letters, then I can strip out a lot of the potential answers straight away. I’m not a fan of Swiss-army-knife WHERE clauses, so I’ll do the dynamic SQL.

I’m also not a fan of WHERE 1=1 in my dynamic code, but I’m running low on time here, and it’s faster to add that in first and start everything else with an AND than it is to check if this is the first clause in the WHERE section or not.

Plus, I’m less against WHERE 1=1 than I am against Swiss-army-knife WHERE clauses.

/* region Known Positions */
CREATE TABLE #AllAnswers
(
	[wordle_answers] char(5),
	[char1] char(1),
	[char2] char(1),
	[char3] char(1),
	[char4] char(1),
	[char5] char(1)
);
SET @all_answers_sql = N'SELECT
	[wa].[wordle_answers],
	[g].[char1],
	[g].[char2],
	[g].[char3],
	[g].[char4],
	[g].[char5]
FROM [dbo].[WordleAnswers] AS [wa]
CROSS APPLY (
	VALUES (
		(SUBSTRING([wa].[wordle_answers], 1, 1)),
		(SUBSTRING([wa].[wordle_answers], 2, 1)),
		(SUBSTRING([wa].[wordle_answers], 3, 1)),
		(SUBSTRING([wa].[wordle_answers], 4, 1)),
		(SUBSTRING([wa].[wordle_answers], 5, 1))
	)
) AS [g] ([char1], [char2], [char3], [char4], [char5])
WHERE 1=1';

IF @position1 IS NOT NULL SET @all_answers_sql = CONCAT(
	@all_answers_sql,
	N'
	AND [g].[char1] = ',
	QUOTENAME(@position1, '''')
);

IF @position2 IS NOT NULL SET @all_answers_sql = CONCAT(
	@all_answers_sql,
	N'
	AND [g].[char2] = ',
	QUOTENAME(@position2, '''')
);

IF @position3 IS NOT NULL SET @all_answers_sql = CONCAT(
	@all_answers_sql,
	N'
	AND [g].[char3] = ',
	QUOTENAME(@position3, '''')
);

IF @position4 IS NOT NULL SET @all_answers_sql = CONCAT(
	@all_answers_sql,
	N'
	AND [g].[char4] = ',
	QUOTENAME(@position4, '''')
);

IF @position5 IS NOT NULL SET @all_answers_sql = CONCAT(
	@all_answers_sql,
	N'
	AND [g].[char5] = ',
	QUOTENAME(@position5, '''')
);

SET @all_answers_sql = CONCAT(@all_answers_sql, N';')

PRINT @all_answers_sql;

INSERT INTO #AllAnswers
EXECUTE [sys].[sp_executesql] @stmt = @all_answers_sql;
/* endregion Known Positions */

Finally, we can UNPIVOT the individual characters for the words and join them with the known_letters to narrow down to those answers.
As well as excluding characters that we know aren’t in the word.

Or else just return everything we have, minus excluded characters.

IF LEN(@known_letters) > 0 BEGIN
	SELECT
		*
	FROM #AllAnswers AS [w]
	UNPIVOT 
	(
		[chars] FOR [chr2] IN ([char1], [char2], [char3], [char4], [char5])
	) AS [unpvt]
	JOIN #KnownLetters AS [kl]
		ON [unpvt].[chars] = [kl].[known]
	WHERE
		[unpvt].[wordle_answers] NOT LIKE @excluded_letters
END
ELSE
BEGIN
	SELECT
		*
	FROM #AllAnswers AS [a]
	WHERE [a].[wordle_answers] NOT LIKE @excluded_letters;
END;

Guilt

In the PowerShell script, you can add characters in the excluded_letters parameter that exist in the known_letters parameter, and it will correctly ignore them.

Alas, Tempus fugit and I didn’t get to do the same for this T-SQL version. Maybe that has something to do with translating “time flies” into Latin and then looking up other sayings in Latin, but we can’t say for sure. Mea culpa!

However, it’s been around 50 minutes with minor troubleshooting here and there, so time to test this bad boy.

Tests

Let’s hop over to https://wordleplay.com/ and test it out.

I’ll take the first answer returned each time unless it is the answer we chose previously.

PowerShell

I’m not going to use the wrong_positions parameter here since I didn’t re-create that in T-SQL. Still, I got lucky and got the correct answer on the third guess.

T-SQL

The T-SQL method doesn’t show each iteration as well as the PowerShell script does. And there’s more human brainpower required to make sure you don’t enter the same letter in the known_letters and the excluded_letters variables.

Overall though, well done with a respectable four guesses.

Point

I’m not going to say that there is no point to these exercises.

Fun is as valid a point as any other. In a work world filled with more demands on our time than the number of Pokemon (there’s still only 150, right?), more technologies to learn than combinations in Sudoku puzzles, and more people demanding the seemingly impossible out of you so that you want to yeet them at solid objects … something something Angry Birds, it’s a welcome change to do something just for fun once in a while.

Thanks Reitse

Querying SQL Server with Golang

Words: 333

Time to read: ~ 2 minutes

I’ve been looking into Golang recently. Hey, everyone has to go with what interests them!

It’s all very well and good to open up Golang and write a FizzBuzz (note to self: write a FizzBuzz), but I still work with databases.

So before I do anything with Golang, I’d like to know: can it interact with databases?

Granted, everything can, but how easy is it? And does trying to interact with databases kill any interest I have in the language?

So, let’s give it a go.

First, some caveats.

  1. You have to install Go (shocking, I’m aware).
  2. This post is effectively me reading the tutorials on the Golang website and converting it to work with SQL Server.
  3. You have to do some funky stuff to get the driver for SQL Server; granted, funky stuff basically means running go get <Github link>. I’ve done funkier…
  4. There’s also some better way to do this, but who cares. This is exploratory.
  5. When setting up my SQL instances, I did some weird thing where I now have to specify a non-standard port when connecting. So I have to open up the error log in SQL Server, see what port it’s listening to, and put that into the code (there’s a handier way to search for that port, shown right after this list). Yes, I’m aware that I’m a DBA, and I should be able to fix this, but it’s my personal time & my personal test instance. I’ll fix it when I want to.
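
On that last point: you can search the error log for the port without leaving a query window. xp_readerrorlog is undocumented, so the usual caveats apply, but a sketch:

-- Search the current error log (0) of the SQL Server log type (1) for "listening"
EXECUTE sys.xp_readerrorlog 0, 1, N'listening';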

Anyway, here’s the code. 

Do with it what you will; I’m just happy it works, and I’m delighted that I’m excited about it!

package main

import (
	"database/sql"
	"fmt"
	"log"

	// this is the driver & yes, even though we're throwing
	// it away to `_`, it's still needed
	_ "github.com/denisenkom/go-mssqldb"
)

var (
	db     *sql.DB
	server = "localhost"
	// Yes, I hate this too…
	port = 60515
)

func main() {
	// Build connection string
	connString := fmt.Sprintf(
		"server=%s;port=%d;Trusted_Connection=True;AppName='Golang'",
		server,
		port,
	)

	// I want the line that the error occurs in my logs, please & thank you
	log.SetFlags(log.Lshortfile)

	// Create connection pool
	var err error
	db, err = sql.Open("sqlserver", connString)
	if err != nil {
		log.Fatalln("Creating connection pool failed:", err.Error())
	}

	stringDate, err := getDate()
	if err != nil {
		log.Fatalln("getDate() failed:", err.Error())
	}
	fmt.Println("Read:", stringDate)
	fmt.Println("Read rows successfully!")
	fmt.Println("Finished.")
}

func getDate() (string, error) {
	var dte string

	// We're "scanning" the values from the query into the `dte`
	// variable using `&<var name>`. Different, but interesting…
	if err := db.QueryRow("SELECT dt = SYSDATETIME();").Scan(&dte); err != nil {
		return "false", fmt.Errorf("getDate() function failed! %s", err.Error())
	}

	return dte, nil
}

First of all, what happens when it works?

go run main.go

And what happens when it fails, say cause the default port actively refuses connections?!

// changing `port = 60515` -> `port = 60514`

Happy out! Now to see what else we can do. Thankfully, I’ve somehow managed to get some ideas marinating in creative juices. Let’s see if I have the time and the energy to see them through!

T-SQL Tuesday #152

Words: 375

Time to read: ~ 3 minutes

I don’t know if this post is a rage against “database-ubiquitous” SQL or Object-Relational Mapping tools (ORMs)

I’m going to say that it’s a rant against the first because if you know what you are doing, then ORMs won’t be a performance concern

But people use ORMs instead of Stored Procedures because
Them: “what happens if we need to change databases? We don’t have to worry about that now cause they’ll be baked in the application, and we won’t have to recreate the Stored Procedures in the new database cause ORMs just work!”

Everything just works until it doesn’t

I’ve worked with several companies and heard this argument many times from developers

Part of why I’m ranting about database-ubiquitousness SQL instead of ORMs is that I’ve also witnessed SQL code that has to work everywhere

I’m going to call this SQL “DUh SQL” from now on since I only have so many vowels on my keyboard and shouldn’t waste them re-writing the word “ubiquitousness”

Them: “Nope, we can’t use DATEFROMPARTS; it’s not available in other databases! Yeah, we use GETDATE() everywhere; what’s your point? WHAT!?!”

I’ve moved across different cloud providers more than I’ve moved databases, and guess what? Each cloud provider move, we’ve kept the same database

I know that knowledge sharing “in the wild” says that you should use ORMs or DUh SQL
The route to a 10x engineer is paved in DRY, KISS, and SWALK
Well, maybe not SWALK

Them: “But the leading minds out there say that you should keep it DUh since that means it’ll be a breeze if we have to move databases”

Sure, and I’ve been told that ice baths help in recovery
I’m not going to take them, though, since I know there’s a difference between me trying to do a 5k and dedicated athletes!

I love arguing with people in my blog posts; I rarely lose

So, don’t use DUh SQL cause of reasons that may not apply to you
Don’t refuse performance tuning efforts if they’ll add database-specific code
And, please, learn to use your tools

Oh, and apologies for the lack of full-stops, I can’t use them in case I have to turn this post into regex

Table Column Differences Part 03 – Compare-SqlTableColumns

Words: 470

Time to read: ~ 2 minutes

Don’t talk to me about it!

Four years ago (I know, where did the time go?), I wrote about Table Column Differences with T-SQL and PowerShell.

A Michal commented on the post, asking how to get a specific output from his search.

Hi,

Thanks for your sharing. What if I also want to compare case sensitively columns and the order of them (syncwindows). How can I presented it on powershell.

I mean that in the final table I want to show also something like: column_a, column_A -> case sensitive

AND

column_a, column_a -> different order in the table

Thanks in advance

Michal

I confess that I never got around to answering Michal until a few weeks ago when I found myself with some rare free time.

Since then, I’ve written a script, slapped it into a function, and threw it up on Github.

Here’s hoping that it does what you want this time Michal, thanks for waiting.

Shall I Compare Thee to Another Table?

The first thing that we need to do is have a couple of SQL tables to compare.

So, I threw up a Docker container and created a couple of tables with nearly the same layout.

(Get-DbaDbTable -SqlInstance localhost -Table 'dbo.DifferenceTable01', 'dbo.DifferenceTable02').Columns |
        Select-Object -Property Parent, Name, ID, DataType |
        Format-Table -GroupBy Parent
I’m liking the new PowerShell formatting

You can see that there are three differences here:

  1. Column orders, e.g. col9 has id 6 in dbo.DifferenceTable01 but id 5 in dbo.DifferenceTable02.
  2. Column case sensitivity, e.g. col7 does not match COL7.
  3. Column presence, e.g. col3 doesn’t exist in dbo.DifferenceTable01 at all.

While Compare-Object has the -CaseSensitive switch, I don’t think that it would be helpful in all these cases. Or else I didn’t want to use that command this time around.

So, I wrote a function to get the output we wanted, and yes, I now include myself among that list of people wishing for that output.

I’m allowed to be biased towards the things that I write 🙂

Compare-SqlTableColumns

Compare-SqlTableColumns -SqlInstance localhost -Table1 'dbo.DifferenceTable01' -Table2 'dbo.DifferenceTable02' |
        Format-Table

I’ve tried to include everything you could want in the function output, i.e. column names, column ids, and statuses.

Something I’ve started to do lately is wrapping a [Diagnostics.StopWatch] in my verbose statement to see where potential slow parts of the function are.

I’d like to think that 0.2 seconds for this example isn’t too bad.

$x = Compare-SqlTableColumns -SqlInstance localhost -Table1 'dbo.DifferenceTable01' -Table2 'dbo.DifferenceTable02' -Verbose

$x | Format-Table

Thou hast less columns than thine brother…

Feel free to use and abuse this function to your heart’s content. I know that there are a few things that I’d add to it. Comparing across different instances being an obvious one that I’d like to put in.

Hopefully though, someone out there will find it helpful.

Here’s looking at you, Michal.

Finding Parameters that do not match Column Names

You think this will take me hours? Ha! Think again.

Words: 437

Time to read: ~ 2 minutes

Script Link:ย https://github.com/shaneis/RandomScripts/blob/master/WhereParameterNameDoesNotMatchColumnName.ps1


ARITHABORT-ing & Anti-ANSI_WARNINGS

I recently ran into a problem with the QUOTED_IDENTIFIER option in SQL Server, and it got me to thinking about these SET options.

I mean the fact that, on tables where there are filtered indexes or computed columns with indexes, QUOTED_IDENTIFIER is required to be on to create any other indexes is just not intuitive. But if you can’t create indexes because of it then I’d argue that it’s pretty damn important! I also found out that this problem is not just limited to QUOTED_IDENTIFIER but to ARITHABORT and ANSI_WARNINGS as well.

Just check out the Microsoft Docs and what it has to say about it:

SET ARITHABORT must be ON when you are creating or changing indexes on computed columns or indexed views. If SET ARITHABORT is OFF, CREATE, UPDATE, INSERT, and DELETE statements on tables with indexes on computed columns or indexed views will fail.

And for ANSI_WARNINGS it says:

SET ANSI_WARNINGS must be ON when you are creating or manipulating indexes on computed columns or indexed views. If SET ANSI_WARNINGS is OFF, CREATE, UPDATE, INSERT, and DELETE statements on tables with indexes on computed columns or indexed views will fail.
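
To see that in action, here’s a minimal sketch; the table and index names are made up, and the INSERT should fail with Msg 1934 complaining about the SET options:

-- A table with a computed column, and an index on that column
CREATE TABLE dbo.SetOptionsDemo
(
	id int NOT NULL,
	doubled AS id * 2 -- computed column
);
CREATE INDEX IX_SetOptionsDemo_doubled ON dbo.SetOptionsDemo (doubled);
GO

SET ARITHABORT OFF;
SET ANSI_WARNINGS OFF;
GO

-- Fails: Msg 1934, "INSERT failed because the following SET options have incorrect settings..."
INSERT INTO dbo.SetOptionsDemo (id) VALUES (1);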

It’s not just Indexes

So, like a dog when it sees a squirrel, when I found out about the problems with ARITHABORT and ANSI_WARNINGS, I got distracted and started checking out what else I could break with them. Reading through the docs (because I found that it does help, even if I have to force myself to do it sometimes), I found a little gem that I wanted to try and replicate. So here’s a reason why you should care about setting ARITHABORT and ANSI_WARNINGS on.

Default to on

At one stage or another if you’re working with SQL Server, you’ve probably encountered the dreaded “Divide By 0” error:

Msg 8134, Level 16, State 1, Line 4
Divide by zero error encountered.

If you want to check this out, then here’s the code below for our table:


USE Pantheon;

-- Create our test table...
CREATE TABLE dbo.ArithAborting (
    id tinyint NULL
);
GO

And our attempt at inserting that value into the table:

SET ARITHABORT ON;
GO
SET ANSI_WARNINGS ON;
GO
-- Check can we insert a "divide by 0"...
BEGIN TRY
INSERT INTO dbo.ArithAborting (id) SELECT 1/0;
END TRY
BEGIN CATCH
  PRINT 'NOPE!';
  THROW;
END CATCH;

And we get our good, old, dreaded friend:

DivideByZeroError
Terminate!

We check our ArithAborting table and nothing is there, like we expected!

SELECT *
FROM dbo.ArithAborting;

EmptyTable
I got nothing…

What about if we were to turn our ARITHABORT and ANSI_WARNINGS off, though; what happens then? Well, that’s a simple thing to test: we just turn them off and run the script again:

--Turn ARITHABORT off;
SET ARITHABORT OFF;
GO
SET ANSI_WARNINGS OFF;
GO
-- ...insert into our table...
BEGIN TRY
  INSERT INTO dbo.ArithAborting (id) SELECT 1/0;
END TRY
BEGIN CATCH
  PRINT 'NOPE!';
  THROW;
END CATCH;

DivideByZeroWarnings
Termin-wait…

Now before I freak out and start thinking that I’ve finally divided by zero, let’s check the table:

NULLTable
I got NULL-ing

What’s going on here? Checking the docs…

During expression evaluation when SET ARITHABORT is OFF, if an INSERT, DELETE or UPDATE statement encounters an arithmetic error, overflow, divide-by-zero, or a domain error, SQL Server inserts or updates a NULL value. If the target column is not nullable, the insert or update action fails and the user receives an error.

Do I like this?

Nope!

If I have a terminating error in my script, I quite like the fact that SQL Server is looking out for me and won’t let me put in bad data, but if you have these options turned off, even if you wrap your code in a TRY...CATCH block, it’s going to bypass it.

Plus if you are trying to divide by 0, please stop trying to break the universe. Thank you.

 

 

Comparing Column Values in the Same Table

The Set-Up:

This is yet another time that a blog post has come about from a question by a developer. They’re good guys, I guess, they keep me on my toes.

This time it was with change logging. We didn’t have Change Data Capture (CDC), or Temporal Tables enabled (have you seen the YouTube videos by Bert Wagner ( blog | twitter ) on these?). What we did have was “manual logging” and no, I’m not even talking about Triggers.

What we had was INSERT statements, directly after a MERGE statement, that inserted into a table variable a hard-coded name of the column, the old value, and the new value.

Is that what I would do? Doesn’t matter, it was there before I got there, seems to work, and is low down on the list of priorities to change.

The question was: every time they needed to add a column to a table and change-log it, they had to add multiple lines to the change tracking procedure, and the procedure was getting gross and hard to maintain.

Something to do with DRYness?

Create Table:

You know the drill by now, I quite like to play along so let us facilitate that (from now on I’m going to use Gist, formatting with native WordPress is starting to annoy me).


USE Pantheon;
GO
DROP TABLE IF EXISTS dbo.GotAnyChange;
CREATE TABLE dbo.GotAnyChange
(
GotAnyChangeID int IDENTITY(1, 1),
Column1 int,
Column2 char(1),
Column3 date,
Column4 bigint,
Column5 varchar(50),
Column6 datetime
);
INSERT INTO dbo.GotAnyChange
( Column1,
Column2,
Column3,
Column4,
Column5,
Column6
)
VALUES
(0, -- Column1 - int
'A', -- Column2 - char(1)
GETDATE(), -- Column3 - date
0, -- Column4 - bigint
REPLICATE('A', 50), -- Column5 - varchar(50)
GETDATE() -- Column6 - datetime
);
SELECT *
FROM dbo.GotAnyChange;

This will create our table and, luckily, all of its columns are important enough to warrant capturing when they get changed.

FirstCreation
Despite their looks, these values are “important”

Old Way WHERE old = way

Let’s take a look at the code that they were using, shall we?


USE Pantheon;
GO
DECLARE @Id int = 1
-- ChangeLog data store
-- Yes, I know that they're all varchars…
DECLARE @ChangeLogTemp table
(
ColumnName varchar(100),
OldValue varchar(20),
NewValue varchar(20)
);
-- Old & New value data store
DECLARE @OutputTemp table
(
OldColumn1 int NOT NULL,
NewColumn1 int NOT NULL,
OldColumn2 char(1) NOT NULL,
NewColumn2 char(1) NOT NULL,
OldColumn3 date NOT NULL,
NewColumn3 date NOT NULL,
OldColumn4 bigint NULL,
NewColumn4 bigint NULL,
OldColumn5 varchar(50) NOT NULL,
NewColumn5 varchar(50) NOT NULL,
OldColumn6 datetime NULL,
NewColumn6 datetime NULL
);
-- Actual update, outputting the old and new values
UPDATE gac
SET gac.Column1 = 1,
gac.Column2 = 'B',
gac.Column6 = GETDATE()
OUTPUT Deleted.Column1, Inserted.Column1,
Deleted.Column2, Inserted.Column2,
Deleted.Column3, Inserted.Column3,
Deleted.Column4, Inserted.Column4,
Deleted.Column5, Inserted.Column5,
Deleted.Column6, Inserted.Column6
INTO @OutputTemp
FROM dbo.GotAnyChange AS gac
WHERE gac.GotAnyChangeID = @Id;
-- Capture changes. 1 for each column we want to capture
INSERT INTO @ChangeLogTemp
SELECT 'Column1',
OldColumn1,
NewColumn1
FROM @OutputTemp
WHERE ISNULL(OldColumn1, NewColumn1) <> NewColumn1;
INSERT INTO @ChangeLogTemp
SELECT 'Column2',
OldColumn2,
NewColumn2
FROM @OutputTemp
WHERE ISNULL(OldColumn2, NewColumn2) <> NewColumn2;
INSERT INTO @ChangeLogTemp
SELECT 'Column3',
OldColumn3,
NewColumn3
FROM @OutputTemp
WHERE ISNULL(OldColumn3, NewColumn3) <> NewColumn3;
INSERT INTO @ChangeLogTemp
SELECT 'Column4',
OldColumn4,
NewColumn4
FROM @OutputTemp
WHERE ISNULL(OldColumn4, NewColumn4) <> NewColumn4;
INSERT INTO @ChangeLogTemp
SELECT 'Column5',
OldColumn5,
NewColumn5
FROM @OutputTemp
WHERE ISNULL(OldColumn5, NewColumn5) <> NewColumn5;
INSERT INTO @ChangeLogTemp
SELECT 'Column6',
OldColumn6,
NewColumn6
FROM @OutputTemp
WHERE ISNULL(OldColumn6, NewColumn6) <> NewColumn6;
-- Capture all changes as an XML
SELECT ColumnName AS CN,
OldValue AS OV,
NewValue AS NV
FROM @ChangeLogTemp
FOR XML PATH('Change'), ROOT('Changes');

And the results?

OldWayResult
XML anyone?

You can probably see the problem here.

Hey! It’s legacy code, let’s focus on just 1 problem at a time!

The main issue that I was asked about was every time a column was deemed important and needed to be added to the list, they had to insert another INSERT INTO @ChangeLogTemp... and they thought that it wasn’t sustainable in the long run.

Hmmm, it also comes across as very RBAR, doesn’t it? Every time we want to include another column in the change tracking, we have to add it row by agonizing row. The script is already big enough; if we keep adding more, it will get massive!

Set-based is the right way to go 90% of the time, but how do we do set-based solutions on the same table?

New JOIN Way ON new = way

The first thing I do is to change that table variable into a temp table. Stats, indexes (if necessary), and I can query the results as we go along. Much better!

ChangeToTempTable
Temp > Variable?

The second thing is that, whether by luck or by design, the legacy code has the same naming conventions for the columns; new column values have the prefix “New%” in the column name, and old columns have the “Old%” prefix.
This works for us because we can now split the columns into 2 derived tables, New and Old, and that way we have the differences.

PreUnPivotColumns
Potential problem here…

Have you ever tried to find the differences between two consecutive rows of data? It’s fiendishly difficult. WHERE Column1 on row1 != Column1 on row2 apparently just does not work, le sigh.

I’ve talked before about PIVOT but now I’m going to introduce you to its little brother, UNPIVOT, which “rotates columns of a table-valued expression into column values”.

I say “little brother” because the whole document talks about PIVOT, with only brief mentions of UNPIVOT in the notes.

If you’re writing documentation like this, please stop.

With UNPIVOT we can create a table of our rows around our ID and Column names…

UnpivotedColumns
Potential problem averted!

… and with this, we can join on our ID and Column names and get to our more intuitive WHERE OldValue != NewValue.
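
Condensed from the full script below, and trimmed to two columns, the UNPIVOT shape looks like this:

SELECT Unpvt.GotAnyChangeID,
	Unpvt.ColumnName,
	Unpvt.NewValue
FROM
(
	SELECT GotAnyChangeID,
		CAST(ISNULL(NewColumn1, '') AS nvarchar(20)) AS Column1,
		CAST(ISNULL(NewColumn2, '') AS nvarchar(20)) AS Column2
	FROM #OutputTableNew
) AS DataSource
UNPIVOT (NewValue FOR ColumnName IN (Column1, Column2)) AS Unpvt;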

Bringing it all together!


USE Pantheon;
GO
DECLARE @ChangeLogAsXml xml,
@Id int = 1;
DROP TABLE IF EXISTS #OutputTableNew;
CREATE TABLE #OutputTableNew
(
GotAnyChangeID int NOT NULL,
OldColumn1 int NOT NULL,
NewColumn1 int NOT NULL,
OldColumn2 char(1) NOT NULL,
NewColumn2 char(1) NOT NULL,
OldColumn3 date NOT NULL,
NewColumn3 date NOT NULL,
OldColumn4 bigint NULL,
NewColumn4 bigint NULL,
OldColumn5 varchar(50) NOT NULL,
NewColumn5 varchar(50) NOT NULL,
OldColumn6 datetime NULL,
NewColumn6 datetime NULL
);
UPDATE gac
SET gac.Column1 = 2,
gac.Column2 = 'C',
gac.Column6 = GETDATE()
OUTPUT inserted.GotAnyChangeID,
Deleted.Column1, Inserted.Column1,
Deleted.Column2, Inserted.Column2,
Deleted.Column3, Inserted.Column3,
Deleted.Column4, Inserted.Column4,
Deleted.Column5, Inserted.Column5,
Deleted.Column6, Inserted.Column6
INTO #OutputTableNew
FROM dbo.GotAnyChange AS gac
WHERE gac.GotAnyChangeID = @Id;
SELECT New.ColumnName AS CN,
New.NewValue AS NV,
Old.OldValue AS OV
FROM
(
SELECT Unpvt.GotAnyChangeID,
Unpvt.ColumnName,
Unpvt.NewValue
FROM
( SELECT [GotAnyChangeID],
CAST(ISNULL([NewColumn1], '') AS nvarchar(20)) AS [Column1],
CAST(ISNULL([NewColumn2], '') AS nvarchar(20)) AS [Column2],
CAST(ISNULL([NewColumn3], '') AS nvarchar(20)) AS [Column3],
CAST(ISNULL([NewColumn4], '') AS nvarchar(20)) AS [Column4],
CAST(ISNULL([NewColumn5], '') AS nvarchar(20)) AS [Column5],
CAST(ISNULL([NewColumn6], '') AS nvarchar(20)) AS [Column6]
FROM #OutputTableNew
) AS DataSource UNPIVOT(NewValue FOR ColumnName IN([Column1], [Column2], [Column3], [Column4], [Column5], [Column6])) AS Unpvt
) AS New
INNER JOIN
(
SELECT Unpvt.GotAnyChangeID,
Unpvt.ColumnName,
Unpvt.OldValue
FROM
( SELECT [GotAnyChangeID],
CAST(ISNULL([OldColumn1], '') AS nvarchar(20)) AS [Column1],
CAST(ISNULL([OldColumn2], '') AS nvarchar(20)) AS [Column2],
CAST(ISNULL([OldColumn3], '') AS nvarchar(20)) AS [Column3],
CAST(ISNULL([OldColumn4], '') AS nvarchar(20)) AS [Column4],
CAST(ISNULL([OldColumn5], '') AS nvarchar(20)) AS [Column5],
CAST(ISNULL([OldColumn6], '') AS nvarchar(20)) AS [Column6]
FROM #OutputTableNew
) AS DataSource UNPIVOT(OldValue FOR ColumnName IN([Column1], [Column2], [Column3], [Column4], [Column5], [Column6])) AS Unpvt
) AS Old
ON Old.ColumnName = New.ColumnName
AND Old.GotAnyChangeID = New.GotAnyChangeID
WHERE New.NewValue <> Old.OldValue
FOR XML PATH('Change'), ROOT('Changes');

And it works!

NewWayResult
wasn’t this replaced by JSON?

It’s not great though.

The whole thing was supposed to be to reduce the amount of changes required when they need to include or exclude columns. All in all though, it’s just 6 lines less. Not exactly the great return that you’d expect.
Yeah, true: with the old way, for every column we want to add, we have to add an extra 6 lines, while the new way adds 2.

That means for 1,024 columns:

  • The old way could have at least 6,144 lines per table. (1024 * 6)
  • The new way could have at least 2,048 lines per table (not explaining this calculation >:( )

So, is there anything else that we can do?

Dynamically?

I’ve talked before about T-SQL automation with Dynamic SQL and this should be a good candidate for that.

What can we make dynamic here though? How about…

  1. The new and old columns bit?
  2. The FOR ColumnName IN([Column1], [Column2], [Column3], [Column4], [Column5], [Column6]) bit?
  3. The CAST(ISNULL([Old/NewColumn], '') AS nvarchar bit?

Explain it to me.

  1. The new and old columns.

Well, temp tables exist in the tempdb database; they just get a suffix of a lot of underscores and a hex value.

So to get our column names, we can just query the sys.tables and sys.columns catalog views in [tempdb] and we should have what we need.

DynamicColumnsResults
We can add a filter clause too
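
The query behind that is short; the last AND is the filter clause in question:

SELECT tc.name,
	tc.column_id
FROM tempdb.sys.tables AS tt
INNER JOIN tempdb.sys.columns AS tc
	ON tt.object_id = tc.object_id
WHERE tt.name LIKE '#OutputTableDynamic%'
	AND tc.name LIKE 'New%'; -- the filter clause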

2. The FOR ColumnName IN (

I’ve talked before about concatenating values so we can use that to generate this part of the script.

DynamicUnpivotColumnNames
LEN(tc.name) - 3 to remove the “old”/”new” prefix
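
And the lifted-out shape of that concatenation, as the full script below uses it, with FOR XML PATH('') doing the joining and STUFF trimming the leading comma:

SELECT STUFF(
(
	SELECT ', ' + QUOTENAME(RIGHT(tc.name, LEN(tc.name) - 3)) -- strip the "New" prefix
	FROM tempdb.sys.tables AS tt
	INNER JOIN tempdb.sys.columns AS tc
		ON tt.object_id = tc.object_id
	WHERE tt.name LIKE '#OutputTableDynamic%'
		AND tc.name LIKE 'New%'
	ORDER BY tc.column_id
	FOR XML PATH('')
), 1, 2, '');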

3. The CAST(ISNULL(...

This is basically the same as the above. Don’t be put off by needing to add CAST(ISNULL( before the column names, it’s not as complex as you’d think.

DynamicNewColumnsSelect
STUFF just doesn’t look as pretty… 🙁

Now that we have our dynamic bits, let’s create the full statements.

Full Dynamic Script


USE Pantheon;
GO
DROP TABLE IF EXISTS #OutputTableDynamic;
GO
CREATE TABLE #OutputTableDynamic
(
GotAnyChangeID int NOT NULL,
OldColumn1 int NOT NULL,
NewColumn1 int NOT NULL,
OldColumn2 char(1) NOT NULL,
NewColumn2 char(1) NOT NULL,
OldColumn3 date NOT NULL,
NewColumn3 date NOT NULL,
OldColumn4 bigint NULL,
NewColumn4 bigint NULL,
OldColumn5 varchar(50) NOT NULL,
NewColumn5 varchar(50) NOT NULL,
OldColumn6 datetime NULL,
NewColumn6 datetime NULL
);
DECLARE @ChangeLogAsXml xml,
@Id int = 1;
UPDATE gac
SET gac.Column1 = 3,
gac.Column2 = 'D',
gac.Column6 = GETDATE()
OUTPUT inserted.GotAnyChangeID,
Deleted.Column1, Inserted.Column1,
Deleted.Column2, Inserted.Column2,
Deleted.Column3, Inserted.Column3,
Deleted.Column4, Inserted.Column4,
Deleted.Column5, Inserted.Column5,
Deleted.Column6, Inserted.Column6
INTO #OutputTableDynamic
FROM dbo.GotAnyChange AS gac
WHERE gac.GotAnyChangeID = @Id;
DECLARE @DynamicSql nvarchar(MAX),
@NewColumns nvarchar(MAX),
@OldColumns nvarchar(MAX),
@columns nvarchar(MAX);
SET @NewColumns =
(
SELECT STUFF(
(
SELECT TOP (1024)
', CAST(ISNULL(' + QUOTENAME(RIGHT(tc.name, LEN(tc.name))) + ', '''') AS nvarchar(20)) AS '
+ QUOTENAME(RIGHT(tc.name, LEN(tc.name) - 3))
FROM tempdb.sys.tables AS tt
INNER JOIN tempdb.sys.columns AS tc
ON tt.object_id = tc.object_id
WHERE tt.name LIKE '#OutputTableDynamic%'
AND tc.name LIKE 'New%'
ORDER BY tc.column_id
FOR XML PATH('')
), 1, 2, ''));
SET @OldColumns =
(
SELECT STUFF(
(
SELECT TOP (1024)
', CAST(ISNULL(' + QUOTENAME(RIGHT(tc.name, LEN(tc.name))) + ', '''') AS nvarchar(20)) AS '
+ QUOTENAME(RIGHT(tc.name, LEN(tc.name) - 3))
FROM tempdb.sys.tables AS tt
INNER JOIN tempdb.sys.columns AS tc
ON tt.object_id = tc.object_id
WHERE tt.name LIKE '#OutputTableDynamic%'
AND tc.name LIKE 'Old%'
ORDER BY tc.column_id
FOR XML PATH('')
), 1, 2, '' ) );
SET @columns =
(SELECT STUFF(
( SELECT TOP (1024)
', ' + QUOTENAME(RIGHT(tc.name, LEN(tc.name) - 3))
FROM tempdb.sys.tables AS tt
INNER JOIN tempdb.sys.columns AS tc
ON tt.object_id = tc.object_id
WHERE tt.name LIKE '#OutputTableDynamic%'
AND tc.name LIKE 'New%'
ORDER BY tc.column_id
FOR XML PATH('')
), 1, 2, ''));
SET @DynamicSql =
N'SELECT New.ColumnName AS CN, New.NewValue AS NV, Old.OldValue AS OV FROM (SELECT Unpvt.GotAnyChangeId, Unpvt.ColumnName, Unpvt.NewValue FROM (SELECT [GotAnyChangeId], ' + @NewColumns
+ N' FROM #OutputTableDynamic ) AS DataSource UNPIVOT (NewValue FOR ColumnName IN (' + @columns
+ N') ) AS Unpvt ) AS New INNER JOIN ( SELECT Unpvt.GotAnyChangeId, Unpvt.ColumnName, Unpvt.OldValue FROM (SELECT [GotAnyChangeId], ' + @OldColumns
+ N' FROM #OutputTableDynamic ) AS DataSource UNPIVOT (OldValue FOR ColumnName IN (' + @columns
+ N')) AS Unpvt) AS Old ON Old.ColumnName = New.ColumnName AND Old.GotAnyChangeId = New.GotAnyChangeId WHERE New.NewValue != Old.OldValue FOR XML PATH(''Change''), ROOT(''Changes'')';
EXECUTE sp_executesql @DynamicSql;

Results are good!

DynamicWayResult
We’ve seen this before

Overall, the script is longer, at nearly double the lines, but where it shines is when adding new columns.
To include new columns, just add them to the table; to exclude them, just add in a filter clause.

So, potentially, if every column in this table is to be tracked and we add columns all the way up to 1,024 columns, this code will not increase.
Old way: at least 6,144.
New way: at least 2,048.
Dynamic: no change

Summary:

Like the script, this was a massive post. Back at the start, I said that a developer came to me because they wanted to get more DRY (?) and stop needing to add more content to the stored procedure.

Do you think the developer used this?

Nope!

I can’t say that I blame them; it’s slightly ugly and unwieldy, and I wrote it, so I should love it.
Yet if something were to go wrong and the need was there to open the procedure and troubleshoot it, the first person to open this up is going to let out a groan of despair!

So this request turned into a proof of concept and nothing more. No skin off my back, I have a growing list of tasks to accomplish by 5 minutes ago. Better get back to them.