As part of my post for the latest T-SQL Tuesday, I talked about community scripts. One of these scripts was by Jana Sattainathan ( blog | twitter ) and it was to do with finding tables with similar table structure. This is a great script for initially finding the tables and something that I am not sure that I would have thought of!
However, do me a favour and in your favourite search engine, do a search for “stackoverflow sql server wide tables” (Google search included for your benefit).
For some reason, people have a slight fascination with wwwwwiiiiiiiiddddeeeee tables!
So thanks to the excellent work done by Jana, you have now identified the tables with similar structure, but what about if we want to know which column names match exactly?
I recently had this problem with consolidating copies of tables of medium to wide length in a database and needed to know. So I created a little snippet of code to help me out and I present it to you, in all of it’s unashamed vulnerability, in the hope that it can help you out.
Be warned: Thar be PIVOTs ahead!
Ahoy, matey!
So we’ve got two tables that are slightly different, not completely different (hey, they could be) but different enough to be similar without matching… (differently similar?)
Our task is to find out which columns are in both tables, which columns are only in 1 table, and which columns are only in the other.
Now this isn’t so bad manually, especially if you only need to do this once, maybe twice. What happens though if you need to do it with multiple pairs of tables? Or multiple pairs of wide tables like our search engines showed us above?
So let us do what all DBAs should do when they have a repetitive, manual task stopping them from doing more fun important things: Automate it away!
Avast Ye!
Our two tables are as follows:
CREATE TABLE dbo.DifferenceTable01 ( col1 int, col2 int, col4 int, col6 int ); GO CREATE TABLE dbo.DifferenceTable02 ( col2 int, col3 int, col4 int, col5 int ); GO
Now we can use the sys.columns
table to check out the different columns in the table but the results are, well, meh
SELECT OBJECT_NAME([object_id]) AS TableName, [name] AS ColumnName FROM sys.columns WHERE [object_id] IN ( OBJECT_ID(N'dbo.DifferenceTable01', N'U'), OBJECT_ID(N'dbo.DifferenceTable02', N'U') ); GO

That is not really going to work out for us…
So I’m not liking the look of this, and going through the results, it seems to me that these results are just not useful. This isn’t the computers fault – it’s done exactly what I’ve told it to do – but a more useful result would be a list of columns and then either a simple ‘Yes’, or a ‘No’.
There’s syntax for this…PIVOT
…
Thar She Blows!
As anyone who has seen me dance can attest to, I can neither shake, rattle, nor roll. And I definitely do not normally PIVOT
. However, as I’m trying to know my tools, I do know that this is the syntax that I need.
PIVOT rotates a table-valued expression by turning the unique values from one column in the expression into multiple columns in the output, and performs aggregations where they are required on any remaining column values that are wanted in the final output.
So after looking up the syntax for this once (ok, 5 times!) I managed to come out with a script that I’m reasonably happy with.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
SELECT Pivot1.ColumnName, | |
Pivot1.[dbo.DifferenceTable01], | |
Pivot1.[dbo.DifferenceTable02], | |
CASE WHEN [dbo.DifferenceTable01] = 1 AND [dbo.DifferenceTable02] = 1 | |
THEN 'Both' | |
WHEN [dbo.DifferenceTable01] = 1 AND [dbo.DifferenceTable02] IS NULL | |
THEN 'Table 1 only' | |
WHEN [dbo.DifferenceTable01] IS NULL AND [dbo.DifferenceTable02] = 1 | |
THEN 'Table 2 only' | |
ELSE 'Eh…this should not really happen' | |
END AS HumanReadableFormat | |
FROM ( SELECT | |
c.[name] AS ColumnName, | |
tb.TableName, | |
1 AS ColumnExists | |
FROM sys.columns AS c | |
RIGHT JOIN ( VALUES | |
(OBJECT_ID(N'dbo.DifferenceTable01', N'U'), 'dbo.DifferenceTable01'), | |
(OBJECT_ID(N'dbo.DifferenceTable02', N'U'), 'dbo.DifferenceTable02') | |
) AS tb (ObjectID, TableName) | |
ON c.object_id = tb.ObjectID | |
) AS UnPivotedColumns | |
PIVOT ( | |
MAX(ColumnExists) FOR TableName IN ([dbo.DifferenceTable01], [dbo.DifferenceTable02]) | |
) AS Pivot1 | |
ORDER BY Pivot1.ColumnName ASC; | |
GO |
And the results are a lot easier to read 🙂

So much better! This way, no matter how long the tables, I can easily figure out what columns are in what table(s) based on their names.
Shiver Me Timbers!
Isn’t it always the way after you’ve done something, you realise a much easier way to do it?
The Old Seadog!
I’ve talked about automation in this post and I have yet to mention PowerShell. I bow my head in shame.
Especially when it has a native command like Compare-Object
.
help Compare-Object -ShowWindow
Synopsis
Compares two sets of objects.Description
The Compare-Object cmdlet compares two sets of objects. One set of objects is the “reference set,” and the other set is the “difference set.”The result of the comparison indicates whether a property value appeared only in the object from the reference set (indicated by the <= symbol), only in the object from the difference set (indicated by the => symbol) or, if the IncludeEqual parameter is specified, in both objects (indicated by the == symbol).
If the reference set or the difference set is null ($null), this cmdlet generates a terminating error.
So the question we have to ask ourselves now is “Can we do what we did with the PIVOTs easier?”
The Old Salt.
I’ve recently found out about splatting so, of course, I’ve started to use it EVERYWHERE!
Let’s “splat” our two parameters
$Table01Param = @{ ServerInstance = 'localhost\SQLDEV2K14' Database = 'master' Query = "SELECT OBJECT_NAME([object_id]) AS TableName, [name] AS ColumnName FROM sys.columns WHERE [object_id] = OBJECT_ID(N'dbo.DifferenceTable01', N'U');" } $Table02Param = @{ ServerInstance = 'localhost\SQLDEV2K14' Database = 'master' Query = "SELECT OBJECT_NAME([object_id]) AS TableName, [name] AS ColumnName FROM sys.columns WHERE [object_id] = OBJECT_ID(N'dbo.DifferenceTable02', N'U');" }
And we now save ourselves the trouble of writing the parameters to the functions.
Invoke-Sqlcmd @Table01Param Invoke-Sqlcmd @Table02Param

Since everything is now set up, we can just pass those results into 2 different variable holders and use our Compare-Object
.
$Table01 = Invoke-Sqlcmd @Table01Param $Table02 = Invoke-Sqlcmd @Table02Param Compare-Object -ReferenceObject $Table01 -DifferenceObject $Table02 -Property ColumnName -IncludeEqual

And for anyone saying “Yeah, but who knows what ‘SideIndicator’ means!”, I advise you to read the help with PowerShell. It helps a-lot!
The result of the comparison indicates whether a property value appeared only in the object from the reference set (indicated by the <= symbol), only in the object from the difference set (indicated by the => symbol) or, if the IncludeEqual parameter is specified, in both objects (indicated by the == symbol).
If you are still complaining about it – “You can’t have the HumanReadableFormat column like you did in T-SQL” – then please stop. There are experts out there who can make you go “wow” with what they do. I’m still learning but even I have an answer to that.
Compare-Object -ReferenceObject $Table01 -DifferenceObject $Table02 -Property ColumnName -IncludeEqual | Select-Object -Property *,@{N='HRF';e={switch ($_.SideIndicator) {'==' {'Both'} '=>' {'Table 2 only'} '<=' {'Table 1 only'} } }}

Land-ho!
I’m not trying to argue with who would win between T-SQL and PowerShell, that was definitely not my intention with this post. T-SQL is my first language love, PowerShell is very quickly becoming my second (and that’s not just because it’s the only other langauge I know).
They both accompolish the same thing!
It’s just a matter of preference, that’s all, not a matter of competition. You want to work in SSMS, PIVOT
it up! You like PowerShell, hammer that Compare-Object
nail!
Whatever works for you.
My first idea for this was T-SQL but it turns out for me that PowerShell is the easier option. Don’t rule out one, just because the other was the first thing to pop into your head.
Now I’m going to go before I make a joke about Pirates, SQL and the R language…
Using built-in views seems more straightforward i.e
declare @table1 nvarchar(50)=’dbo.DifferenceTable01′, @table2 nvarchar(50)=’dbo.DifferenceTable02′
select * from
(
select Column_Name, ‘Yes’ as Table1, ‘No’ as table2 from
(
select column_name from INFORMATION_SCHEMA.columns where table_name = @table1
except
select column_name from INFORMATION_SCHEMA.columns where table_name = @table2
) tbl1
union all
select Column_Name, ‘No’ as Table1, ‘Yes’ as table2 from
(
select column_name from INFORMATION_SCHEMA.columns where table_name = @table2
except
select column_name from INFORMATION_SCHEMA.columns where table_name = @table1
) tbl2
union all
select Column_Name, ‘Yes’ as Table1, ‘Yes’ as table2 from
(
select column_name from INFORMATION_SCHEMA.columns where table_name = @table1
intersect
select column_name from INFORMATION_SCHEMA.columns where table_name = @table2
) tbl12
) tblall order by COLUMN_NAME
Absolutely! And I’m a fan of scripts that use
INTERSECT
andEXCEPT
.Couple of things though.
1. I’m not the biggest fan of using the
INFORMATION_SCHEMA
views. That’s a personal preference thing.2. I think you accidentally left out the
PARSENAME
bit of your script.Otherwise you won’t get any results back. (and yes, I should probably throw in the schema names as well but it’s a comment!)
I haven’t done this yet but I’d be interested to see the results of
STATISTICS IO, TIME
on our queries, as well as the execution plans. Especially if we created a very wide table! Your query may kick my one’s metaphorical backside then – although your execution plan will probably have a concatenation so it could be larger 😉I’d count this as blog post material…feel like writing one up and letting me know when it’s available? 🙂
Hi,
Thanks for your sharing. What if I also want to compare case sensitively columns and the order of them (syncwindows). How can I presented it on powershell.
I mean that in the final table I want to show also something like: column_a, column_A –> case sensitive
AND
column_a, column_a –> different order in the table
Thanks in advance