Time to read: ~ 2 minutes
You think this will take me hours? Ha! Think again.
I recently ran into a problem with the QUOTED_IDENTIFIER option in SQL Server, and it got me thinking about these SET options.
I mean, the fact that, on tables with filtered indexes or indexed computed columns, QUOTED_IDENTIFIER must be ON before you can create any other indexes is just not intuitive. But if you can't create indexes because of it, then I'd argue that it's pretty damn important! I also found out that this problem is not limited to QUOTED_IDENTIFIER but applies to ANSI_WARNINGS as well.
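Here's a quick repro sketch of that filtered-index gotcha; the table and index names are made up, but this is the behaviour the docs describe (on the instances I've tried it fails with Msg 1934):

-- The table and index are made up purely for this demo.
CREATE TABLE dbo.QuotedTest (id int NULL);
GO

SET QUOTED_IDENTIFIER OFF;
GO

-- This one fails (Msg 1934 on the instances I've tried),
-- because QUOTED_IDENTIFIER is OFF for the session.
CREATE NONCLUSTERED INDEX IX_QuotedTest_Filtered
    ON dbo.QuotedTest (id)
    WHERE id IS NOT NULL;
GO

SET QUOTED_IDENTIFIER ON;
GO

DROP TABLE dbo.QuotedTest;
GO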
Just check out what the Microsoft Docs have to say. For ARITHABORT:
SET ARITHABORT must be ON when you are creating or changing indexes on computed columns or indexed views. If SET ARITHABORT is OFF, CREATE, UPDATE, INSERT, and DELETE statements on tables with indexes on computed columns or indexed views will fail.
And for ANSI_WARNINGS it says:
SET ANSI_WARNINGS must be ON when you are creating or manipulating indexes on computed columns or indexed views. If SET ANSI_WARNINGS is OFF, CREATE, UPDATE, INSERT, and DELETE statements on tables with indexes on computed columns or indexed views will fail.
So, like a dog when it sees a squirrel, when I found out about the problems with ANSI_WARNINGS, I got distracted and started checking out what else I could break with it. Reading through the docs (because I've found that it does help, even if I have to force myself to do it sometimes), I found a little gem that I wanted to try and replicate. So here's a reason why you should care about these settings.
At one stage or another if you’re working with SQL Server, you’ve probably encountered the dreaded “Divide By 0” error:
Msg 8134, Level 16, State 1, Line 4
Divide by zero error encountered.
If you want to check this out, then here’s the code below for our table:
USE Pantheon;

-- Create our test table...
CREATE TABLE dbo.ArithAborting
(
    id tinyint NULL
);
GO
And our attempt at inserting that value into the table:
SET ARITHABORT ON;
GO
SET ANSI_WARNINGS ON;
GO

-- Check can we insert a "divide by 0"...
BEGIN TRY
    INSERT INTO dbo.ArithAborting (id)
    SELECT 1/0;
END TRY
BEGIN CATCH
    PRINT 'NOPE!';
    THROW;
END CATCH;
And we get our good, old, dreaded friend:
We check our ArithAborting table and nothing is there, just as we expected!
SELECT * FROM dbo.ArithAborting;
What if we were to turn ANSI_WARNINGS off though, what happens then? Well, that's a simple thing to test: we just turn the options off and run the script again:
-- Turn ARITHABORT off...
SET ARITHABORT OFF;
GO
SET ANSI_WARNINGS OFF;
GO

-- ...insert into our table...
BEGIN TRY
    INSERT INTO dbo.ArithAborting (id)
    SELECT 1/0;
END TRY
BEGIN CATCH
    PRINT 'NOPE!';
    THROW;
END CATCH;
Now before I freak out and start thinking that I’ve finally divided by zero, let’s check the table:
What’s going on here? Checking the docs…
During expression evaluation when SET ARITHABORT is OFF, if an INSERT, DELETE or UPDATE statement encounters an arithmetic error, overflow, divide-by-zero, or a domain error, SQL Server inserts or updates a NULL value. If the target column is not nullable, the insert or update action fails and the user receives an error.
If I have a terminating error in my script, I quite like the fact that SQL Server is looking out for me and won't let me put in bad data, but if you have these options turned off, even if you wrap your code in a TRY...CATCH block, it's going to bypass it.
Plus if you are trying to divide by 0, please stop trying to break the universe. Thank you.
This is yet another time that a blog post has come about from a question by a developer. They’re good guys, I guess, they keep me on my toes.
This time it was with change logging. We didn’t have Change Data Capture (CDC), or Temporal Tables enabled (have you seen the YouTube videos by Bert Wagner ( blog | twitter ) on these?). What we did have was “manual logging” and no, I’m not even talking about Triggers.
What we had was a series of INSERT statements, directly after a MERGE statement, each inserting into a table variable a hard-coded column name, the old value, and the new value.
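Here's a minimal sketch of that pattern; every name in it (the output table, the table variable, the columns) is hypothetical, purely to show the shape of the legacy code:

-- Every name here is hypothetical; this only shows the shape of the legacy code.
CREATE TABLE #MergeOutput
(
    ID int,
    OldColumn1 nvarchar(100), NewColumn1 nvarchar(100),
    OldColumn2 nvarchar(100), NewColumn2 nvarchar(100)
);

DECLARE @ChangeLogTemp table
(
    ID int,
    ColumnName sysname,
    OldValue nvarchar(100),
    NewValue nvarchar(100)
);

-- One hard-coded INSERT per tracked column...
INSERT INTO @ChangeLogTemp (ID, ColumnName, OldValue, NewValue)
SELECT m.ID, N'Column1', m.OldColumn1, m.NewColumn1
FROM #MergeOutput AS m
WHERE ISNULL(m.OldColumn1, N'') <> ISNULL(m.NewColumn1, N'');

-- ...and another one...
INSERT INTO @ChangeLogTemp (ID, ColumnName, OldValue, NewValue)
SELECT m.ID, N'Column2', m.OldColumn2, m.NewColumn2
FROM #MergeOutput AS m
WHERE ISNULL(m.OldColumn2, N'') <> ISNULL(m.NewColumn2, N'');
-- ...and so on, for every column anyone ever deems important.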
Is that what I would do? Doesn’t matter, it was there before I got there, seems to work, and is low down on the list of priorities to change.
The problem was that every time they needed to add a column to a table, and change log it, they had to add multiple lines to the change tracking procedure, and the procedure was getting gross and hard to maintain.
Something to do with DRYness?
You know the drill by now, I quite like to play along so let us facilitate that (from now on I’m going to use Gist, formatting with native WordPress is starting to annoy me).
This will create our table and, luckily, all of its columns are important enough to warrant capturing when they get changed.
Let’s take a look at the code that they were using, shall we?
And the results?
You can probably see the problem here.
Hey! It's legacy code, let's focus on just 1 problem at a time!
The main issue that I was asked about was that every time a column was deemed important and needed to be added to the list, they had to insert another INSERT INTO @ChangeLogTemp..., and they thought that it wasn't sustainable in the long run.
Hmmm, it also comes across as very RBAR, doesn't it? Every time we want to include another column in the change tracking, we have to add it row by agonizing row. The script is already big enough; if we keep adding more, it will get massive!
Set-based is the right way to go 90% of the time, but how do we do a set-based solution against the same table?
The first thing I do is to change that table variable into a temp table. Stats, indexes (if necessary), and I can query the results as we go along. Much better!
The second thing is that, whether by luck or by design, the legacy code has consistent naming conventions for the columns: new column values have the prefix "New" in the column name and old column values have the "Old" prefix.
This works for us because we can now split the new columns into 2 derived tables, New and Old, and that way we have the differences.
Have you ever tried to find the differences between two consecutive rows of data? It’s fiendishly difficult.
WHERE Column1 on row1 != Column1 on row2 apparently just does not work, le sigh.
I say “little brother” because the whole document talks about PIVOT, with only brief mentions of UNPIVOT in the notes.
If you’re writing documentation like this, please stop.
With UNPIVOT we can create a table of our rows around our ID and Column names…
… and with this, we can join on our ID and Column names and get to our more intuitive
WHERE OldValue != NewValue.
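Here's a small self-contained sketch of the idea; the table and column names are hypothetical stand-ins for the real script:

-- Hypothetical stand-in for the real change-logging temp table.
CREATE TABLE #ChangeLog
(
    ID int,
    OldColumn1 nvarchar(100), NewColumn1 nvarchar(100),
    OldColumn2 nvarchar(100), NewColumn2 nvarchar(100)
);

INSERT INTO #ChangeLog
VALUES (1, N'A', N'A', N'B', N'C');

-- UNPIVOT the Old% and New% columns into rows, then join on ID and column name.
SELECT Oldies.ID,
       Oldies.ColumnName,
       Oldies.OldValue,
       Newbies.NewValue
FROM
(
    SELECT up.ID,
           REPLACE(up.ColumnName, N'Old', N'') AS ColumnName,
           up.OldValue
    FROM #ChangeLog
    UNPIVOT (OldValue FOR ColumnName IN (OldColumn1, OldColumn2)) AS up
) AS Oldies
JOIN
(
    SELECT up.ID,
           REPLACE(up.ColumnName, N'New', N'') AS ColumnName,
           up.NewValue
    FROM #ChangeLog
    UNPIVOT (NewValue FOR ColumnName IN (NewColumn1, NewColumn2)) AS up
) AS Newbies
    ON  Oldies.ID = Newbies.ID
    AND Oldies.ColumnName = Newbies.ColumnName
WHERE Oldies.OldValue <> Newbies.NewValue;

DROP TABLE #ChangeLog;

One caveat worth noting: UNPIVOT quietly drops NULLs, which is why the real script wraps everything in ISNULL, as you'll see below.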
Bringing it all together!
And it works!
It’s not great though.
The whole point was to reduce the amount of changes required when they need to include or exclude columns. All in all though, it's just 6 lines less. Not exactly the great return that you'd expect.
Yeah, true: with the old way, for every column we want to add, we have to add an extra 6 lines, while the new way adds 2.
That means, for 1,024 columns: at least 6,144 extra lines the old way versus at least 2,048 the new way.
So, is there anything else that we can do?
I’ve talked before about T-SQL automation with Dynamic SQL and this should be a good candidate for that.
What can we make dynamic here though? How about…
this FOR ColumnName IN ([Column1], [Column2], [Column3], [Column4], [Column5], [Column6]) bit?
…or this CAST(ISNULL([Old/NewColumn], '') AS nvarchar… bit?
Well, temp tables exist in the tempdb database; they just get a suffix of a lot of underscores and a hex value. So to get our column names, we can just query the sys.columns catalog view in [tempdb] and we should have what we need.
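For example, something along these lines (the temp table name is just for illustration):

-- Make a throwaway temp table (name is just for illustration)...
CREATE TABLE #ChangeLogTemp (ID int, OldColumn1 int, NewColumn1 int);

-- ...and look up its columns in tempdb, where temp tables actually live.
SELECT c.name,
       c.column_id
FROM tempdb.sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'tempdb..#ChangeLogTemp');

DROP TABLE #ChangeLogTemp;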
FOR ColumnName IN (…)
I’ve talked before about concatenating values so we can use that to generate this part of the script.
This is basically the same as the above. Don't be put off by needing to add CAST(ISNULL( before the column names; it's not as complex as you'd think.
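Here's a rough sketch of that generation step, using the FOR XML PATH concatenation trick against tempdb's catalog view (the temp table name and the "New%" filter are assumptions carried over from the sketches above):

CREATE TABLE #ChangeLogTemp (ID int, OldColumn1 int, NewColumn1 int);

DECLARE @SelectList nvarchar(max);

-- Build "CAST(ISNULL([column], '') AS nvarchar(100))" for every New% column.
SELECT @SelectList = STUFF(
(
    SELECT N', CAST(ISNULL(' + QUOTENAME(c.name) + N', '''') AS nvarchar(100))'
    FROM tempdb.sys.columns AS c
    WHERE c.object_id = OBJECT_ID(N'tempdb..#ChangeLogTemp')
      AND c.name LIKE N'New%'
    ORDER BY c.column_id
    FOR XML PATH(''), TYPE
).value('.', 'nvarchar(max)'), 1, 2, N'');

PRINT @SelectList;

DROP TABLE #ChangeLogTemp;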
Now that we have our dynamic bits, let’s create the full statements.
Results are good!
Overall, the script is longer at nearly double the lines but where it shines is when adding new columns.
To include new columns, just add them to the table; to exclude them, just add in a filter clause.
So, potentially, if every column in this table is to be tracked and we add columns all the way up to 1,024 columns, this code will not increase.
Old way: at least 6,144.
New way: at least 2,048.
Dynamic: no change
Like the script, this was a massive post. Back at the start, I said that a developer came to me because they wanted to get more DRY (?) and stop needing to add more content to the stored procedure.
Do you think the developer used this?
I can’t say that I blame them, it’s slightly ugly and unwieldy, and I wrote it so I should love it.
Yet if something was to go wrong and the need was there to open the procedure and troubleshoot it, the first person to open this up is going to let out a groan of despair!
So this request turned into a proof of concept and nothing more. No skin off my back, I have a growing list of tasks to accomplish by 5 minutes ago. Better get back to them.
If this was a horror movie, it would be called “The Differencing”…duh duh duh!
The original post for this topic garnered the attention of a commenter who pointed out that the same result could be gathered using a couple of UNION ALLs and those lovely set-based operators.
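I don't have the commenter's exact script, but the shape of the idea is something like this sketch (hypothetical table again), with INTERSECT doing the NULL-safe comparisons:

-- Hypothetical shape of the "UNION method": one branch per column pair,
-- with INTERSECT used for a NULL-safe "is it different?" check.
CREATE TABLE #ChangeLog
(
    ID int,
    OldColumn1 nvarchar(100), NewColumn1 nvarchar(100),
    OldColumn2 nvarchar(100), NewColumn2 nvarchar(100)
);

INSERT INTO #ChangeLog
VALUES (1, N'A', N'A', N'B', N'C');

SELECT ID, N'Column1' AS ColumnName, OldColumn1 AS OldValue, NewColumn1 AS NewValue
FROM #ChangeLog
WHERE NOT EXISTS (SELECT OldColumn1 INTERSECT SELECT NewColumn1)
UNION ALL
SELECT ID, N'Column2', OldColumn2, NewColumn2
FROM #ChangeLog
WHERE NOT EXISTS (SELECT OldColumn2 INTERSECT SELECT NewColumn2);

DROP TABLE #ChangeLog;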
I personally think that both options work and whatever you feel comfortable with, use that.
It did play on my mind though of what the performance differences would be…what would the difference in
STATISTICS IO, TIME be? What would the difference in Execution Plans be? Would there even be any difference between the two or are they the same thing? How come it’s always the things I tell myself not to forget that I end up forgetting?
I have no idea about the last one, but at least the other things we can check. I did mention to the commenter that I would find this an interesting blog topic if they wanted to give it a go and get back to me. All I can say is: sorry, your mail must have got lost in transit. I'm sure it is a better blog post than mine anyway.
For this test, we're not going to stop at a measly 4 columns per table. Oh no! For this one we're going to go as wide as we can.
With a recent post by Kenneth Fisher ( blog | twitter ) out about T-SQL FizzBuzz, I'm going to create two tables, both of which will have incrementing column names, i.e. col0001, col0002, …, col1024. Table1 will have all columns divisible by 3 removed, while Table2 will have all columns divisible by 5 removed.
See, FizzBuzz can be useful!
So our table creation scripts…
SELECT TOP (1024)
    CASE
        WHEN v.number = 0
            -- Change this to 02 the second run through
            THEN N'CREATE TABLE dbo.TableColumnDifference01 ('
        ELSE N' col' + RIGHT(REPLICATE('0', 8) + CAST(v.number AS nvarchar(5)), 4) + N' int,'
    END
FROM master.dbo.spt_values AS v
WHERE v.type = N'P'
AND (
        -- Change this to '% 5' the second run through
        v.number % 3 != 0
     OR v.number = 0
    )
FOR XML PATH('');
NOTE: When you copy and paste the results of this query into a new window to run it, it is going to fail. Why? Well, the end of the script is going to be along the lines of colN int, and it needs to be colN int). Why is it like this? Well, it was taking too damn long to script that out. Feel free to change this to work for you. Hey, if you do, let me know!
Now, how I'm going to test this is to run each method (PIVOT, UNION, PowerShell) 3 times, then measure the third run of each. This is mainly because I want to get rid of any "cold cache" issues with SQL Server, where the plan has to be compiled or the data brought into memory.
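For the T-SQL methods, the measuring itself is nothing fancier than the session statistics wrapped around that third run; something like:

-- Wrap the run we actually measure (the third one) in session statistics.
SET STATISTICS IO, TIME ON;

-- ...third execution of the method under test goes here...

SET STATISTICS IO, TIME OFF;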
So first up is the Pivot method from the last blog post. In case you’re playing along at home (and go on, do! Why should kids get all the fun) here is the code that I’m running.
And here is our results:
What we are really after though is the stats, execution plan, and time to complete for our 3rd execution. Now, as much as I love reading the messages tab for the stats information, I feel that with blog posts aesthetics is king, so I'm going to be using "Statistics Parser", the free tool by Richie Rump ( twitter ).
Secondly, we have what I dubbed "the UNION method" (no points for figuring out why), and the only change I've made to this script is to add in PARSENAME(), and that's only so that the script would…you know…work.
Results be like:
hmm…fewer Scan Counts but 5 times the reads…also 5 times slower than the PIVOT method. Maybe the execution plan will be prettier?
Yeah…so…that's…that's different from the first plan! I was right in my comment though: there is a concatenation operator (there are actually 2; you may need to zoom in to find them though).
Finally we have the PowerShell method. No messing about here, let’s get straight to it! I’m going to lump all the code together in one gist and I’ll be wrapping it in
Measure-Command to get the speed of the command.
Would you believe that I couldn’t figure out how to get an execution plan for PowerShell 🙂
If anybody knows, hit me up!
You know, at the start of this I was fully expecting the PowerShell to win out, followed by the UNION method because of its use of INTERSECT, which is basically made for this kind of problem, with the PIVOT method bringing up a distant last, since PIVOTs have a complexity stigma attached to them and what is complex is normally slow.
From a sheer speed point of view, the actual results are:
I don't think this is the end of my use of PowerShell or the UNION operators though; I'm not going to replace all the stuff that I can with PIVOTs. For one, I just think that PowerShell and the UNION operators are too cool!
I actually like this result for two reasons.
Now if you’ll excuse me, I want to figure out if there’s a way to return execution plans with PowerShell.
Straight away I want to apologise for the Nicolas Cage memes!
I just got back from the Reading User Group and I’m still in that post “User Group Glow”, also known as “Long Day Lethargy”, or “Twelve Hour Tiredness”.
They are great though! A chance to talk to other people in the SQL Server community; a slight reminder that, even if you work alone, people are still experiencing some of the same problems that you are (apparently everyone has to deal with multiple nested views, who knew!); a chance to hear presentations on different topics; and pizza if you're lucky (we were).
I realised during the session that the two presentations given during the User Group had a connection to a small issue with a table change request I had been given by a developer.
Here’s what did not happen to me so you can watch out for it.
Raul Gonzalez ( blog | twitter ) was first up with this presentation “Database Design Matters, Seriously”, showing us the chaos that can occur from not giving some serious thought into how you design your database.
His session is not yet up on his blog as I’m writing this but it will be soon so keep an eye out for that!
Now he had a lot of good points but, for brevity's sake, the main chaos theory points here are what happens if you don't take advantage of FOREIGN KEY constraints and don't specify a column's NULLABILITY (yes, that's a word!). SQL Server is a powerful program with many performance optimizations provided for you, but it's not omniscient; it can only use the information that you give it!
His points on
NULLABILITY (I mean, I think it’s a word) tied in nicely with the next presentation…
David Morrison ( blog | twitter ) followed up with his presentation on "Query Plan Deep Dives" (I had seen this at SQL Bits, but it's a great session so I had no problems watching it again) and, as an aside, through his presentation he showed us the compound effects that can happen from not specifying a column's NULLABILITY (it's got letters so it's word-like…)
Now his slides and scripts are up on his blog and they do a great job of walking you through them so check them out and you’ll see the compound effects they create!
Here’s a little teaser…
-- now I want all people whose email isn't in the email table
SELECT /*C.FirstName ,
       C.LastName ,*/
       C.EmailAddress
FROM dbo.Contact AS C
WHERE C.EmailAddress NOT IN
(
    SELECT E.EmailAddress
    FROM dbo.Emails AS E
)
GO
Which brings us back around to consequences or as I like to put it “How I Pissed Off A Dev By Refusing A Simple Request”.
To be quite honest, it was a simple request. A requirement came in to expand a column datatype up to varchar(100), so one of the devs wrote up a simple script and passed it on to the DBAs to check as part of the change control procedure.
ALTER TABLE tablename ALTER COLUMN columnname varchar(100)
And I said no.
“WHY???!!!“, you may shout at me (he certainly did), but I’m going to say to you what I said to him. “Give me a chance to explain before you take my head off, alright?”
There is nothing wrong with the above code syntactically (is that a word?), but I couldn't approve it since that column was originally NOT NULL and the above script would have stripped the column of that attribute! Business requirements dictated that it should not allow NULLs, and hey, who are we to argue with that 😐
Double-checking whether the column was NULL or NOT NULL allowed me to see a problem with that code, one that many people would consider simple enough to allow through at a quick glance, and which could have opened up problems further down the line if it had run…
Thanks to the User Group, I now know that it could have a knock on effect with our query plans as well!
ALTER TABLE tablename ALTER COLUMN columnname varchar(100) NOT NULL
There, that’s better!
DBAs get a lot of stick sometimes, the "Default Blame Acceptors" or the "Don't Bother Asking", but a lot of the time it's not that we want to say no; it's just that we have to take into consideration a thousand little things that could snowball into 1 giant problem.
With the rise of DevOps (check out the latest T-SQL Tuesday), DBAs have gone from a flat-out "NO!" to somewhere along the lines of:
“Not this second, let me check it out and see what we can do”
If pressed further, we may rely on the good, old “it depends” though. Hey, clichés are there for a reason; they work!
It just goes to show that, like the IT profession, DBAs are constantly evolving. Continuously learning, checking out new and helpful technologies, and going to User Groups are going to help us deal with it.
Just remember, in the end,
P.S. I should probably mention that the Nicolas Cage memes are because of this blog post by Nate Johnson ( blog ) that I enjoyed so much that I had to do something in response. I’m not normally this crazy, I swear!
Where I compare scripts to BBQ because of course I would 😐
I have this personal opinion that one sign of a good DBA is their ability to automate things and, before the DBA world found PowerShell, the way to do this was with T-SQL.
For example, a T-SQL script to get permissions assigned to a database principal could also include a column to
REVOKE those permissions. This could be “automated” with some dynamic SQL.
SELECT dprin.name AS DatabasePrincipalName,
       OBJECT_NAME(dperm.major_id) AS ObjectName,
       dperm.permission_name AS PermissionName,
       N'REVOKE ' + dperm.permission_name
           + N' ON OBJECT::' + OBJECT_NAME(dperm.major_id)
           + N' FROM ' + dprin.name COLLATE Latin1_General_CI_AS AS RevokeMe
FROM sys.database_permissions AS dperm
INNER JOIN sys.database_principals AS dprin
    ON dperm.grantee_principal_id = dprin.principal_id
WHERE dprin.name = 'public';
This is nice but what about if we are
paranoid forward-thinking enough to realize that this could cause us problems?
"How?" you ask. Well, what happens if there exists another database, say [NeedsAllPermissions], with the same table name, and the same login has permissions on it? Are you going to revoke permissions from that database? It needs ALL of them! It says so in the name!
So in an effort to not shoot ourselves in the foot, we add in the database name to our revoke script.
SELECT dprin.name AS DatabasePrincipalName,
       OBJECT_NAME(dperm.major_id) AS ObjectName,
       dperm.permission_name AS PermissionName,
       N'USE ' + DB_NAME() + 'GO'
           + N'REVOKE ' + dperm.permission_name
           + N' ON OBJECT::' + OBJECT_NAME(dperm.major_id)
           + N' FROM ' + dprin.name COLLATE Latin1_General_CI_AS AS RevokeMe
FROM sys.database_permissions AS dperm
INNER JOIN sys.database_principals AS dprin
    ON dperm.grantee_principal_id = dprin.principal_id
WHERE dprin.name = 'public';
Yes, we’re only using our database now!
Until the day comes when you actually want to revoke permissions for that user. So you run the above code, copy the RevokeMe column, paste it into the management window, and you get…
GO is a special little guy. It’s not exactly T-SQL. It’s a way of telling the SQL Server Management Studio (SSMS) to send everything before it, from the beginning of the script or the preceding
GO, to the SQL Server instance.
If you read the documents, the main point to take away is…
A Transact-SQL statement cannot occupy the same line as a GO command. However, the line can contain comments.
GO is a special little snowflake and needs to be on his own line then. Simple enough if you know that SQL Server converts
CHAR(10) into a New Line.
If you didn't know that, well, you know it now… P.S. CHAR(13) is a carriage return 😉
So let’s update our script with some
CHAR(10) and see what happens then.
SELECT dprin.name AS DatabasePrincipalName,
       OBJECT_NAME(dperm.major_id) AS ObjectName,
       dperm.permission_name AS PermissionName,
       N'USE ' + DB_NAME() + CHAR(10) + 'GO' + CHAR(10)
           + N'REVOKE ' + dperm.permission_name
           + N' ON OBJECT::' + OBJECT_NAME(dperm.major_id)
           + N' FROM ' + dprin.name COLLATE Latin1_General_CI_AS AS RevokeMe
FROM sys.database_permissions AS dperm
INNER JOIN sys.database_principals AS dprin
    ON dperm.grantee_principal_id = dprin.principal_id
WHERE dprin.name = 'public';
Now, when we paste the RevokeMe column to a new window, we get…
…absolutely no difference. 🙂
Here, around 500 words in, we get to the meat of our post. How do we keep new lines when copying in SQL Server?
Tools | Options | Query Results | Results to Grid | Retain CR/LF on copy or save
Two things need to be done here.
New window open, we run our script again, and this time, when we copy and paste the results, we get…
So if you are using T-SQL to create scripts, and you're having this problem with GO, or with new lines in general, make sure that the "Retain CR/LF on copy or save" checkbox is ticked.
Now, improve that script more, throw it in a stored procedure, and you never know, it may be semi-useful. 🙂
I’m going to point people to this that have “My Group By isn’t working” questions…
Did you know that the SQL language allows you to do amazing analysis of data such as aggregate functions?
SELECT t.session_id,
       t.request_id,
       SUM(t.user_objects_alloc_page_count) AS UserObjectAllocated,
       SUM(t.user_objects_dealloc_page_count) AS UserObjectDeallocated
FROM sys.dm_db_task_space_usage AS t
GROUP BY t.session_id,
         t.request_id;
But…if you forget to put in the
GROUP BY clause, as a ski instructor once said, you’re going to have a bad time!
So some eager yet lost scholar ventures into this land of aggregate functions, reads the error message and adds in a
GROUP BY clause.
SELECT t.session_id,
       t.request_id,
       SUM(t.user_objects_alloc_page_count) AS UserObjectAllocated,
       SUM(t.user_objects_dealloc_page_count) AS UserObjectDeallocated
FROM sys.dm_db_task_space_usage AS t
GROUP BY t.session_id;
Now don’t scoff, this happens. I mean the error message is still red, looks nearly identical to the original one encountered, and can cause a rage-inducing damnation of SQL Server error messages.
Trawling the bulletin boards, question sites, and forums – okay maybe a quick question online, it’s called poetic exaggeration people! – they eventually learn the folly of their ways and correct their mistake.
SELECT t.session_id,
       t.request_id,
       SUM(t.user_objects_alloc_page_count) AS UserObjectAllocated,
       SUM(t.user_objects_dealloc_page_count) AS UserObjectDeallocated
FROM sys.dm_db_task_space_usage AS t
GROUP BY t.session_id,
         t.request_id;
I’m not going to talk about “Save and Recover Lost Tabs” – saved my ass many times.
I’m not going to talk about “Code Formatting” – saved my sanity many times.
I’m going to talk about “Autocomplete”.
A well-known secret of SQL Prompt's autocomplete is the snippets feature. With this, you can increase your productivity by 75%: instead of typing out G R O U P [space] B Y, you type gb and hit Tab.
Do not get me wrong, a 75% increase in productivity? I’ll take that!
That is a well-known secret though, and it’s slightly hard to get excited about a well-known secret.
However, what if I told you that SQL Prompt has another, lesser-known secret that can increase your productivity and ensure that you do not forget to add the necessary columns to your GROUP BY clause?
Interested? Ah c’mon!
You sure you’re not interested?…. That’s better!
So, first of all, let us increase the number of non-aggregated columns in our SELECT to include database_id, is_remote_work, and exec_context_id. Including our session_id and request_id, these are all columns that we are going to need to add to our GROUP BY clause, because…well…business logic.
Only problem is ain’t nobody got time for that.
SQL Prompt knows this and adds the following little snippet after a
GROUP BY autocomplete.
Hitting tab on that includes everything in the
SELECT that is not part of an aggregate function, leaving us to concern ourselves with loftier things…
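The end result should look something along these lines (typed out here rather than screenshotted):

SELECT t.session_id,
       t.request_id,
       t.database_id,
       t.is_remote_work,
       t.exec_context_id,
       SUM(t.user_objects_alloc_page_count) AS UserObjectAllocated,
       SUM(t.user_objects_dealloc_page_count) AS UserObjectDeallocated
FROM sys.dm_db_task_space_usage AS t
GROUP BY t.session_id,
         t.request_id,
         t.database_id,
         t.is_remote_work,
         t.exec_context_id;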
Now, I don't work for RedGate, I'm not affiliated with them, and I don't get any money from them. In fact, I'd say that they'd happily pay me not to write about them, but when I found this autocomplete feature, I was too happy not to share it!
So save yourself the trouble of typing everything out and spare yourself the pain of error messages.
Use this lesser-known secret and have more time for pizza.
Words: 349 Reading Time: ~1.5 minutes.
Recently I was asked to create a temporary user with SELECT permissions on a database.
So far, not a problem. Taking advantage of the fixed database roles in SQL Server, I just add this new user to the [db_datareader] role, which grants SELECT permissions on all tables and views in a database.
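A minimal sketch, with a made-up login/user name:

-- Assumes a login called TempReader already exists on the instance.
CREATE USER TempReader FOR LOGIN TempReader;

-- db_datareader membership grants SELECT on every table and view in the database.
ALTER ROLE db_datareader ADD MEMBER TempReader;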
Why would I grant SELECT permissions this way and not manually choose the tables and views that this user could connect to?
Why grant SELECT on tables individually when I can grant on all tables in 1 fell swoop?
In the same vein, hypothetically speaking, if I was asked to grant SELECT permissions on 96 out of 100 tables, I would GRANT SELECT on all of them and then DENY SELECT on the other 4, as long as no column-level GRANTs had been given on those tables.
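Something like this sketch, with invented table names:

-- Grant broadly...
ALTER ROLE db_datareader ADD MEMBER TempReader;

-- ...then carve out the 4 exceptions (a DENY beats a GRANT here).
DENY SELECT ON OBJECT::dbo.Payroll TO TempReader;
DENY SELECT ON OBJECT::dbo.Salaries TO TempReader;
DENY SELECT ON OBJECT::dbo.Reviews TO TempReader;
DENY SELECT ON OBJECT::dbo.Bonuses TO TempReader;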
A recent notion that came to me was that one of the roles of a DBA is to gather knowledge, but to a level that promotes efficiency.
Sure, we know how to grant permissions, but we should also know the pitfalls, such as “deny beats grant unless the grant is on the column level” or “there are some combinations of permissions that allow more than intended“.
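That column-level caveat is worth seeing once. As far as I know, a later column-level GRANT pokes a hole in a table-level DENY (hypothetical names again):

-- Table-level DENY...
DENY SELECT ON OBJECT::dbo.Payroll TO TempReader;

-- ...but a column-level GRANT afterwards leaves this one column readable.
GRANT SELECT ON OBJECT::dbo.Payroll (EmployeeID) TO TempReader;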
Knowing these caveats allows us to say when options can be automated or where rules need to be added to check for different statuses.
This will allow us to move on to the next aspect that needs a DBA's eye and gentle guiding touch…or 2 cups of coffee and a full throttling!
When full uniqueness is a bit too much.
What are Unique Constraints? How can we enforce them? And, finally, is there any way that we can have modified uniqueness…can we have Semi-Unique Constraints?
There are many use cases that require that a column in a table only have unique values. The main example of this is a table’s
PRIMARY KEY which, needing to uniquely identify a record, needs to be unique by default.
In fact, SQL Server enforces this uniqueness by creating a unique index on whatever column(s) the Primary key is defined on.
-- Drop Table if Exists.
IF OBJECT_ID(N'dbo.SemiUniqueColumn', N'U') IS NOT NULL
    DROP TABLE dbo.SemiUniqueColumn;

-- Create Table.
CREATE TABLE dbo.SemiUniqueColumn
(
    ID int IDENTITY(1, 1),
    UniqueColumn int,
    SemiUniqueColumn int,
    CONSTRAINT [PK_SemiUniqueColumn] PRIMARY KEY CLUSTERED (ID)
);

-- Check Primary Key Exists.
EXEC sp_helpindex N'dbo.SemiUniqueColumn';
Primary keys are not the only things that can be unique; SQL Server recognizes this, and so there is the option of marking other columns as unique as well. Whether this is actioned by a UNIQUE CONSTRAINT or a UNIQUE INDEX is the user's prerogative.
I'll be creating this using a UNIQUE INDEX but, for the purpose of completeness, the syntax for UNIQUE CONSTRAINTs is:
ALTER TABLE table_name ADD CONSTRAINT constraint_name UNIQUE (column(s))
Now say we are forced to ensure that the column UniqueColumn is unique, so we create a UNIQUE INDEX on it.
-- Ensure UniqueColumn is Unique by Creating Unique Index on it.
CREATE UNIQUE NONCLUSTERED INDEX [UIX_SemiUniqueColumn_UniqueColumn]
    ON dbo.SemiUniqueColumn (UniqueColumn);
We insert values into this table and, as long as the uniqueness of these rows is satisfied, we're going to have a good time.
-- Insert Data.
WITH RecordSet (SeqNum) AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns
)
INSERT INTO dbo.SemiUniqueColumn (UniqueColumn)
SELECT SeqNum
FROM RecordSet
ORDER BY SeqNum DESC;

-- Select Data.
SELECT ID, UniqueColumn, SemiUniqueColumn
FROM dbo.SemiUniqueColumn;
Short answer: when you try and compare it to another one.
Now you may think that I’ve lost the plot and gone off on a tangent but bear with me here a second and I’ll explain.
A NULL is UNKNOWN and an unknown value should not equal another unknown value. What happens if your first unknown turns out to be the number 1 and the second unknown is the number 2? 1 <> 2 so comparing them then would be pretty foolish.
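A quick demonstration:

-- Neither comparison is TRUE, because UNKNOWN compared to UNKNOWN is UNKNOWN.
SELECT 1 AS Result WHERE NULL = NULL;
SELECT 1 AS Result WHERE NULL <> NULL;

-- IS NULL is how you actually test for a NULL.
SELECT 1 AS Result WHERE NULL IS NULL;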
If you ever look at some generated script, you'll see at the very top, hiding amongst the XACT_ABORTs and the NOCOUNTs, another option called ANSI_NULLS (although not anymore, as it's ON by default and should be left that way IMHO). This ensures that NULLs are treated as unknown and cannot be compared to another unknown, except for some specific cases.
Good question! Remember our Unique Index on UniqueColumn? What happens if we run the following?
-- Insert NULL into UniqueColumn.
INSERT INTO dbo.SemiUniqueColumn (UniqueColumn)
VALUES (NULL);
It inserts no problem, going all the way from 7442, 7441, … 2, 1, NULL. What about if we run it again?
Now ask yourself the question: what happens if we are told that our other column, SemiUniqueColumn, can have as many NULLs as it wants, but if it gets a value, that value must be unique?
-- Generate Semi Unique Values.
-- Every Sixteenth Value is NULL.
UPDATE original
SET SemiUniqueColumn = IIF((original.UniqueColumn % 16) = 0, NULL, original.UniqueColumn)
FROM dbo.SemiUniqueColumn AS [created]
JOIN dbo.SemiUniqueColumn AS [original]
    ON created.ID = original.ID;

-- Select Data.
SELECT ID, UniqueColumn, SemiUniqueColumn
FROM dbo.SemiUniqueColumn;
I used to think that this would be a complex requirement, possibly requiring a TRIGGER or two to check the inserted value against whatever is already there; but there is a way to have this functionality, and have it the way that SQL Server would normally enforce uniqueness on a column: by using a FILTERED INDEX.
In case you’re thinking…
“Oh, a unique index doesn’t check what’s already there, is that it?”
I’m afraid that’s not the case.
-- Standard Unique Index Treats NULLs = NULLs.
CREATE UNIQUE NONCLUSTERED INDEX [UIX_SemiUniqueColumn_SemiUniqueColumn]
    ON dbo.SemiUniqueColumn (SemiUniqueColumn);
Yes, I know that we just showed that SQL Server treats NULLs as equal in indexes, but there is a way around this, and that's to not use a simple unique index.
We simply ignore the NULLs altogether by using a
UNIQUE FILTERED INDEX.
-- Filtered Index.
CREATE UNIQUE NONCLUSTERED INDEX [UFIX_SemiUniqueColumn_SemiUniqueColumn]
    ON dbo.SemiUniqueColumn (SemiUniqueColumn)
    WHERE SemiUniqueColumn IS NOT NULL;
Don’t trust me? Trust but verify!
First attempt to insert a duplicate value in SemiUniqueColumn:
-- Test it (Duplicate).
INSERT INTO dbo.SemiUniqueColumn (UniqueColumn, SemiUniqueColumn)
VALUES (0, 7439);
Now we attempt another insert into SemiUniqueColumn, but this time the duplicate value is a NULL.
-- Test it (Not Duplicate).
INSERT INTO dbo.SemiUniqueColumn (UniqueColumn, SemiUniqueColumn)
VALUES (0, NULL);
Finally, NULL was chosen just for example purposes; filtering can be done on any value but, at the moment, there are a couple of gotchas with them…
I think you’ll agree that this would be so much easier to manage and troubleshoot than multiple triggers.
In fact, indexes are extremely useful aspects of databases and, once you start to realise how they work, you can start to use them for various different things than simple seeks and scans.
But the main aspect here is that complexity is not king. There are a multitude of ways to achieve something, so take a few moments to think about the problem you are trying to face and you’ll be thankful in the long run.
I am pretty sure that if I was a fish, I would not survive long enough to grow old as I would fall for the first piece of bait hanging from a lovely, shiny thing that I could see.
The only defence that I have is that, as I’m still a Junior DBA, I can make these mistakes as long as
a). they are not extremely serious (no dropping production databases for this guy!), and
b). I’m expected to learn from them and not repeat them!
And like most things, it started innocently enough. A simple support ticket coming in with the following error message.
Msg 229, Level 14, State 5, Line 65
The SELECT permission was denied on the object ‘Removable’, database ‘LocalTesting’, schema ‘Superflous’.
I saw this error message and immediately thought to myself
AH! No problems, they just need SELECT permissions on that object. 2 second job.
And seeing as the ticket was nice enough to provide the login and user that was receiving the error message (we’ll say it was a user called “NewUser”), I could join that with the error message and grant permissions.
GRANT SELECT ON OBJECT::Superflous.Removable TO NewUser;
Following this was a quick test, impersonating the user to see if it worked:
-- Test 01.
EXECUTE AS USER = 'NewUser';

SELECT USER_NAME(), SUSER_SNAME();

SELECT *
FROM dbo.GenericView;

REVERT; -- switch back to ourselves afterwards
As far as I was aware, I was happy it worked; the user, once notified, was happy it worked and I went on my merry way to grab some celebratory coffee.
Until on the way back I bumped into my Senior DBA and told him proudly what I had done…
The following is a simplified reproduction of that conversation…
> Is that a new View?
>> Is that a new User?
> No…although it’s called New.
>> Could they SELECT from that View before?
> Yeah, as far as I know.
>> Alright, so did anything change before the call?
> eh…I didn’t check
>> Okay, from now on: Check.
It was at that stage that we started getting other tickets in from other users with the same error message. So rather than fixing the underlying problem, I had fixed a symptom for a single user.
The symptom was the User not having permission to select, but the underlying problem was that the View had changed.
At this stage I was still confused: it's a view, so what does it matter if the query creating it has changed? How could this have broken permissions? Again, jumping the gun, I didn't check…
Our problem view references two different schemas and, when we check the ownership of those two schemas, we get the following:
-- Who owns what?
SELECT dp.name AS Owner,
       s.*
FROM sys.schemas AS s
JOIN sys.database_principals AS dp
    ON s.principal_id = dp.principal_id
WHERE s.name IN ('dbo', 'Superflous');
Technically, the answer is Ownership Chains.
Originally, the Superflous.Removable table was in a different database, on its dbo schema, which the owner of the view (dbo) had permission to select from. Since the owner of the view (OV) had permissions on that schema, and the OV granted select permissions on the view to the user (NU), the NU inherited the OV's permissions. SQL Server traversed the chain down to dbo.Foo, saw that it was owned by the OV, and so didn't need to check permissions.
Now, we had recently made a change to bring the information from the other database over to our database via Replication. This meant a re-write of our view using the new table and schema, with its new owner: a new schema that neither our NU nor the OV had permissions for. What this meant was that the SQL Server engine followed the same procedure, with the only difference being that, instead of going across to the other database, it went to our new schema, Superflous.Removable. It saw the OV did not have access permissions, so it denied access for our NU.
So basically, when NewUser went to select from our view, they hit the new schema; SQL Server realised it needed to check their permissions and, when none were found, access was denied.
All I had done by jumping the gun and fixing the symptom was make it so that, when SQL Server traversed down the ownership chain for the view and came to the new schema, it checked permissions, found the SELECT permission for only this user, and continued on.
This was the reason that the view worked for the user but no one else!
The MyStuff database principal should not be the owner of our Removable table; in fact, the Superflous schema should not even exist. So it was a simple matter of transferring ownership to dbo:
ALTER AUTHORIZATION ON SCHEMA::Superflous TO dbo;
Now all the users who have read access on the dbo schema are able to use this view with no further hassle.
Problem solved! Right?
All the above is what I did.
Trying to fix the permission error, I granted SELECT permission.
Trying to fix the ownership chain, I transferred ownership.
Trying to fix the problem, I continually jumped the gun.
Which is why I am still a Junior DBA.
What my Senior DBA did was fix the replication script so the new schema wouldn't get created in the first place and the table would get created in dbo.
Which is why he’s my Senior DBA.
Jumping the gun isn’t going to give you a head start. It is just going to delay you. Knowing the problems, as well as knowing the solutions, is the answer.
I’m learning the problems…I’ll have the solutions soon, and I aim to share them too.