PIVOT in PowerShell

Words: 1151

Time to read: ~ 6 minutes

Apologies

I’m going to start this post off with an apology.

As Kevin Feasel ( Blog | Twitter ) mentioned about my last post Attempting SUM() OVER () in PowerShell:

It’d be a lot easier, though, with a properly normalized data model which includes date, attempt number, and push-ups in that attempt. Pivot those results at the end if you want this sort of report, but SQL is designed to work best with tables in first normal form or higher.

Kevin Feasel

I can’t very well give out to people for not doing the right thing first time, even if it’s more difficult, if I don’t do the right thing myself!

As Kevin mentioned, once the data was in a proper format, a format designed for SQL, the calculations were trivial.

However, outputting the results in the same way in PowerShell required a way to pivot results in PowerShell. Thanks to some heavy lifting from Joel Sallow ( Blog | Twitter ), I now know how to pivot in PowerShell!

Here’s hoping that this post will help explain it for you also.

Exploring our Data

SQL

First off, let’s check the current state of our table in SQL.

SELECT	POP.pushup_date,
		POP.attempt_number,
		POP.pushup_count,
		SUM(POP.pushup_count) OVER (PARTITION BY POP.pushup_date ORDER BY POP.pushup_date) AS total_per_date,
		SUM(POP.pushup_count) OVER () AS grand_total
FROM	dbo.PushupsOctoberProper AS POP;
SQL style!

Pivoting

I want to lay all 8 possible attempts out horizontally, like in the last post. I find this fairly easy when I have the documentation for PIVOTs open in another tab.

/* Can we pivot these? */
SELECT	PVT_01.pushup_date,
		[1] AS attempt_1,
		[2] AS attempt_2,
		[3] AS attempt_3,
		[4] AS attempt_4,
		[5] AS attempt_5,
		[6] AS attempt_6,
		[7] AS attempt_7,
		[8] AS attempt_8,
		PVT_01.total,
		PVT_01.total_so_far
FROM
(
	SELECT	POP.pushup_date,
			POP.attempt_number,
			POP.pushup_count,
			SUM(POP.pushup_count) OVER (PARTITION BY POP.pushup_date ORDER BY POP.pushup_date) AS total,
			SUM(POP.pushup_count) OVER () AS total_so_far
	FROM	dbo.PushupsOctoberProper AS POP
) AS SRC
PIVOT
(
	MAX(pushup_count) FOR attempt_number IN ([1], [2], [3], [4], [5], [6], [7], [8])
) AS PVT_01
ORDER BY	PVT_01.pushup_date;
Simples!

Simple, right? Once we have the data in the expected format then the above steps are the only steps necessary to calculate and show the data in the way that we want.

However, it becomes a bit more complicated in PowerShell.

PowerShell

Let’s grab the data from our SQL instance and take a look at it.

<# Populate our variable from the database #>
$invQueryParams = @{
    SqlInstance = $sqlInstance
    Database = 'LocalTesting'
    Query = 'SELECT * FROM dbo.PushupsOctoberProper;'
}
$data = Invoke-DbaQuery @invQueryParams

<# Show our data #>
$data | Format-Table -Autosize
So far, so good…

Grouping our Data

We have our data fetched, now we need to group it by the different dates. If only PowerShell had a way to group objects…what? Group-Object? oh!

<# Grouping our data #>
$dataGroups = $data | Group-Object -Property pushup_date
$dataGroups
Data.DataRow? *sigh* one of these days I’ll remember to use -As PSObject with my Invoke-DbaQuery

Now that we have our data grouped by the different dates, we can loop through each date and pivot the data out horizontally.
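
If you want to see what Group-Object hands back without a database to hand, here is a minimal sketch. The rows in $sampleRows are made up for illustration and only stand in for dbo.PushupsOctoberProper; the variable names are my own assumptions.

```powershell
# Hypothetical sample rows standing in for the real table
$sampleRows = @(
    [PSCustomObject]@{ pushup_date = [datetime]'2020-10-07'; attempt_number = 1; pushup_count = 20 }
    [PSCustomObject]@{ pushup_date = [datetime]'2020-10-07'; attempt_number = 2; pushup_count = 20 }
    [PSCustomObject]@{ pushup_date = [datetime]'2020-10-08'; attempt_number = 1; pushup_count = 20 }
)

# One group per distinct pushup_date; the original rows live in each group's .Group property
$sampleGroups = $sampleRows | Group-Object -Property pushup_date

# Summarise each group: the date, how many attempts, and the push-ups for that date
$sampleSummary = foreach ($group in $sampleGroups) {
    $sum = ($group.Group | Measure-Object -Property pushup_count -Sum).Sum
    '{0}: {1} attempt(s), {2} push-ups' -f $group.Group[0].pushup_date.ToString('yyyy-MM-dd'), $group.Count, $sum
}
$sampleSummary
```

Each group carries a Name, a Count, and the Group collection, and it is that Group collection we loop over when pivoting.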

Manual Pivot

The first way that came to mind was to manually list out all columns. I know that the maximum attempt_number that I have is 8 so let’s manually create 8 attempt columns.

<# Let's pivot this manually because it's the first way that came to mind #>
$ManualpivotedData = foreach ($dg in $dataGroups) {

    [PSCustomObject]@{
        pushup_date  = ($dg.Group | Select-Object -ExpandProperty pushup_date -Unique).ToShortDateString()
        attempt01    = ($dg.Group | Where-Object attempt_number -eq 1).pushup_count
        attempt02    = ($dg.Group | Where-Object attempt_number -eq 2).pushup_count
        attempt03    = ($dg.Group | Where-Object attempt_number -eq 3).pushup_count
        attempt04    = ($dg.Group | Where-Object attempt_number -eq 4).pushup_count
        attempt05    = ($dg.Group | Where-Object attempt_number -eq 5).pushup_count
        attempt06    = ($dg.Group | Where-Object attempt_number -eq 6).pushup_count
        attempt07    = ($dg.Group | Where-Object attempt_number -eq 7).pushup_count
        attempt08    = ($dg.Group | Where-Object attempt_number -eq 8).pushup_count
        total        = ($dg.Group | Measure-Object -Property pushup_count -Sum).Sum
        total_so_far = ($data | Measure-Object -Property pushup_count -Sum).Sum
    }
}

<# Let's make this pretty #>
$ManualpivotedData | Format-Table -Property pushup_date, @{ Expression = 'attempt*'; Width = 10 }, total, total_so_far
Seems to work

In case you’re wondering what @{ Expression = 'attempt*' ; Width = 10 } does, I use it to narrow the width of the columns named like attempt since they’re integers. Since they don’t need as much space, I can narrow them down and then Format-Table won’t cut-off my later columns!

Dynamic Pivot

I’m not against the manual way. I just find it too bulky and repetitive. It works! Please don’t get me wrong on that account but as I recently heard someone say: “It works, now clean it up”.

Our main problem is the attempt columns and our manually typing them out. They seem like a perfect candidate for a ForEach loop. But, when we try to slot that in….

foreach ($dg in $dataGroups) {

    $props = @(
        @{ Name = 'pushup_date' ; Expression = { ($dg.Group | Select-Object -ExpandProperty pushup_date -Unique).ToShortDateString() }}
        foreach ($num in 1..8) {
            @{ 
                Name = "attempt_$num" 
                Expression = { $dg.Group |
                    Where-Object attempt_number -eq $num |
                    Select-Object -ExpandProperty pushup_count } 
            }
        }
        @{ Name = 'total' ; Expression = { ($dg.Group | Measure-Object -Property pushup_count -Sum).Sum } }
        @{ Name = 'total_so_far' ; Expression = { ($data | Measure-Object -Property pushup_count -Sum).Sum }}
    )

    $dg | Select-Object $props
}
Well that shouldn’t be empty!

Yeah, that’s a “no” from PowerShell. Why is this?

Why this is

Let’s investigate that $props variable. We’re creating an array of hashtables, each with a Name key holding the property name and an Expression key holding the script block we want. So let’s get the values.

$props | Format-List
Expression = $num

Do you see the way that each of the Expression keys has a value with the $num variable?

If you check $num now, you’ll see that it’s set to 8. It looks like we have found our problem: the script blocks in $props aren’t keeping the value that $num had when we defined them. They only look $num up when they run, and by then the loop has finished.
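
The same behaviour shows up without any of the push-up data. This is a minimal sketch with made-up variable names ($lateBound, $lateResults): three script blocks are built in a loop, and each one only looks up $num when it is finally invoked.

```powershell
# Build three script blocks in a loop; each one references $num but does not store its value
$lateBound = foreach ($num in 1..3) {
    { "attempt_$num" }
}

# The loop has finished, so $num still holds its last value (3) when the blocks run
$lateResults = $lateBound | ForEach-Object { & $_ }
$lateResults   # attempt_3, attempt_3, attempt_3
```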

Since only one date has a value for attempt 8, we should see some values there.

$pivotedData = foreach ($dg in $dataGroups) {

    $props = @(
        @{ Name = 'pushup_date' ; Expression = { ($dg.Group | Select-Object -ExpandProperty pushup_date -Unique).ToShortDateString() }}
        foreach ($num in 1..8) {
            @{ 
                Name = "attempt_$num" 
                Expression = { $dg.Group |
                    Where-Object attempt_number -eq $num |
                    Select-Object -ExpandProperty pushup_count } 
            }
        }
        @{ Name = 'total' ; Expression = { ($dg.Group | Measure-Object -Property pushup_count -Sum).Sum } }
        @{ Name = 'total_so_far' ; Expression = { ($data | Measure-Object -Property pushup_count -Sum).Sum }}
    )

    $dg | Select-Object $props
}

<# Let's check the 14th #>
$pivotedData | Where-Object pushup_date -eq '14/10/2020'
All filled, but all with the value for the 8th attempt!

Yeah…that’s not correct. I did 30 on the first attempt. Believe me, I remember the pain. Looks like it’s putting the value for attempt 8 into each of the attempts.

Not cool…

Closures

If only there was a way to keep the value of $num when we defined the $props variable. Well, thanks to Joel and his post ScriptBlocks and GetNewClosure(), I now know that there is!

$pivotedData = foreach ($dg in $dataGroups) {

    $props = @(
        @{ Name = 'pushup_date' ; Expression = { ($dg.Group | Select-Object -ExpandProperty pushup_date -Unique).ToShortDateString() }}
        foreach ($num in 1..8) {
            @{ 
                Name = "attempt_$num" 
                Expression = { $dg.Group |
                    Where-Object attempt_number -eq $num |
                    Select-Object -ExpandProperty pushup_count }.GetNewClosure()
            }
        }
        @{ Name = 'total' ; Expression = { ($dg.Group | Measure-Object -Property pushup_count -Sum).Sum } }
        @{ Name = 'total_so_far' ; Expression = { ($data | Measure-Object -Property pushup_count -Sum).Sum }}
    )

    $dg | Select-Object $props
}

<# Let's make this pretty #>
$pivotedData | Format-Table -Property pushup_date, @{ Expression = 'attempt*'; Width = 10 }, total, total_so_far
It’s alive!!!!!
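
Stripped of the pivot, this is all GetNewClosure() is doing for us. Here is a minimal sketch with made-up variable names ($closed, $closedResults): calling the method inside the loop gives each script block its own snapshot of $num as it was in that iteration.

```powershell
# Build three script blocks in a loop, snapshotting the current variables each time
$closed = foreach ($num in 1..3) {
    $block = { "attempt_$num" }
    $block.GetNewClosure()
}

# Each closure remembers the $num it was created with, even though the loop has finished
$closedResults = $closed | ForEach-Object { & $_ }
$closedResults   # attempt_1, attempt_2, attempt_3
```

The cost is that the snapshot is taken once, so later changes to $num (or any other captured variable) are not seen by the closure.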

Summary

There’s nothing wrong with making mistakes, as long as you learn from them.

Thanks to Kevin for reminding me how things should be stored, and thanks to Joel for this (vast) knowledge sharing, I’ve been able to learn how to dynamically pivot in PowerShell from my mistakes.

Review your mistakes, you never know what you may learn.

Attempting SUM() OVER () in PowerShell

Words: 891

Time to read: ~ 5 minutes

Pro-Cras-Tin-Ation!

Like most things in life, this piece of work came about while attempting to complete something else. It’s not a bad thing, I expect it at this stage.

Easy Like Sunday Morning

I find it easy to get the total of a row in SQL. Hell, when it is not particularly important, I’ll even go the easy route and use a computed column in the table.

CREATE TABLE dbo.PushupsOctober
(
	pushup_date date NOT NULL
		CONSTRAINT PK_PushupsOctober PRIMARY KEY CLUSTERED,
	attempt_01 tinyint NULL,
	attempt_02 tinyint NULL,
	attempt_03 tinyint NULL,
	attempt_04 tinyint NULL,
	attempt_05 tinyint NULL,
	attempt_06 tinyint NULL,
	attempt_07 tinyint NULL,
	attempt_08 tinyint NULL,
	total_pushups_per_day AS (ISNULL(attempt_01, 0) + ISNULL(attempt_02, 0) + ISNULL(attempt_03, 0) + ISNULL(attempt_04, 0) + ISNULL(attempt_05, 0) + ISNULL(attempt_06, 0) + ISNULL(attempt_07, 0) + ISNULL(attempt_08, 0))
);
GO

Then, all I have to do is insert the data and SQL will automatically take care of calculating the total per row for me.

/* Unseparated yyyymmdd literals are read the same way whatever the session's language or DATEFORMAT */
INSERT INTO dbo.PushupsOctober (pushup_date, attempt_01, attempt_02, attempt_03, attempt_04, attempt_05, attempt_06, attempt_07, attempt_08)
VALUES	('20201007', 20, 20, NULL, NULL, NULL, NULL, NULL, NULL),
		('20201008', 20, NULL, NULL, NULL, NULL, NULL, NULL, NULL),
		('20201009', 20, 20, 25, NULL, NULL, NULL, NULL, NULL),
		('20201010', 25, NULL, NULL, NULL, NULL, NULL, NULL, NULL),
		('20201011', 0, NULL, NULL, NULL, NULL, NULL, NULL, NULL),
		('20201012', 25, 25, 25, 25, NULL, NULL, NULL, NULL),
		('20201013', 20, 15, 15, 25, NULL, NULL, NULL, NULL),
		('20201014', 30, 30, 20, 20, 25, 20, 20, 20),
		('20201015', 25, 25, 25, 25, NULL, NULL, NULL, NULL),
		('20201016', 25, 25, NULL, NULL, NULL, NULL, NULL, NULL),
		('20201017', 25, NULL, NULL, NULL, NULL, NULL, NULL, NULL);
GO

SELECT	*
FROM	dbo.PushupsOctober;
GO
Why count when not need to?

Once you have the total per row, you throw in a SUM(that total) OVER () and you have a grand total. Thank you to Kevin Wilkie ( blog | twitter ) for re-igniting my curiosity about Window Functions.

SELECT	*,
		SUM(p.total_pushups_per_day) OVER () AS total_so_far 
FROM	dbo.PushupsOctober AS p;
GO
Total total

Easy Like Monday Morning

PowerShell is a different beast. Please don’t get me wrong; I still love the language. I don’t find it easier to get a row total and then a grand total though.

It’s possible! I’m just hoping that there is a better way. Having said all that, here is my attempt at a row total and grand total using PowerShell.

If you have a better way (you choose the conditions that satisfy “better”) please let me know.

Grabbing the Data

First, let’s grab the data from the table in our database.

$data_2 = Invoke-DbaQuery -SqlInstance localhost -Database LocalTesting -Query @'
SELECT * FROM dbo.PushupsOctober;
'@

Removing Unwanted Properties

Here’s where I remembered that I had a computed column, realised that it would be cheating to use it and decided it needed to go. Thankfully, this also enabled me to get rid of those pesky columns that get returned from Invoke-DbaQuery when you forget the parameter -As PSObject!

$data_2 = $data_2 | Select * -ExcludeProperty RowError, RowState, Table, ItemArray, HasErrors, total_pushups_per_day

Grabbing Property Names

There’s a couple of things that we need here. We need a way to add up all the “attempt” columns so we need a way to select them all.

$props = $data_2[0].PSObject.Properties | Where-Object Name -like 'attempt*' | Select-Object -ExpandProperty Name

There, that should do nicely!

Grabbing the Values for those Properties

Now, we can iterate over them and get all the values.

foreach ($p in $props) { $data_2[0] | Select -ExpandProperty $p }

Potential Problem

See all those empty lines? Yep, that’s a potential problem for Measure-Object.

$hasToBeAnEasierWay = foreach ($p in $props) { $data_2[0] | Select -ExpandProperty $p }
$hasToBeAnEasierWay | Measure-Object -Sum

Removing NULL or WhiteSpace

Thankfully, there’s a way to get rid of those empty lines.

$hasToBeAnEasierWay | Where-Object { -Not [String]::IsNullOrWhiteSpace($_) } | Measure-Object -Sum
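
To see the filter on its own, here is a minimal sketch. It assumes the empty values coming back from the database are [DBNull]::Value, and the variable names ($sampleValues, $filteredValues, $filteredSum) are made up for illustration.

```powershell
# Hypothetical attempt values for one row: two real attempts and two database NULLs
$sampleValues = 20, [DBNull]::Value, 20, [DBNull]::Value

# DBNull turns into an empty string, so IsNullOrWhiteSpace filters it out
$filteredValues = $sampleValues | Where-Object { -not [string]::IsNullOrWhiteSpace($_) }

# Only the numeric values are left for Measure-Object to add up
$filteredSum = ($filteredValues | Measure-Object -Sum).Sum
$filteredSum   # 40
```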

Putting it ALL Together

Now that we have the skeleton of a script ready, let’s put it all together.

Row total

$data_2 | ForEach-Object -Begin {
    $props = $data_2[0].PSObject.Properties | Where-Object Name -like 'attempt*' | Select-Object -ExpandProperty Name
} -Process {
    $total = $null
    $hasToBeAnEasierWay = $null

    $hasToBeAnEasierWay = foreach ($prop in $props) {
        $_ | Select-Object -ExpandProperty $prop
    }
    $total = ($hasToBeAnEasierWay | Where-Object { -Not [String]::IsNullOrWhiteSpace($_) } | Measure-Object -Sum).Sum

    $_ | Select-Object -Property *, @{
        Name = 'total_per_day'
        Expression = { $total }
    }, @{
        Name = 'days_left'
        Expression = { ((Get-Date -Date '2020-10-31') - (Get-Date -Date $_.pushup_date)).Days }
    }
} -OutVariable data_3 | Format-Table -Autosize

We now have the row total in our total_per_day property. And, with our use of -OutVariable data_3, we have the results saved into a variable called $data_3.
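
If -OutVariable is new to you, here is a minimal sketch of it in isolation. The variable name $scaled is made up for illustration.

```powershell
# -OutVariable takes the variable name without the leading $
1..3 | ForEach-Object { $_ * 10 } -OutVariable scaled | Out-Null

# Out-Null discarded the pipeline output, but the copy saved in $scaled is still there
$scaled                   # 10, 20, 30
$scaled.GetType().Name    # ArrayList
```

That is why we can pipe straight to Format-Table for display and still have the objects left over to work with afterwards.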

Grand Total

Once we have a single column that we can sum up to give us our grand total, then PowerShell makes this operation trivial.

I do have to use Format-List here because Format-Table can’t fit all the properties in so our new property total_so_far won’t show up.

$data_3 | Select-Object -Property *, @{
    Name = 'total_so_far'
    Expression = { ($data_3 | Measure-Object -Property total_per_day -Sum).Sum }
} | Format-List

There We Go!

While I’d argue that it’s not as easy as SQL, it’s completely possible to get row totals and grand totals in PowerShell.

Honestly though, I hope there’s an easier way. Otherwise, I’m going to do it in SQL and then grab it out into PowerShell afterwards.

T-SQL Tuesday #131: Data Analogies, or: Explain Databases Like I’m Five!

Words: 906

Time to read: ~ 5 minutes

Welcome to T-SQL Tuesday! This month’s host is Rob Volk (blog|twitter) and the topic is Data Analogies.

Analogies

Come in! Come in, my boy!
Now, your father sent you to me to explain what I used to do.
Well, I was a DBA, my boy!
What do you mean “what does that do”? I took care of the databases! I made sure that they were looked after, that they had vim and vigour, and that they didn’t go down.

What? No, I don’t mean “down-down”, it’s not a direction! I mean that they didn’t go offline. No, not “off-off”, well not quite… It was my duty to ensure that they were available. Got it? No?

Database Administration

Well, let’s take that whiskey cabinet over there Jas…Jaeysin. Let us say that the whiskey cabinet is our database, and it is our job to see that it is never empty. We do that in lots of ways: by checking on it and making sure that nothing is alarming, like empty bottles. We check that there is space if someone wants to add more whiskey and that anyone who wants something from it, can get it. Like me, hehe.

What? You don’t understand how that’s like being a DBA? Well think about it my boy, I would check on the databases, make sure nothing was alarming, and that the data was always available for whoever wanted it.

Security

What’s that? You want some? Ho ho, my boy, you are one for jests. I tell you what, try opening the cupboard door on the left. Yes, the one where you can see all the sticks and cherries through the glass. Not a problem for you, was it? Put back the cherry, please. And wipe your hands…NOT ON THE GLASS!
Never mind, my boy, never mind, I shouldn’t have put the soda water in a dangerous place like that…inside a cupboard…away from the ledge. Try and open the right cupboard door now. Yes, the one with the fancy bottles and the shiny lights. Yes, I’m aware it’s locked. Now see, you cannot open that door but I can because I have permission to open that door.
That was another part of my job, making sure that people had the right permission to get what they wanted, and that people without permission could not.

What’s that? Who decides on the permissions? Well, back then, it was a business decision. Where those higher up in standing would pick, and I would have to follow their instructions. Now, I’m in charge.

What do you mean that’s not what your father says? Right, well, I’ll be having a few words with him, and we’ll see more about this “under his partner’s thumb” business. No, I can’t open it either. Because I don’t have the key. Yeah well, I may be db_owner, but not sysadmin… Never mind.

Performance

What else did I do? Well, I made sure those who wanted data from the database knew where they could get the data and could get the data promptly.
Well, do you see the whiskey cabinet? Yes, the one on the right. Yes, the one you’re not allowed open. Yes, you’re allowed to look…oi! Quit your cheek, or I’ll add that to the list of things I’ll have to talk to your father about.
Now, if someone were to go to that cabinet wanting a nice Scotch, they only have to reach into the middle shelf, and they would have a whole choice of Scotch to choose from. I know that because I know that my middle shelf is only for Scotch.
Same with the databases; I helped people to create tables to hold their similar data the same way I chose that shelf to have my Scotch.

And see the way that the bottles get fancier as you scan your eyes from left to right. I ordered that shelf, so the most expensive bottles were all the way over there on the right. And, the least expensive bottles are all over to the left.
Same with the databases; I would create indexes so that people could go to what they wanted, whether that be the “expensive bottles” or not.
No more looking through the entire table for what they want, they knew exactly where to go and knew when they could stop looking as well.

What? Indexes, my boy, an index. No, it has nothing to do with your finger! Wait! You may be on to something there. Tell me, have you ever run your index finger down a phone book? A phone book. You’ve never heard of it? Hold on…this thing? Ever seen this? Big yellow book, lots of information? Yes, I know we have Google, never mind.

Redundancy

Was that it? No lad, that was not it. The world of data is vast and open. So much more than what an analogy using a whiskey cabinet can provide. But I will leave you with one more analogy; Redundancy. It’s always essential to have a redundant copy of your data, a recovery plan if a disaster were ever to strike.
Open that desk drawer for me, my boy. Yes, it’s another bottle of whiskey. A redundant backup, if you would. Now I believe I see your father pull into the drive, so it is time for you to leave. I shall go see if my backup can still be restored.

Goodbye, you little disaster.

ODBC date/time Extension Option in SQL Server

Words: 329

Time to read: ~ 2 minutes

At this stage of my career with Transact-SQL, I would like to think that I have the language mostly mapped. If there is something that seems new or unusual, then it often comes down to the developers marking the code to make replacing it more effortless.

In case you don’t grasp what I’m on about, the below piece of code shows it pretty well.

SELECT N'##datetime##' AS search_and_replace;

<clickbait>I had never seen this before</clickbait>

Can you imagine my shock when I came across a piece of code that was not for finding and replacing and, even though I did not think it would compile, it did!

If you can imagine my shock, then you’re going to need to increase it when I tell you that there is a whole family of these functions!
Here is the code that threw me for a loop the first time I saw it.

SELECT {d '1970-01-01'};

Researching

There are more than just {d '<datetime>'} as well!

SELECT {d '2020-01-01'} AS sample_date,
       {t '12:31:56'} AS sample_time,
       {ts '2020-01-01 12:31:56'} AS sample_datetime;

There’s a whole page in Microsoft documentation about them.

It’s not just date and time functions but string functions as well.

SELECT LEN(N'This string is 33 characters long') AS string_length,
       {fn BIT_LENGTH(N'This string is 33 characters long')} AS string_bit_length,
       {fn OCTET_LENGTH(N'This string is 33 characters long')} AS string_nvarchar_octet_length,
       {fn OCTET_LENGTH('This string is 33 characters long')} AS string_varchar_octet_length,
       DATALENGTH(N'This string is 33 characters long') AS string_nvarchar_datalength,
       DATALENGTH('This string is 33 characters long') AS string_varchar_datalength;

From whence camest thou?

My Google-Bing-Fu located these functions in the following link.
As well as a few answers on StackOverflow regarding querying SQL Server and A.N.Other datastore.

So now that I think I’ve grasped where they came from, will I be using these going forward?

No, thanks.

Oh, and in case you were wondering how I got the “ODBC date/time extension option” information, good error messages are irreplaceable.

SELECT {date '2020-01-01'} AS sample_date;

‘date’ is not a recognized ODBC date/time extension option.