Using Multiple Delimiters to Split Strings with PowerShell

I appear to be going for the Ronseal titles lately…

Words: 725
Time to read: ~4 minutes

Intro

It is extremely difficult to find an arrogant personality in the SQL Server community. Nearly everyone I’ve talked to, interacted with, or read about suffers from a form of “imposter syndrome“.
Which isn’t necessarily a bad thing; people suffering from imposter syndrome tend to put a lot more effort into their work and constantly try to improve their skills.
However, there is a worry that people cannot be happy within this state. There is a worry that they cannot see the improvements that they have achieved. That’s why I think it’s very important to take a moment each month and just reflect on anything that you have improved on.
My latest reflection is a small thing but it’s still an improvement so it counts. Especially since it’s a nice easy trace of an improvement from fear and raw effort to a modicum of familiarity.

Latest reflection

If you’ve seen this a line like this, then you have seen the log output of a SQL Server Agent job.
Processing database: [DBName] @ 2017-11-13 05:07:18 [SQLSTATE 01000]
Processing database: [Database] @ 2017-11-13 05:27:39 [SQLSTATE 01000]
This is great because we have a log of what database was actioned and when it was actioned.
But what about if there are multiple databases being actioned in the same job? What about if we are trying to parse the log files and want to get the different database names from these lines?
Here’s the code for playing along at home:
Including the WordPress format because WordPress doesn’t want to open Gists anymore, for some reason :/

$Database = 'Processing database: [DBName] @ 2017-11-13 05:07:18 [SQLSTATE 01000]', 'Processing database: [Database] @ 2017-11-13 05:27:39 [SQLSTATE 01000]'

foreach ($Db in $Database)
{
[PSCustomObject]@{
OriginalLine = $Db
IndexOfOpeningSquare = $Db.IndexOf('[')
IndexOfClosingSquare = $Db.IndexOf(']')
}
}

#Substrings
#1
foreach ($Db in $Database)
{
$Db.Substring($Db.IndexOf('[') + 1)
}
#Finished
foreach ($Db in $Database)
{
$Db.Substring($Db.IndexOf('[') + 1, ($Db.IndexOf(']') - $Db.IndexOf('[')) - 1)
}

#split nested arrays
#1
foreach ($Db in $Database)
{
($Db -split '\[')
}
#2
foreach ($Db in $Database)
{
($Db -split '\[')[1]
}
#3
foreach ($Db in $Database)
{
($Db -split '\[')[1] -split '\]'
}
#Finished
foreach ($Db in $Database)
{
(($Db -split '\[')[1] -split '\]')[0]
}

#first multiple delimiter split
#1
foreach ($Db in $Database)
{
($Db -split {$_ -eq '[' -or $_ -eq ']'})
}
#Finished
foreach ($Db in $Database)
{
($Db -split {$_ -eq '[' -or $_ -eq ']'})[1]
}

#Final method
#1
foreach ($Db in $Database)
{
($Db -split '[\[\]]')
}
#Finished
foreach ($Db in $Database)
{
($Db -split '[\[\]]')[1]
}

Let’s get some basic information about our strings, shall we?

StringInfo
Here’s our strings…

Substring

Coming from a SQL Server background, you can bet that I tried to use SUBSTRING for this!

First, let’s see if we can get the start of the database name?

Substrings01
IndexOf is a PowerShell form of CHARINDEX

And now, we can get the rest!

SubstringF
That’s a very long expresssion…

-Split

After a quick glance at Get-Help about_Split I moved onto using the -split operator

First, what happens when we split our string on ‘[‘?

arrays1
Have to use \[ because it takes regex!
Right, well we only want the 2nd result of each so let’s grab only that!

arrays2
1 = 2 remember?

Can we split that string on ‘]’ to get the rest?

arrays3
that’s looking a lot better!

And we only want the first result for each so we filter it to that!

arraysF
And 0 = 1 (you’ll remember, don’t worry)

Scriptblock?

Slow your horses Shane, read about_Split again…

<String> -Split {<ScriptBlock>} [,<Max-substrings>]

Wait, it takes a script block? No way! Could this mean that we can split on two different delimiters?

multi1
It works! By Jove, it works!

Apply filter and…

multiF
Lovely!

Ranges!

Can we go further? Scriptblocks are great but can we do better? Remember with -Split we had to escape the ‘[‘ because of regex? Well let’s use an extremely basic form of regex. You’ve even seen it with SQL Server!

Final1
That’s right, ranges = [ ]
We filter this to get the 2nd result of each…

FinalF
I’ll admit [ \[ \] ] just looks strange
And we have our database names!

Moving on to next reflection

Let’s just compare the first script with the last script, shall we?


#Finished
foreach ($Db in $Database)
{
$Db.Substring($Db.IndexOf('[') + 1, ($Db.IndexOf(']') - $Db.IndexOf('[')) - 1)
}

#Finished
foreach ($Db in $Database)
{
($Db -split '[\[\]]')[1]
}

Personally I prefer the second one, brevity wins out overall, plus reading this it just seems easier to understand.

Okay…we’re taking the index of ‘[‘ adding 1…what?…in the string…but only until the index of ‘[‘…what?…minus the…what? the original index?!…taking away 1??? Then why are we adding it!!!

Versus

Okay…what the hell?…oh it’s a range, I’ve seen them with LIKE in T-SQL…splitting on ‘[‘ and ‘]’. Alright!

Reflection done. Time to improve.

Author: Shane O'Neill

DBA, T-SQL and PowerShell admirer, Food, Coffee, Whiskey (not necessarily in that order)...

Leave a Reply

%d bloggers like this: