Words: 494
Time to read: ~ 3 minutes
A conversation…
…popped up in chat about JOINs
based on the following table samples.
The premise…
…was that you had to join both Table2 and Table3 to Table 1.
The catch…
…was you had to return all of Table1’s contents but only the contents of Table2 and Table3 where they match.
This meant that a composite join, a LEFT JOIN
with multiple ANDs
wasn’t going to cut it…
The Standards…
…were quickly brought out, with the use of a CTE and a derived table (which is kinda the same thing)
A Wild Right Join appeared…
…that worked as well which is strange since I’ve never seen a RIGHT JOIN
used in real life before apart from in Reporting.
That was it…
…until someone mentioned…
that’d be awesome if i can inner join two other tables instead of the table mentioned after FROM keyword
That sparked a memory from a random chapter in an Itzik Ben-Gan book I read about the use of parentheses in Joins to coerce the optimizer into which joins to evaluate first.
I couldn’t find a mention of it online but the closest I could get to was this link. (Warning, the links in this article are broken so you can’t see it in action)
However, now that it was in my head, I couldn’t help but show off 🙂
Asker: that’d be awesome if i can inner join two other tables instead of the table mentioned after FROM keyword
Me: …wait, what?
A: He’s asking
t1 left join t12
t1 left join t13
t12 inner join t13
M: em…it’s possible but it’s…iffy
A: i wanna learn it.
do your magic
As far as I can tell…
… it’s a hybrid of a FROM
clause and a Derived Table
where we self contain it.
Instead of giving the derived table an alias, we just create an ON
clause that links it back to our original FROM
table and it works.
This makes the optimizer evaluate inside the brackets first and then join that result back to Table1.
Funnily enough, although the statistics are the exact same as the derived table and CTE, the plan is different. This plan is actually simpler than the derived table/CTE which seems to need a “compute scalar” to extract the SELECT
statement.
Black Magic…
I’ve never seen it used in the wild and I’m not sure when you would use it. It seems to suggest if you need it that there’s something strange with your query. Still, it’s nice to know that it’s there.
Thanks Itzik, you made me seem smarter than I am. I appreciate that 😀
Wouldn’t we get the same results with an IF statement in the select statement?
SELECT
table1.col1 AS t1_col1,
IF(table2.col1 = table3.col1, table2.col1, NULL) AS t2_col1,
IF(table3.col1 = table2.col1, table3.col1, NULL) AS t3_col1
FROM
table1
LEFT JOIN table2 ON table2.col1 = table1.col1
LEFT JOIN table3 ON table3.col1 = table1.col1
Yup, that should work as well.
Think it introduces an extra Compute Scalar on the plan but that should only matter at scale.
Nice one!