SQL join format - nested inner joins

I have the following SQL statement in a legacy system I'm refactoring. It is an abbreviated view for the purposes of this question, just returning count(*) for the time being.

SELECT COUNT(*)
FROM Table1 INNER JOIN Table2 INNER JOIN Table3 ON Table2.Key = Table3.Key AND Table2.Key2 = Table3.Key2 ON Table1.DifferentKey = Table3.DifferentKey

It is generating a very large number of records and killing the system, but could someone please explain the syntax? And can this be expressed in any other way?

  • Table1 contains 419 rows
  • Table2 contains 3374 rows
  • Table3 contains 28182 rows

EDIT:

Suggested reformat

SELECT COUNT(*)
FROM Table1 INNER JOIN Table3 ON Table1.DifferentKey = Table3.DifferentKey INNER JOIN Table2 ON Table2.Key = Table3.Key AND Table2.Key2 = Table3.Key2
4

2 Answers

For readability, I restructured the query... starting with the apparent top-most level being Table1, which then ties to Table3, and then table3 ties to table2. Much easier to follow if you follow the chain of relationships.

Now, to answer your question. You are getting a large count as the result of a Cartesian product. For each record in Table1 that matches in Table3 you will have X * Y. Then, for each match between table3 and Table2 will have the same impact... Y * Z... So your result for just one possible ID in table 1 can have X * Y * Z records.

This is based on not knowing how the normalization or content is for your tables... if the key is a PRIMARY key or not..

Ex:
Table 1
DiffKey Other Val
1 X
1 Y
1 Z
Table 3
DiffKey Key Key2 Tbl3 Other
1 2 6 V
1 2 6 X
1 2 6 Y
1 2 6 Z
Table 2
Key Key2 Other Val
2 6 a
2 6 b
2 6 c
2 6 d
2 6 e

So, Table 1 joining to Table 3 will result (in this scenario) with 12 records (each in 1 joined with each in 3). Then, all that again times each matched record in table 2 (5 records)... total of 60 ( 3 tbl1 * 4 tbl3 * 5 tbl2 )count would be returned.

So, now, take that and expand based on your 1000's of records and you see how a messed-up structure could choke a cow (so-to-speak) and kill performance.

SELECT COUNT(*) FROM Table1 INNER JOIN Table3 ON Table1.DifferentKey = Table3.DifferentKey INNER JOIN Table2 ON Table3.Key =Table2.Key AND Table3.Key2 = Table2.Key2 
3

Since you've already received help on the query, I'll take a poke at your syntax question:

The first query employs some lesser-known ANSI SQL syntax which allows you to nest joins between the join and on clauses. This allows you to scope/tier your joins and probably opens up a host of other evil, arcane things.

Now, while a nested join cannot refer any higher in the join hierarchy than its immediate parent, joins above it or outside of its branch can refer to it... which is precisely what this ugly little guy is doing:

select count(*)
from Table1 as t1
join Table2 as t2 join Table3 as t3 on t2.Key = t3.Key -- join #1 and t2.Key2 = t3.Key2
on t1.DifferentKey = t3.DifferentKey -- join #2 

This looks a little confusing because join #2 is joining t1 to t2 without specifically referencing t2... however, it references t2 indirectly via t3 -as t3 is joined to t2 in join #1. While that may work, you may find the following a bit more (visually) linear and appealing:

select count(*)
from Table1 as t1 join Table3 as t3 join Table2 as t2 on t2.Key = t3.Key -- join #1 and t2.Key2 = t3.Key2 on t1.DifferentKey = t3.DifferentKey -- join #2

Personally, I've found that nesting in this fashion keeps my statements tidy by outlining each tier of the relationship hierarchy. As a side note, you don't need to specify inner. join is implicitly inner unless explicitly marked otherwise.

3

Your Answer

Sign up or log in

Sign up using Google Sign up using Facebook Sign up using Email and Password

Post as a guest

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

You Might Also Like