Databases

NAVIGATION
CATEGORIES
REFERRENCE
LINKS
  • Does LEFT OUTER JOIN actually work in SQL Server 2000/2005

    8 answers - 2257 bytes - related search similar search Add To My Delicious Add To My Stumble Upon Add To My Google Mark Add To My Facebook Add To My Digg Add To My Reddit

    For a long time I've resisted using the ANSI syntax for joins because:
    1. They're VERY confusing when more than two tables are involved
    2. They never seem to work
    Now that SQL 2005 threatens the older, more logical syntax, I figured
    I'd give it another try, only to get the same results - It produces the
    wrong results.
    I have 2 tables: FACILITY_TYPES and CURSE_FACILITY_TYPES.
    FACILITY_TYPES is a master list of all types
    CURSE_FACILITY_TYPES is a list of types each course needs
    In this example, I want to list all the types, and indicate which are
    selected for a given course (in this case none). I run the following
    query and everything works as expected:
    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED
    FRM FACILITY_TYPES FT, CURSE_FACILITY_TYPES CF
    WHERE FT.FACILITY_TYPE_ID *= CF.FACILITY_TYPE_ID
    AND CF.CURSE_ID = 1
    - Returns 12 records, one for each facility type and a 0 in the
    SELECTED field - Perfect!
    Then I try to use the ANSI syntax (remember that CURSE_FACILITY_TYPES
    has no records for CURSE_ID = 1):
    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED
    FRM FACILITY_TYPES FT LEFT UTER JIN CURSE_FACILITY_TYPES CF N (
    FT.FACILITY_TYPE_ID = CF.FACILITY_TYPE_ID)
    WHERE CF.CURSE_ID = 1
    - Returns no records
    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED
    FRM FACILITY_TYPES FT RIGHT UTER JIN CURSE_FACILITY_TYPES CF N (
    FT.FACILITY_TYPE_ID = CF.FACILITY_TYPE_ID)
    WHERE CF.CURSE_ID = 1
    - Returns no records
    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED
    FRM CURSE_FACILITY_TYPES CF LEFT UTER JIN FACILITY_TYPES FT N
    ( CF.FACILITY_TYPE_ID = FT.FACILITY_TYPE_ID)
    WHERE CF.CURSE_ID = 1
    - Returns no records
    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED
    FRM CURSE_FACILITY_TYPES CF RIGHT UTER JIN FACILITY_TYPES FT N
    (CF.FACILITY_TYPE_ID = FT.FACILITY_TYPE_ID)
    WHERE CF.CURSE_ID = 1
    - Returns no records
    What am I missing?
    Thanks in advance.
  • No.1 | | 2173 bytes | |

    scubakiz wrote:

    For a long time I've resisted using the ANSI syntax for joins because:

    1. They're VERY confusing when more than two tables are involved
    2. They never seem to work

    Now that SQL 2005 threatens the older, more logical syntax, I figured
    I'd give it another try, only to get the same results - It produces the
    wrong results.

    I have 2 tables: FACILITY_TYPES and CURSE_FACILITY_TYPES.

    FACILITY_TYPES is a master list of all types
    CURSE_FACILITY_TYPES is a list of types each course needs

    In this example, I want to list all the types, and indicate which are
    selected for a given course (in this case none). I run the following
    query and everything works as expected:

    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED
    FRM FACILITY_TYPES FT, CURSE_FACILITY_TYPES CF
    WHERE FT.FACILITY_TYPE_ID *= CF.FACILITY_TYPE_ID
    AND CF.CURSE_ID = 1

    - Returns 12 records, one for each facility type and a 0 in the
    SELECTED field - Perfect!

    Then I try to use the ANSI syntax (remember that CURSE_FACILITY_TYPES
    has no records for CURSE_ID = 1):

    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED
    FRM FACILITY_TYPES FT LEFT UTER JIN CURSE_FACILITY_TYPES CF N (
    FT.FACILITY_TYPE_ID = CF.FACILITY_TYPE_ID)
    WHERE CF.CURSE_ID = 1

    - Returns no records

    Your formulation is incorrect. For standard syntax, the FRM clause is conceptually
    executed before the WHERE clause. That is, the WHERE clause is testing the result of the
    outer join. This means that CURSE_ID is either NULL or <1; it is never 1. Thus, no
    rows are retrieved.

    You need to do the (CURSE_ID = 1) predicate before the outer join. I don't know if SQL
    Server supports queries nested in the FRM clause, but using standard SQL, you would do:

    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED
    FRM FACILITY_TYPES FT LEFT UTER JIN
    (SELECT * FRM CURSE_FACILITY_TYPES WHERE CURSE_ID = 1) CF
    N (FT.FACILITY_TYPE_ID = CF.FACILITY_TYPE_ID)
  • No.2 | | 3943 bytes | |

    The old *= syntax (and other extended equality vendor syntax) is the
    one with all the flaws and lack of flexibility.

    Here is how UTER JINs work in SQL-92. Assume you are given:

    Table1 Table2
    a b a c

    1 w 1 r
    2 x 2 s
    3 y 3 t
    4 z

    and the outer join expression:

    Table1
    LEFT UTER JIN
    Table2
    N Table1.a = Table2.a <== join condition
    AND Table2.c = 't'; <== single table condition

    We call Table1 the "preserved table" and Table2 the "unpreserved table"
    in the query. What I am going to give you is a little different, but
    equivalent to the ANSI/IS standards.

    1) We build the CRSS JIN of the two tables. Scan each row in the
    result set.

    2) If the predicate tests TRUE for that row, then you keep it. You also
    remove all rows derived from it from the CRSS JIN

    3) If the predicate tests FALSE or UNKNWN for that row, then keep the
    columns from the preserved table, convert all the columns from the
    unpreserved table to NULLs and remove the duplicates.

    So let us execute this by hand:

    Let @ = passed the first predicate
    Let * = passed the second predicate

    Table1 CRSS JIN Table2
    a b a c

    1 w 1 r @
    1 w 2 s
    1 w 3 t *
    2 x 1 r
    2 x 2 s @
    2 x 3 t *
    3 y 1 r
    3 y 2 s
    3 y 3 t @* <== the TRUE set
    4 z 1 r
    4 z 2 s
    4 z 3 t *

    Table1 LEFT UTER JIN Table2
    a b a c

    3 y 3 t <= only TRUE row

    1 w NULL NULL Sets of duplicates
    1 w NULL NULL
    1 w NULL NULL

    2 x NULL NULL
    2 x NULL NULL
    2 x NULL NULL
    3 y NULL NULL <== derived from the TRUE set - Remove
    3 y NULL NULL

    4 z NULL NULL
    4 z NULL NULL
    4 z NULL NULL

    the final results:

    Table1 LEFT UTER JIN Table2
    a b a c

    1 w NULL NULL
    2 x NULL NULL
    3 y 3 t
    4 z NULL NULL

    The basic rule is that every row in the preserved table is represented
    in the results in at least one result row.

    There are limitations and very serious problems with the extended
    equality version of an outer join used in some diseased mutant
    products. Consider the two Chris Date tables

    Suppliers SupParts
    supno supno partno qty

    S1 S1 P1 100
    S2 S1 P2 250
    S3 S2 P1 100
    S2 P2 250

    and let's do an extended equality outer join like this:

    SELECT *
    FRM Supplier, SupParts
    WHERE Supplier.supno *= SupParts.supno
    AND qty < 200;

    If I do the outer first, I get:

    Suppliers LJ SupParts
    supno supno partno qty

    S1 S1 P1 100
    S1 S1 P2 250
    S2 S2 P1 100
    S2 S2 P2 250
    S3 NULL NULL NULL

    Then I apply the (qty < 200) predicate and get

    Suppliers LJ SupParts
    supno supno partno qty

    S1 S1 P1 100
    S2 S2 P1 100

    Doing it in the opposite order

    Suppliers LJ SupParts
    supno supno partno qty

    S1 S1 P1 100
    S2 S2 P1 100
    S3 NULL NULL NULL

    Sybase does it one way, does it the other and Centura (nee
    Gupta) lets you pick which one -- the worst of both non-standard
    worlds! In SQL-92, you have a choice and can force the order of
    execution. Either do the predicates after the join

    SELECT *
    FRM Supplier
    LEFT UTER JIN
    SupParts
    N Supplier.supno = SupParts.supno
    WHERE qty < 200;

    or do it in the joining:

    SELECT *
    FRM Supplier
    LEFT UTER JIN
    SupParts
    N Supplier.supno = SupParts.supno
    AND qty < 200;

    Another problem is that you cannot show the same table as preserved and
    unpreserved in the extended equality version, but it is easy in SQL-92.
    For example to find the students who have taken Math 101 and might
    have taken Math 102:

    SELECT C1.student, C1.math, C2.math
    FRM (SELECT * FRM Courses WHERE math = 101) AS C1
    LEFT UTER JIN
    (SELECT * FRM Courses WHERE math = 102) AS C2
    N C1.student = C2.student;

  • No.3 | | 1162 bytes | |

    "Lee Fesperman" <firstsql@ix.netcom.comwrote in message
    news:43AA6043.584D@ix.netcom.com
    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED
    FRM FACILITY_TYPES FT LEFT UTER JIN
    (SELECT * FRM CURSE_FACILITY_TYPES WHERE CURSE_ID = 1) CF
    N (FT.FACILITY_TYPE_ID = CF.FACILITY_TYPE_ID)

    This can also be written without using a subquery:

    SELECT FT.FACILITY_TYPE,
    CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1 END AS SELECTED
    FRM FACILITY_TYPES FT LEFT UTER JIN
    CURSE_FACILITY_TYPES CF
    N (FT.FACILITY_TYPE_ID = CF.FACILITY_TYPE_ID AND CF.CURSE_ID =
    1)

    That is, the join matches a row in CF if that row has a matching
    facility_type_id and also has course_id = 1. , NULLs are returned
    for fields in CF as though no row matched, but we still get to see all the
    rows from FT.

    FWIW, I find the SQL-92 UTER JIN syntax more complex than SQL-89 style
    too, but it does more, so it's natural for it to be more complex. And to
    me, it's more clear than proprietary outer join syntax with "(+)" or "*="
    operators.

    Regards,
    Bill K.

  • No.4 | | 1269 bytes | |

    Bill Karwin wrote:

    "Lee Fesperman" <firstsql@ix.netcom.comwrote in message
    news:43AA6043.584D@ix.netcom.com
    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED
    FRM FACILITY_TYPES FT LEFT UTER JIN
    (SELECT * FRM CURSE_FACILITY_TYPES WHERE CURSE_ID = 1) CF
    N (FT.FACILITY_TYPE_ID = CF.FACILITY_TYPE_ID)

    This can also be written without using a subquery:

    SELECT FT.FACILITY_TYPE,
    CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1 END AS SELECTED
    FRM FACILITY_TYPES FT LEFT UTER JIN
    CURSE_FACILITY_TYPES CF
    N (FT.FACILITY_TYPE_ID = CF.FACILITY_TYPE_ID AND CF.CURSE_ID =
    1)

    Yes, right. For some reason, I had it stuck in my head that N clauses could only have
    'joining' predicates (comparisons between columns), but there is no such restriction in
    SQL92. I was thinking of the DBC extensions which do have that restriction.

    FWIW, I find the SQL-92 UTER JIN syntax more complex than SQL-89 style
    too, but it does more, so it's natural for it to be more complex. And to
    me, it's more clear than proprietary outer join syntax with "(+)" or "*="
    operators.

    I'm not sure what you mean here since SQL89 doesn't have an Join syntax.
  • No.5 | | 671 bytes | |

    "Lee Fesperman" <firstsql@ix.netcom.comwrote in message
    news:43AB1D3B.432E@ix.netcom.com
    Bill Karwin wrote:
    >FWIW, I find the SQL-92 UTER JIN syntax more complex than SQL-89 style
    >

    I'm not sure what you mean here since SQL89 doesn't have an Join
    syntax.

    You're absolutely right. I should have said that that I find the SQL-92
    JIN syntax (not just the outer joins) more complex than the SQL-89 style of
    join syntax (using WHERE clauses). But the complexity is justified by the
    additional useful functionality that the SQL-92 syntax provides.

    Regards,
    Bill K.

  • No.6 | | 1975 bytes | |

    Thank you all very much for your help. I think I see how this syntax
    can offer more flexibility. I already found a solution to a long
    standing problem that's been a challenge using the old syntax. For
    example, if I wanted to extend the UTER join and get supporting
    information from a thrid table if a join existed, I'd have to resort to
    creative sub-queries. This does not work:

    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED, C.CURSE_NAME, C.CURSE_NUMBER
    FRM FACILITY_TYPES FT, CURSE_FACILITY_TYPES CF, CURSES C
    WHERE FT.FACILITY_TYPE_ID *= CF.FACILITY_TYPE_ID
    AND CF.CURSE_ID = C.CURSE_ID
    AND C.CURSE_ID = 1

    In SQL 2005, it returns only a single row for course 1 (I added a
    record for course 1). I seem to remember in SQL 2000, I would just get
    an error about an inner join not being allowed inside an outer join.
    To get this to work, I'd have to write UGLY SQL like this:

    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED,
    (SELECT CURSE_NAME FRM CURSES WHERE CURSE_ID = CF.CURSE_ID)
    AS CURSE_NAME,
    (SELECT CURSE_NUMBER FRM CURSES WHERE CURSE_ID = CF.CURSE_ID)
    AS CURSE_NUMBER
    FRM FACILITY_TYPES FT, CURSE_FACILITY_TYPES CF
    WHERE FT.FACILITY_TYPE_ID *= CF.FACILITY_TYPE_ID
    AND CF.CURSE_ID = 1

    Not pleasant or effecient!

    However, using the examples you all provided (thank you again), I can
    easily extend them to create:

    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED,
    C.CURSE_NAME, C.CURSE_NUMBER
    FRM FACILITY_TYPES FT
    LEFT UTER JIN CURSE_FACILITY_TYPES CF
    N (FT.FACILITY_TYPE_ID = CF.FACILITY_TYPE_ID AND
    CF.CURSE_ID = 1)
    LEFT UTER JIN CURSES C
    N (CF.CURSE_ID = C.CURSE_ID)

    Which gives me NULLs when no match exists, but the course name and
    number when there is a match. This is great!

  • No.7 | | 4183 bytes | |

    scuba,
    As others have said, the ansi syntax is much more powerful. Be careful
    not to cripple the server by doing lots of left joins to 'get at'
    information. If coursefacilitytypes always has a matching record in
    Courses, then you could do your query like this:

    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED,
    C.CURSE_NAME, C.CURSE_NUMBER
    FRM FACILITY_TYPES FT
    LEFT JIN CURSE_FACILITY_TYPES CF
    JIN CURSES C
    N CF.CURSE_ID = C.CURSE_ID and
    cf.course_id=1
    N FT.FACILITY_TYPE_ID = CF.FACILITY_TYPE_ID

    I'm pretty certain the the coursid=1 could have stayed at the bottom or
    it can go where I put it. you unlock the power of doing a left
    join to a 'quantity', where the quantity is some joined set of tables
    with exact criteria, then you will be writing all kinds of queries
    that have stumped everyone for years.

    At my job, no one knew how to select employees and their spouses, but
    be sure to include employees with no spouse.

    people left join (people as familymembers join personalinfo on
    familymembers.A=personalinfo.B and
    personalinfo.relationship='spouse')
    on people.A=familymembers.B

    The A, B are just joining keys. If you don't put the 'spouse' stuff in
    an inner join, you end up getting all the kids as well, their
    relationship will be NULL, the spouse will have relationship 'spouse'.
    This can really screw up the sums and counts in any report you are
    making. You need to pinpoint the spouse and only the spouse's
    what ansi syntax lets you do with ease.

    At my job, my boss gave everyone SQL for Smarties by Celko (kickbacks
    appreciated, but not necessary), and I'd recommend it. I've written
    queries for all kinds of things in 20 minutes while coworkers are
    stumped and then spend hours and days writing sprocs and fighting with
    other parts of my company to get the sprocs loaded into customers
    systems, etc.

    I use Sql Server 2000 and long ago stopped typing 'inner' and 'outer'.
    Saves a wee bit of time.

    scubakiz wrote:
    Thank you all very much for your help. I think I see how this syntax
    can offer more flexibility. I already found a solution to a long
    standing problem that's been a challenge using the old syntax. For
    example, if I wanted to extend the UTER join and get supporting
    information from a thrid table if a join existed, I'd have to resort to
    creative sub-queries. This does not work:

    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED, C.CURSE_NAME, C.CURSE_NUMBER
    FRM FACILITY_TYPES FT, CURSE_FACILITY_TYPES CF, CURSES C
    WHERE FT.FACILITY_TYPE_ID *= CF.FACILITY_TYPE_ID
    AND CF.CURSE_ID = C.CURSE_ID
    AND C.CURSE_ID = 1

    In SQL 2005, it returns only a single row for course 1 (I added a
    record for course 1). I seem to remember in SQL 2000, I would just get
    an error about an inner join not being allowed inside an outer join.
    To get this to work, I'd have to write UGLY SQL like this:

    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED,
    (SELECT CURSE_NAME FRM CURSES WHERE CURSE_ID = CF.CURSE_ID)
    AS CURSE_NAME,
    (SELECT CURSE_NUMBER FRM CURSES WHERE CURSE_ID = CF.CURSE_ID)
    AS CURSE_NUMBER
    FRM FACILITY_TYPES FT, CURSE_FACILITY_TYPES CF
    WHERE FT.FACILITY_TYPE_ID *= CF.FACILITY_TYPE_ID
    AND CF.CURSE_ID = 1

    Not pleasant or effecient!

    However, using the examples you all provided (thank you again), I can
    easily extend them to create:

    SELECT FT.FACILITY_TYPE, CASE WHEN CF.CURSE_ID IS NULL THEN 0 ELSE 1
    END AS SELECTED,
    C.CURSE_NAME, C.CURSE_NUMBER
    FRM FACILITY_TYPES FT
    LEFT UTER JIN CURSE_FACILITY_TYPES CF
    N (FT.FACILITY_TYPE_ID = CF.FACILITY_TYPE_ID AND
    CF.CURSE_ID = 1)
    LEFT UTER JIN CURSES C
    N (CF.CURSE_ID = C.CURSE_ID)

    Which gives me NULLs when no match exists, but the course name and
    number when there is a match. This is great!

  • No.8 | | 834 bytes | |

    Lee Fesperman <firstsql@ix.netcom.comwrites:

    You need to do the (CURSE_ID = 1) predicate before the outer join. I don't know if SQL
    Server supports queries nested in the FRM clause, but using standard SQL, you would do:

    Yes, Microsoft SQL Server 2000 supports subqueries in the FRM clause.
    We use the SQL-92 syntax a lot at Eaton Vance where I work. Large
    queries are more readable because it's easier to sort out the filtering
    conditions from the join conditions.

    What 2000 doesn't have and hopefully 2005 has is JIN USING and tuple
    comparison. The lack of these features in 2000 makes compound primary
    keys tedious to use, and as a result a lot of our database designs do
    not use the best primary keys.

    of my employment I use PostgreSQL exclusively.

Re: Does LEFT OUTER JOIN actually work in SQL Server 2000/2005


max 4000 letters.
Your nickname that display:
In order to stop the spam: 0 + 9 =
QUESTION ON "Databases"

EMSDN.COM