Friday, March 30, 2012

MS SQLServer: Left/Right Join doesn't work when using aggregate functions

Hi.
I'm using MS SQLServer and I have noticed that when I use aggregate
functions on queries that use outer joins, the results varies depending
on the syntax that I use. If I use the syntax:
from table1 left join table2 on table1.a = table2.a
it doesn't seem to work properly, but if I use the syntax:
from table1, table2 where table1.a *= table2.a
it seems to work fine.
I would like to know if there's something that I'm doing wrong, if this
is some bug for which there is some solution, or if just the way it is.
In order to be more clear, I will show you some simple code that reflect
that.
Suppose we have the following 2 tables whith the corresponding data:
create table p1 ( id int primary key, s_name char(40) )
insert p1 values (1, 'Rick')
insert p1 values (2, 'Mary')
insert p1 values (3, 'Peter')
create table p2 (id int, region int, sales money, primary key (id,
region))
insert p2 values (1, 1, 2000)
insert p2 values (1, 2, 1000)
insert p2 values (1, 3, 1500)
insert p2 values (3, 1, 750)
insert p2 values (3, 2, 1000)
If I use a query without aggregate functions, both syntax return the
same result:
select p1.s_name, p2.sales
from p1 left outer join p2 on p1.id = p2.id
select p1.s_name, p2.sales
from p1, p2
where p1.id *= p2.id
Both of them return:
s_namesales
Rick2000.00
Rick1000.00
Rick1500.00
MaryNull
Peter750.00
Peter1000.00
But if I use Sum(), the results are different:
select p1.s_name, sum(p2.sales) as sales
from p1 left outer join p2 on p1.id = p2.id
group by p1.s_name
s_namesales
Rick4500.00
Peter1750.00
select p1.s_name, sum(p2.sales) as sales
from p1, p2
where p1.id *= p2.id
group by p1.s_name
s_namesales
Rick4500.00
MaryNull
Peter1750.00
In my opinion, the correct result is the last one, but the syntax that
Microsoft recommends to use is the other one.
Can someone explain that to me?
Thanks in advance.
Donald
*** Sent via Developersdex http://www.codecomments.com ***
You can get the result you want with:
SELECT s_name, SUM(sales) as sales FROM (
select p1.s_name, p2.sales
from p1 left outer join p2 on p1.id = p2.id
) a
group by s_name
-- or
SELECT p1.s_name, a.sales
FROM p1
LEFT OUTER JOIN (
SELECT id, SUM(sales) AS sales
FROM p2
GROUP BY id) a
ON p1.id = a.id
-- or the simplest (but somewhat worse performing on large sets), GROUP BY
ALL:
select p1.s_name, SUM(p2.sales) AS sales
from p1 left outer join p2 on p1.id = p2.id
GROUP BY ALL p1.s_name
Joe Celko give a rundown of the disadvantages of *= here:
http://groups.google.co.uk/groups?hl...phx.gbl&rnum=7
Jacco Schalkwijk
SQL Server MVP
"Donald" <don382w@.hotmail.com> wrote in message
news:%23$ZyTiKSFHA.2132@.TK2MSFTNGP14.phx.gbl...
> Hi.
> I'm using MS SQLServer and I have noticed that when I use aggregate
> functions on queries that use outer joins, the results varies depending
> on the syntax that I use. If I use the syntax:
> from table1 left join table2 on table1.a = table2.a
> it doesn't seem to work properly, but if I use the syntax:
> from table1, table2 where table1.a *= table2.a
> it seems to work fine.
> I would like to know if there's something that I'm doing wrong, if this
> is some bug for which there is some solution, or if just the way it is.
> In order to be more clear, I will show you some simple code that reflect
> that.
> Suppose we have the following 2 tables whith the corresponding data:
> create table p1 ( id int primary key, s_name char(40) )
> insert p1 values (1, 'Rick')
> insert p1 values (2, 'Mary')
> insert p1 values (3, 'Peter')
> create table p2 (id int, region int, sales money, primary key (id,
> region))
> insert p2 values (1, 1, 2000)
> insert p2 values (1, 2, 1000)
> insert p2 values (1, 3, 1500)
> insert p2 values (3, 1, 750)
> insert p2 values (3, 2, 1000)
> If I use a query without aggregate functions, both syntax return the
> same result:
> select p1.s_name, p2.sales
> from p1 left outer join p2 on p1.id = p2.id
> select p1.s_name, p2.sales
> from p1, p2
> where p1.id *= p2.id
> Both of them return:
> s_name sales
> -- --
> Rick 2000.00
> Rick 1000.00
> Rick 1500.00
> Mary Null
> Peter 750.00
> Peter 1000.00
> But if I use Sum(), the results are different:
> select p1.s_name, sum(p2.sales) as sales
> from p1 left outer join p2 on p1.id = p2.id
> group by p1.s_name
> s_name sales
> -- --
> Rick 4500.00
> Peter 1750.00
> select p1.s_name, sum(p2.sales) as sales
> from p1, p2
> where p1.id *= p2.id
> group by p1.s_name
> s_name sales
> -- --
> Rick 4500.00
> Mary Null
> Peter 1750.00
> In my opinion, the correct result is the last one, but the syntax that
> Microsoft recommends to use is the other one.
> Can someone explain that to me?
> Thanks in advance.
> Donald
>
> *** Sent via Developersdex http://www.codecomments.com ***
|||Yes, I am running SQL-Server 2000. I will check which service pack, if
any, is installed. I already downloaded the service pack 3a to install.
Thanks for everything,
Donald
*** Sent via Developersdex http://www.codecomments.com ***

No comments:

Post a Comment