Showing posts with label queries. Show all posts
Showing posts with label queries. Show all posts

Friday, March 30, 2012

MS SQLServer: Left/Right Join doesn't work when using aggregate functions

Hi.
I'm using MS SQLServer and I have noticed that when I use aggregate
functions on queries that use outer joins, the results varies depending
on the syntax that I use. If I use the syntax:
from table1 left join table2 on table1.a = table2.a
it doesn't seem to work properly, but if I use the syntax:
from table1, table2 where table1.a *= table2.a
it seems to work fine.
I would like to know if there's something that I'm doing wrong, if this
is some bug for which there is some solution, or if just the way it is.
In order to be more clear, I will show you some simple code that reflect
that.
Suppose we have the following 2 tables whith the corresponding data:
create table p1 ( id int primary key, s_name char(40) )
insert p1 values (1, 'Rick')
insert p1 values (2, 'Mary')
insert p1 values (3, 'Peter')
create table p2 (id int, region int, sales money, primary key (id,
region))
insert p2 values (1, 1, 2000)
insert p2 values (1, 2, 1000)
insert p2 values (1, 3, 1500)
insert p2 values (3, 1, 750)
insert p2 values (3, 2, 1000)
If I use a query without aggregate functions, both syntax return the
same result:
select p1.s_name, p2.sales
from p1 left outer join p2 on p1.id = p2.id
select p1.s_name, p2.sales
from p1, p2
where p1.id *= p2.id
Both of them return:
s_namesales
Rick2000.00
Rick1000.00
Rick1500.00
MaryNull
Peter750.00
Peter1000.00
But if I use Sum(), the results are different:
select p1.s_name, sum(p2.sales) as sales
from p1 left outer join p2 on p1.id = p2.id
group by p1.s_name
s_namesales
Rick4500.00
Peter1750.00
select p1.s_name, sum(p2.sales) as sales
from p1, p2
where p1.id *= p2.id
group by p1.s_name
s_namesales
Rick4500.00
MaryNull
Peter1750.00
In my opinion, the correct result is the last one, but the syntax that
Microsoft recommends to use is the other one.
Can someone explain that to me?
Thanks in advance.
Donald
*** Sent via Developersdex http://www.codecomments.com ***
You can get the result you want with:
SELECT s_name, SUM(sales) as sales FROM (
select p1.s_name, p2.sales
from p1 left outer join p2 on p1.id = p2.id
) a
group by s_name
-- or
SELECT p1.s_name, a.sales
FROM p1
LEFT OUTER JOIN (
SELECT id, SUM(sales) AS sales
FROM p2
GROUP BY id) a
ON p1.id = a.id
-- or the simplest (but somewhat worse performing on large sets), GROUP BY
ALL:
select p1.s_name, SUM(p2.sales) AS sales
from p1 left outer join p2 on p1.id = p2.id
GROUP BY ALL p1.s_name
Joe Celko give a rundown of the disadvantages of *= here:
http://groups.google.co.uk/groups?hl...phx.gbl&rnum=7
Jacco Schalkwijk
SQL Server MVP
"Donald" <don382w@.hotmail.com> wrote in message
news:%23$ZyTiKSFHA.2132@.TK2MSFTNGP14.phx.gbl...
> Hi.
> I'm using MS SQLServer and I have noticed that when I use aggregate
> functions on queries that use outer joins, the results varies depending
> on the syntax that I use. If I use the syntax:
> from table1 left join table2 on table1.a = table2.a
> it doesn't seem to work properly, but if I use the syntax:
> from table1, table2 where table1.a *= table2.a
> it seems to work fine.
> I would like to know if there's something that I'm doing wrong, if this
> is some bug for which there is some solution, or if just the way it is.
> In order to be more clear, I will show you some simple code that reflect
> that.
> Suppose we have the following 2 tables whith the corresponding data:
> create table p1 ( id int primary key, s_name char(40) )
> insert p1 values (1, 'Rick')
> insert p1 values (2, 'Mary')
> insert p1 values (3, 'Peter')
> create table p2 (id int, region int, sales money, primary key (id,
> region))
> insert p2 values (1, 1, 2000)
> insert p2 values (1, 2, 1000)
> insert p2 values (1, 3, 1500)
> insert p2 values (3, 1, 750)
> insert p2 values (3, 2, 1000)
> If I use a query without aggregate functions, both syntax return the
> same result:
> select p1.s_name, p2.sales
> from p1 left outer join p2 on p1.id = p2.id
> select p1.s_name, p2.sales
> from p1, p2
> where p1.id *= p2.id
> Both of them return:
> s_name sales
> -- --
> Rick 2000.00
> Rick 1000.00
> Rick 1500.00
> Mary Null
> Peter 750.00
> Peter 1000.00
> But if I use Sum(), the results are different:
> select p1.s_name, sum(p2.sales) as sales
> from p1 left outer join p2 on p1.id = p2.id
> group by p1.s_name
> s_name sales
> -- --
> Rick 4500.00
> Peter 1750.00
> select p1.s_name, sum(p2.sales) as sales
> from p1, p2
> where p1.id *= p2.id
> group by p1.s_name
> s_name sales
> -- --
> Rick 4500.00
> Mary Null
> Peter 1750.00
> In my opinion, the correct result is the last one, but the syntax that
> Microsoft recommends to use is the other one.
> Can someone explain that to me?
> Thanks in advance.
> Donald
>
> *** Sent via Developersdex http://www.codecomments.com ***
|||Yes, I am running SQL-Server 2000. I will check which service pack, if
any, is installed. I already downloaded the service pack 3a to install.
Thanks for everything,
Donald
*** Sent via Developersdex http://www.codecomments.com ***

MS SQLServer: Left/Right Join doesn't work when using aggregate functions

Hi.
I'm using MS SQLServer and I have noticed that when I use aggregate
functions on queries that use outer joins, the results varies depending
on the syntax that I use. If I use the syntax:
from table1 left join table2 on table1.a = table2.a
it doesn't seem to work properly, but if I use the syntax:
from table1, table2 where table1.a *= table2.a
it seems to work fine.
I would like to know if there's something that I'm doing wrong, if this
is some bug for which there is some solution, or if just the way it is.
In order to be more clear, I will show you some simple code that reflect
that.
Suppose we have the following 2 tables whith the corresponding data:
create table p1 ( id int primary key, s_name char(40) )
insert p1 values (1, 'Rick')
insert p1 values (2, 'Mary')
insert p1 values (3, 'Peter')
create table p2 (id int, region int, sales money, primary key (id,
region))
insert p2 values (1, 1, 2000)
insert p2 values (1, 2, 1000)
insert p2 values (1, 3, 1500)
insert p2 values (3, 1, 750)
insert p2 values (3, 2, 1000)
If I use a query without aggregate functions, both syntax return the
same result:
select p1.s_name, p2.sales
from p1 left outer join p2 on p1.id = p2.id
select p1.s_name, p2.sales
from p1, p2
where p1.id *= p2.id
Both of them return:
s_name sales
-- --
Rick 2000.00
Rick 1000.00
Rick 1500.00
Mary Null
Peter 750.00
Peter 1000.00
But if I use Sum(), the results are different:
select p1.s_name, sum(p2.sales) as sales
from p1 left outer join p2 on p1.id = p2.id
group by p1.s_name
s_name sales
-- --
Rick 4500.00
Peter 1750.00
select p1.s_name, sum(p2.sales) as sales
from p1, p2
where p1.id *= p2.id
group by p1.s_name
s_name sales
-- --
Rick 4500.00
Mary Null
Peter 1750.00
In my opinion, the correct result is the last one, but the syntax that
Microsoft recommends to use is the other one.
Can someone explain that to me?
Thanks in advance.
Donald
*** Sent via Developersdex http://www.developersdex.com ***You can get the result you want with:
SELECT s_name, SUM(sales) as sales FROM (
select p1.s_name, p2.sales
from p1 left outer join p2 on p1.id = p2.id
) a
group by s_name
-- or
SELECT p1.s_name, a.sales
FROM p1
LEFT OUTER JOIN (
SELECT id, SUM(sales) AS sales
FROM p2
GROUP BY id) a
ON p1.id = a.id
-- or the simplest (but somewhat worse performing on large sets), GROUP BY
ALL:
select p1.s_name, SUM(p2.sales) AS sales
from p1 left outer join p2 on p1.id = p2.id
GROUP BY ALL p1.s_name
Joe Celko give a rundown of the disadvantages of *= here:
http://groups.google.co.uk/groups?hl=en&lr=&selm=%23Uogl7zREHA.628%40TK2MSFTNGP11.phx.gbl&rnum=7
--
Jacco Schalkwijk
SQL Server MVP
"Donald" <don382w@.hotmail.com> wrote in message
news:%23$ZyTiKSFHA.2132@.TK2MSFTNGP14.phx.gbl...
> Hi.
> I'm using MS SQLServer and I have noticed that when I use aggregate
> functions on queries that use outer joins, the results varies depending
> on the syntax that I use. If I use the syntax:
> from table1 left join table2 on table1.a = table2.a
> it doesn't seem to work properly, but if I use the syntax:
> from table1, table2 where table1.a *= table2.a
> it seems to work fine.
> I would like to know if there's something that I'm doing wrong, if this
> is some bug for which there is some solution, or if just the way it is.
> In order to be more clear, I will show you some simple code that reflect
> that.
> Suppose we have the following 2 tables whith the corresponding data:
> create table p1 ( id int primary key, s_name char(40) )
> insert p1 values (1, 'Rick')
> insert p1 values (2, 'Mary')
> insert p1 values (3, 'Peter')
> create table p2 (id int, region int, sales money, primary key (id,
> region))
> insert p2 values (1, 1, 2000)
> insert p2 values (1, 2, 1000)
> insert p2 values (1, 3, 1500)
> insert p2 values (3, 1, 750)
> insert p2 values (3, 2, 1000)
> If I use a query without aggregate functions, both syntax return the
> same result:
> select p1.s_name, p2.sales
> from p1 left outer join p2 on p1.id = p2.id
> select p1.s_name, p2.sales
> from p1, p2
> where p1.id *= p2.id
> Both of them return:
> s_name sales
> -- --
> Rick 2000.00
> Rick 1000.00
> Rick 1500.00
> Mary Null
> Peter 750.00
> Peter 1000.00
> But if I use Sum(), the results are different:
> select p1.s_name, sum(p2.sales) as sales
> from p1 left outer join p2 on p1.id = p2.id
> group by p1.s_name
> s_name sales
> -- --
> Rick 4500.00
> Peter 1750.00
> select p1.s_name, sum(p2.sales) as sales
> from p1, p2
> where p1.id *= p2.id
> group by p1.s_name
> s_name sales
> -- --
> Rick 4500.00
> Mary Null
> Peter 1750.00
> In my opinion, the correct result is the last one, but the syntax that
> Microsoft recommends to use is the other one.
> Can someone explain that to me?
> Thanks in advance.
> Donald
>
> *** Sent via Developersdex http://www.developersdex.com ***|||Donald,
Are you running SQL-Server 2000? Make sure you have the latest service
pack installed (SP3a).
If I run the query
select p1.s_name, sum(p2.sales) as sales
from p1 left outer join p2 on p1.id = p2.id
group by p1.s_name
it correctly returns
s_name sales
--- --
Mary NULL
Peter 1750.0000
Rick 4500.0000
(3 row(s) affected)
Warning: Null value is eliminated by an aggregate or other SET
operation.
HTH,
Gert-Jan
Donald wrote:
> Hi.
> I'm using MS SQLServer and I have noticed that when I use aggregate
> functions on queries that use outer joins, the results varies depending
> on the syntax that I use. If I use the syntax:
> from table1 left join table2 on table1.a = table2.a
> it doesn't seem to work properly, but if I use the syntax:
> from table1, table2 where table1.a *= table2.a
> it seems to work fine.
> I would like to know if there's something that I'm doing wrong, if this
> is some bug for which there is some solution, or if just the way it is.
> In order to be more clear, I will show you some simple code that reflect
> that.
> Suppose we have the following 2 tables whith the corresponding data:
> create table p1 ( id int primary key, s_name char(40) )
> insert p1 values (1, 'Rick')
> insert p1 values (2, 'Mary')
> insert p1 values (3, 'Peter')
> create table p2 (id int, region int, sales money, primary key (id,
> region))
> insert p2 values (1, 1, 2000)
> insert p2 values (1, 2, 1000)
> insert p2 values (1, 3, 1500)
> insert p2 values (3, 1, 750)
> insert p2 values (3, 2, 1000)
> If I use a query without aggregate functions, both syntax return the
> same result:
> select p1.s_name, p2.sales
> from p1 left outer join p2 on p1.id = p2.id
> select p1.s_name, p2.sales
> from p1, p2
> where p1.id *= p2.id
> Both of them return:
> s_name sales
> -- --
> Rick 2000.00
> Rick 1000.00
> Rick 1500.00
> Mary Null
> Peter 750.00
> Peter 1000.00
> But if I use Sum(), the results are different:
> select p1.s_name, sum(p2.sales) as sales
> from p1 left outer join p2 on p1.id = p2.id
> group by p1.s_name
> s_name sales
> -- --
> Rick 4500.00
> Peter 1750.00
> select p1.s_name, sum(p2.sales) as sales
> from p1, p2
> where p1.id *= p2.id
> group by p1.s_name
> s_name sales
> -- --
> Rick 4500.00
> Mary Null
> Peter 1750.00
> In my opinion, the correct result is the last one, but the syntax that
> Microsoft recommends to use is the other one.
> Can someone explain that to me?
> Thanks in advance.
> Donald
> *** Sent via Developersdex http://www.developersdex.com ***

MS SQLServer: Left/Right Join doesn't work when using aggregate functions

Hi.
I'm using MS SQLServer and I have noticed that when I use aggregate
functions on queries that use outer joins, the results varies depending
on the syntax that I use. If I use the syntax:
from table1 left join table2 on table1.a = table2.a
it doesn't seem to work properly, but if I use the syntax:
from table1, table2 where table1.a *= table2.a
it seems to work fine.
I would like to know if there's something that I'm doing wrong, if this
is some bug for which there is some solution, or if just the way it is.
In order to be more clear, I will show you some simple code that reflect
that.
Suppose we have the following 2 tables whith the corresponding data:
create table p1 ( id int primary key, s_name char(40) )
insert p1 values (1, 'Rick')
insert p1 values (2, 'Mary')
insert p1 values (3, 'Peter')
create table p2 (id int, region int, sales money, primary key (id,
region))
insert p2 values (1, 1, 2000)
insert p2 values (1, 2, 1000)
insert p2 values (1, 3, 1500)
insert p2 values (3, 1, 750)
insert p2 values (3, 2, 1000)
If I use a query without aggregate functions, both syntax return the
same result:
select p1.s_name, p2.sales
from p1 left outer join p2 on p1.id = p2.id
select p1.s_name, p2.sales
from p1, p2
where p1.id *= p2.id
Both of them return:
s_name sales
-- --
Rick 2000.00
Rick 1000.00
Rick 1500.00
Mary Null
Peter 750.00
Peter 1000.00
But if I use Sum(), the results are different:
select p1.s_name, sum(p2.sales) as sales
from p1 left outer join p2 on p1.id = p2.id
group by p1.s_name
s_name sales
-- --
Rick 4500.00
Peter 1750.00
select p1.s_name, sum(p2.sales) as sales
from p1, p2
where p1.id *= p2.id
group by p1.s_name
s_name sales
-- --
Rick 4500.00
Mary Null
Peter 1750.00
In my opinion, the correct result is the last one, but the syntax that
Microsoft recommends to use is the other one.
Can someone explain that to me?
Thanks in advance.
Donald
*** Sent via Developersdex http://www.codecomments.com ***You can get the result you want with:
SELECT s_name, SUM(sales) as sales FROM (
select p1.s_name, p2.sales
from p1 left outer join p2 on p1.id = p2.id
) a
group by s_name
-- or
SELECT p1.s_name, a.sales
FROM p1
LEFT OUTER JOIN (
SELECT id, SUM(sales) AS sales
FROM p2
GROUP BY id) a
ON p1.id = a.id
-- or the simplest (but somewhat worse performing on large sets), GROUP BY
ALL:
select p1.s_name, SUM(p2.sales) AS sales
from p1 left outer join p2 on p1.id = p2.id
GROUP BY ALL p1.s_name
Joe Celko give a rundown of the disadvantages of *= here:
phx.gbl&rnum=7" target="_blank">http://groups.google.co.uk/groups?h...
phx.gbl&rnum=7
Jacco Schalkwijk
SQL Server MVP
"Donald" <don382w@.hotmail.com> wrote in message
news:%23$ZyTiKSFHA.2132@.TK2MSFTNGP14.phx.gbl...
> Hi.
> I'm using MS SQLServer and I have noticed that when I use aggregate
> functions on queries that use outer joins, the results varies depending
> on the syntax that I use. If I use the syntax:
> from table1 left join table2 on table1.a = table2.a
> it doesn't seem to work properly, but if I use the syntax:
> from table1, table2 where table1.a *= table2.a
> it seems to work fine.
> I would like to know if there's something that I'm doing wrong, if this
> is some bug for which there is some solution, or if just the way it is.
> In order to be more clear, I will show you some simple code that reflect
> that.
> Suppose we have the following 2 tables whith the corresponding data:
> create table p1 ( id int primary key, s_name char(40) )
> insert p1 values (1, 'Rick')
> insert p1 values (2, 'Mary')
> insert p1 values (3, 'Peter')
> create table p2 (id int, region int, sales money, primary key (id,
> region))
> insert p2 values (1, 1, 2000)
> insert p2 values (1, 2, 1000)
> insert p2 values (1, 3, 1500)
> insert p2 values (3, 1, 750)
> insert p2 values (3, 2, 1000)
> If I use a query without aggregate functions, both syntax return the
> same result:
> select p1.s_name, p2.sales
> from p1 left outer join p2 on p1.id = p2.id
> select p1.s_name, p2.sales
> from p1, p2
> where p1.id *= p2.id
> Both of them return:
> s_name sales
> -- --
> Rick 2000.00
> Rick 1000.00
> Rick 1500.00
> Mary Null
> Peter 750.00
> Peter 1000.00
> But if I use Sum(), the results are different:
> select p1.s_name, sum(p2.sales) as sales
> from p1 left outer join p2 on p1.id = p2.id
> group by p1.s_name
> s_name sales
> -- --
> Rick 4500.00
> Peter 1750.00
> select p1.s_name, sum(p2.sales) as sales
> from p1, p2
> where p1.id *= p2.id
> group by p1.s_name
> s_name sales
> -- --
> Rick 4500.00
> Mary Null
> Peter 1750.00
> In my opinion, the correct result is the last one, but the syntax that
> Microsoft recommends to use is the other one.
> Can someone explain that to me?
> Thanks in advance.
> Donald
>
> *** Sent via Developersdex http://www.codecomments.com ***|||Yes, I am running SQL-Server 2000. I will check which service pack, if
any, is installed. I already downloaded the service pack 3a to install.
Thanks for everything,
Donald
*** Sent via Developersdex http://www.codecomments.com ***

Monday, March 26, 2012

ms sql server client

probably a very easy question to answer:
which client component do i have to use in order to send queries to a remote sql server and receive the resultset, how can i start this utility?
Hi,
See linked servers in books online. Using Linked servers you could connect
to the remote sql server standing from local sql server.
Only thing is you have to use four part names to refer the table.
Linkedserver.dbname.tableowner.table_name
In otherwise you just want to query a remote server then:-
1. Create a alias server using CLeint network utility
2. Mention TCP IP as protocol , Ip address and port number
3. Using this alis name you could connect to to remote server and query the
remote server using query analyzer or OSQL utility
Thanks
Hari
MCDBA
"cajota" <cajota@.discussions.microsoft.com> wrote in message
news:E9DF0A05-3225-4B3E-BEC3-E52AFD618FE5@.microsoft.com...
> probably a very easy question to answer:
> which client component do i have to use in order to send queries to a
remote sql server and receive the resultset, how can i start this utility?
|||If you are trying to connect to one SQL Server but request information from
a different SQL Server, then Hari's answer is what you should do..
If you are asking what software must be installed on any client to connect
to a server... then
For connection using YOUR software, the minimum in Using the SQL Server CD,
install the CLIENT CONNECTIVITY piece.
If , additionally you want tools like query analyzer on your desktop,
install the tools options as well... Most programmers also install Books On
Line as well.
Wayne Snyder, MCDBA, SQL Server MVP
Mariner, Charlotte, NC
www.mariner-usa.com
(Please respond only to the newsgroups.)
I support the Professional Association of SQL Server (PASS) and it's
community of SQL Server professionals.
www.sqlpass.org
"cajota" <cajota@.discussions.microsoft.com> wrote in message
news:E9DF0A05-3225-4B3E-BEC3-E52AFD618FE5@.microsoft.com...
> probably a very easy question to answer:
> which client component do i have to use in order to send queries to a
remote sql server and receive the resultset, how can i start this utility?

ms sql server client

probably a very easy question to answer:
which client component do i have to use in order to send queries to a remote sql server and receive the resultset, how can i start this utility?Hi,
See linked servers in books online. Using Linked servers you could connect
to the remote sql server standing from local sql server.
Only thing is you have to use four part names to refer the table.
Linkedserver.dbname.tableowner.table_name
In otherwise you just want to query a remote server then:-
1. Create a alias server using CLeint network utility
2. Mention TCP IP as protocol , Ip address and port number
3. Using this alis name you could connect to to remote server and query the
remote server using query analyzer or OSQL utility
Thanks
Hari
MCDBA
"cajota" <cajota@.discussions.microsoft.com> wrote in message
news:E9DF0A05-3225-4B3E-BEC3-E52AFD618FE5@.microsoft.com...
> probably a very easy question to answer:
> which client component do i have to use in order to send queries to a
remote sql server and receive the resultset, how can i start this utility?|||If you are trying to connect to one SQL Server but request information from
a different SQL Server, then Hari's answer is what you should do..
If you are asking what software must be installed on any client to connect
to a server... then
For connection using YOUR software, the minimum in Using the SQL Server CD,
install the CLIENT CONNECTIVITY piece.
If , additionally you want tools like query analyzer on your desktop,
install the tools options as well... Most programmers also install Books On
Line as well.
--
Wayne Snyder, MCDBA, SQL Server MVP
Mariner, Charlotte, NC
www.mariner-usa.com
(Please respond only to the newsgroups.)
I support the Professional Association of SQL Server (PASS) and it's
community of SQL Server professionals.
www.sqlpass.org
"cajota" <cajota@.discussions.microsoft.com> wrote in message
news:E9DF0A05-3225-4B3E-BEC3-E52AFD618FE5@.microsoft.com...
> probably a very easy question to answer:
> which client component do i have to use in order to send queries to a
remote sql server and receive the resultset, how can i start this utility?

ms sql server client

probably a very easy question to answer:
which client component do i have to use in order to send queries to a remote
sql server and receive the resultset, how can i start this utility?Hi,
See linked servers in books online. Using Linked servers you could connect
to the remote sql server standing from local sql server.
Only thing is you have to use four part names to refer the table.
Linkedserver.dbname.tableowner.table_name
In otherwise you just want to query a remote server then:-
1. Create a alias server using CLeint network utility
2. Mention TCP IP as protocol , Ip address and port number
3. Using this alis name you could connect to to remote server and query the
remote server using query analyzer or OSQL utility
Thanks
Hari
MCDBA
"cajota" <cajota@.discussions.microsoft.com> wrote in message
news:E9DF0A05-3225-4B3E-BEC3-E52AFD618FE5@.microsoft.com...
> probably a very easy question to answer:
> which client component do i have to use in order to send queries to a
remote sql server and receive the resultset, how can i start this utility?|||If you are trying to connect to one SQL Server but request information from
a different SQL Server, then Hari's answer is what you should do..
If you are asking what software must be installed on any client to connect
to a server... then
For connection using YOUR software, the minimum in Using the SQL Server CD,
install the CLIENT CONNECTIVITY piece.
If , additionally you want tools like query analyzer on your desktop,
install the tools options as well... Most programmers also install Books On
Line as well.
Wayne Snyder, MCDBA, SQL Server MVP
Mariner, Charlotte, NC
www.mariner-usa.com
(Please respond only to the newsgroups.)
I support the Professional Association of SQL Server (PASS) and it's
community of SQL Server professionals.
www.sqlpass.org
"cajota" <cajota@.discussions.microsoft.com> wrote in message
news:E9DF0A05-3225-4B3E-BEC3-E52AFD618FE5@.microsoft.com...
> probably a very easy question to answer:
> which client component do i have to use in order to send queries to a
remote sql server and receive the resultset, how can i start this utility?

Friday, March 9, 2012

MS SQL 7 and 2000

Does anyone know of any differences between queries in SQL 7 and SQL
2000? I have a collegue that says there could be "subtle" differences
in the output, though I cannot find any documentation of this. Any
link to prove him wrong would be very helpful. Thanks.kingk436 (development@.circuitmap.com) writes:
> Does anyone know of any differences between queries in SQL 7 and SQL
> 2000? I have a collegue that says there could be "subtle" differences
> in the output, though I cannot find any documentation of this. Any
> link to prove him wrong would be very helpful. Thanks.

One "subtle" difference is the rule for auto-conversion of data types.
In SQL 2000, conversion is done from a type hierarchy, whereas SQL 7
tries to first convert a constant.

This can differences if you have things like:

SELECT * FROM tbl where varcharcol > 1

In SQL 7, the 1 will be converted to a string, where as in SQL 2000
varcharcol will be converted to an integer. (And thus the query will
fail if there are values in the column that cannot be converted to
integer.)

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||"Erland Sommarskog" <esquel@.sommarskog.se> wrote in message
news:Xns9684F3D9BB08Yazorman@.127.0.0.1...
> kingk436 (development@.circuitmap.com) writes:
> > Does anyone know of any differences between queries in SQL 7 and SQL
> > 2000? I have a collegue that says there could be "subtle" differences
> > in the output, though I cannot find any documentation of this. Any
> > link to prove him wrong would be very helpful. Thanks.
> One "subtle" difference is the rule for auto-conversion of data types.
> In SQL 2000, conversion is done from a type hierarchy, whereas SQL 7
> tries to first convert a constant.
> This can differences if you have things like:
> SELECT * FROM tbl where varcharcol > 1
> In SQL 7, the 1 will be converted to a string, where as in SQL 2000
> varcharcol will be converted to an integer. (And thus the query will
> fail if there are values in the column that cannot be converted to
> integer.)

In addition I believe that without an order by, results returned will be
different. (Of course the order shouldn't be specified, but I believe both
SQL 7.0 and 2000 do default to some underlying physical order, but each
differently from the other.)

>
> --
> Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
> Books Online for SQL Server SP3 at
> http://www.microsoft.com/sql/techin.../2000/books.asp|||Greg D. Moore (Strider) (mooregr_deleteth1s@.greenms.com) writes:
> "Erland Sommarskog" <esquel@.sommarskog.se> wrote in message
> news:Xns9684F3D9BB08Yazorman@.127.0.0.1...
>> In SQL 7, the 1 will be converted to a string, where as in SQL 2000
>> varcharcol will be converted to an integer. (And thus the query will
>> fail if there are values in the column that cannot be converted to
>> integer.)
> In addition I believe that without an order by, results returned will be
> different. (Of course the order shouldn't be specified, but I believe both
> SQL 7.0 and 2000 do default to some underlying physical order, but each
> differently from the other.)

Well, without any ORDER BY you could get different results from the
same server running SQL 7 or later depending on the phase of the moon,
availble CPUs if there are parallel streams, etc.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

Wednesday, March 7, 2012

MS SQL 2005 performance issues

After migrating to 2005, I started noticing some performance problems with my queries. Here is one of them. The query takes some 40 seconds to execute for the first time with intense disk operation. Next time I run it the result comes back in less than a second. however, if I make a minor change to one of the parameter, it takes ~25 seconds with 100% CPU utilization and 0% of disk utilization. So, I'm assuming that all data has been cached, but I can't imagine why it takes so long to retrieve data from the memory?

BTW execution plans look completely identical for those queries.

Here is the query example:

select quotes.quote_id from indicators indicators_1, quotes, quotes quotes_1, quotes quotes_2, ranges
ranges_1 where not (1. * quotes_1.volume / quotes_2.volume < 3.54 and
indicators_1.RngStdDev25 / ranges_1.RngSMA25 < 1.2 ) and
quotes.quote_id - 1 = indicators_1.quote_id and quotes.quote_id - 1 = quotes_1.quote_id and
quotes.quote_id - 2 = quotes_2.quote_id and quotes.quote_id - 1 = ranges_1.quote_id and quotes.quote_id in
(select quote_id from trades where max_period = 2 and profit_target = 0.1 and strategy_id = 228)

And the parameter I'm changing is in red.

I will really appreciate if someone can help as I've started thinking about moving back to 2000 where no such problems existed.

Andy

I figured out that what actually takes so long is compiling the query. In some cases adding "option (fast 1)" helps, but in others doesn't.
Hell, I really have no idea why compiling the query but not retrieving the actual data is taking so much time.

|||

some more food for thoughts... the query below takes ~10 seconds to compile
all tables have primary key on quote_id column:

select quotes.quote_id
from quotes inner join indicators indicators_1 on quotes.quote_id = indicators_1.quote_id
inner join ranges ranges_1 on quotes.quote_id = ranges_1.quote_id
inner join prices prices_1 on quotes.quote_id = prices_1.quote_id
where quotes.quote_id in (select quote_id from trades where max_period = 2 and profit_target = 0.1 and strategy_id = 228)
option (recompile)

|||

So, now here's something that anyone can reproduce. Seems to be a bug (or feature) in SQL Server 2005.

The following script creates the tables and populates them with data:

CREATE TABLE [dbo].[test1](
[quote_id] [int] NOT NULL,
[ClSMA3] [real] NULL,
[ClSMA5] [real] NULL,
[ClSMA10] [real] NULL,
[ClSMA25] [real] NULL,
[ClSMA50] [real] NULL,
[ClMin3] [real] NULL,
[ClMin5] [real] NULL,
[ClMin10] [real] NULL,
[ClMin25] [real] NULL,
[ClMin50] [real] NULL,
[ClMax3] [real] NULL,
[ClMax5] [real] NULL,
[ClMax10] [real] NULL,
[ClMax25] [real] NULL,
[ClMax50] [real] NULL,
[LoMin3] [real] NULL,
[LoMin5] [real] NULL,
[LoMin10] [real] NULL,
[LoMin25] [real] NULL,
[LoMin50] [real] NULL,
[HiMax3] [real] NULL,
[HiMax5] [real] NULL,
[HiMax10] [real] NULL,
[HiMax25] [real] NULL,
[HiMax50] [real] NULL,
[LoMinBar3] [int] NULL,
[LoMinBar5] [int] NULL,
[LoMinBar10] [int] NULL,
[LoMinBar25] [int] NULL,
[LoMinBar50] [int] NULL,
[HiMaxBar3] [int] NULL,
[HiMaxBar5] [int] NULL,
[HiMaxBar10] [int] NULL,
[HiMaxBar25] [int] NULL,
[HiMaxBar50] [int] NULL
)

CREATE CLUSTERED INDEX [pk_test1] ON [dbo].[test1]
(
[quote_id] ASC
)

CREATE TABLE [dbo].[test2](
[quote_id] [int] NOT NULL,
[max_period] [smallint] NOT NULL,
[profit_target] [real] NOT NULL,
[trade_action] [smallint] NULL,
[open_date] [smalldatetime] NULL,
[open_price] [smallmoney] NULL,
[close_date] [smalldatetime] NULL,
[close_price] [smallmoney] NULL,
[strategy_id] [int] NULL,
[flag] [tinyint] NULL
) ON [PRIMARY]


DECLARE @.C INT
SET @.C = 0
WHILE (@.C < 13000)
BEGIN
INSERT INTO test2 (quote_id, max_period, profit_target) values (@.C * 290, 1, 1.0)
SET @.C = @.C + 1
END

SET @.C = 0
WHILE (@.C < 3800000)
BEGIN
INSERT INTO [test].[dbo].[test1]
([quote_id]
,[ClSMA3]
,[ClSMA5]
,[ClSMA10]
,[ClSMA25]
,[ClSMA50]
,[ClMin3]
,[ClMin5]
,[ClMin10]
,[ClMin25]
,[ClMin50]
,[ClMax3]
,[ClMax5]
,[ClMax10]
,[ClMax25]
,[ClMax50]
,[LoMin3]
,[LoMin5]
,[LoMin10]
,[LoMin25]
,[LoMin50]
,[HiMax3]
,[HiMax5]
,[HiMax10]
,[HiMax25]
,[HiMax50]
,[LoMinBar3]
,[LoMinBar5]
,[LoMinBar10]
,[LoMinBar25]
,[LoMinBar50]
,[HiMaxBar3]
,[HiMaxBar5]
,[HiMaxBar10]
,[HiMaxBar25]
,[HiMaxBar50])
VALUES
(@.C
,@.C/2
,@.C/3
,@.C/4
,@.C/5
,@.C/6
,@.C/7
,@.C/8
,@.C/9
,@.C/10
,@.C/11
,@.C/12
,@.C/13
,@.C/14
,@.C/15
,@.C/16
,@.C/17
,@.C/18
,@.C/19
,@.C/20
,@.C/21
,@.C/22
,@.C/23
,@.C/24
,@.C/25
,@.C/26
,@.C/27
,@.C/28
,@.C/29
,@.C/30
,@.C/31
,@.C/32
,@.C/33
,@.C/34
,@.C/35
,@.C/36)
SET @.C = @.C + 1
END

--

The query below takes too long to compile (10 seconds in my environment), however in 2000 returns in less than a second

select count(test2.quote_id)
from test2
where
test2.quote_id in (select quote_id from test1)
and test2.quote_id - 1 in (select quote_id from test1)
and test2.quote_id - 2 in (select quote_id from test1)
and test2.quote_id - 3 in (select quote_id from test1)
and test2.quote_id - 4 in (select quote_id from test1)
and test2.quote_id - 5 in (select quote_id from test1)

The query below takes forever to execute. Unless you change index pk_test1 from CLUSTERED to NONCLUSTERED. In such case it executes in fraction of a second...

select test2.quote_id
from test2
inner join test1 on test1.quote_id >= test2.quote_id - 6 AND test1.quote_id <= test2.quote_id
group by test2.quote_id
having count(*) = 7

|||

Hi, Andy! I'm glad you've provided a repro case; I was about to write a long note full of questions about your tables, row counts, indexes, and so on.

While I'm waiting for the INSERT script to run, I'm wondering if it is by design that your repro script doesn't have any indexes on test2.

|||

Mike Blaszczak wrote:

While I'm waiting for the INSERT script to run, I'm wondering if it is by design that your repro script doesn't have any indexes on test2.

Hi Mike,

Thank you very much for your help. As to your question, in my original schema I had clustered index for quote_id column on test2 table, but it doesn't seem to affect those queries, this is why I didn't include it. Especially since there are not too many records (13K) in this table.

|||

Record count isn't really relevant; you want to have an index to provide a better access path. If you have an index on the quote_id field of [test2] and the server needs to scan all the quote_id fields, it can do so by reading the index and densely reading only the quote_id fields. Without the index, it has to read the table pages which include a lot of other data besides the needed quote_id field. It does a bunch more I/O that way.

Unfortunately, my machine to test things for other people is dying; the C drive is starting to throw read errors, and so I'm not sure how much more I'll be able to help.

Your queries degenerate to six-way joins. What I believe is happening is that, in 2005, we see a very high query cost so the QO tries to run longer in order to look for a better plan -- one that's cheaper. The QO spends its time looking at all the alternatives to try to find a better way through the query.

It's disappointing that there so much time spent compiling for your query, but anyone who's benefited from the QO eventually finding a better plan certainly thinks that the extra compile time is a feature.

You've apparently tried to find a better way though the query, yourself; writing different SQL that should get the same result. I don't think there's anything you can do to make the query look better--there's no way to help the server know the correlation between the query_id values in one row and the query_id-6 values in the other rows. I've played around with adding computed columns to hold the range, then creating an index on those:

ALTER TABLE TEST2 ADD
MinMatch AS (Quote_ID - 5),
MaxMatch AS (Quote_ID)

select test2.quote_id
from test2
inner join test1 on test1.quote_id between test2.MinMatch AND test2.MaxMatch

but I'm afraid I can't say if it is performing any better as the machine I've been using to play with my queries has a failing drive, and that makes any measurements I take rather invalid.

You might try investigating a query that uses a loop and a cursor; this may be one instance where you can pull off better performance with a cursor than with set-based operations. Or, see what you can do with your model.

Also, please be sure you have current statistics on the tables. I've seen some bad cardinality estimates on my machine, but I can't investigate further with the failing drive and all ...

|||

Thank you so much Mike for such detailed analysis.

yes record count does matter, as my queries take too long to compile only in case if there's many records in test2 table. What I'm going to do is switch back to SQL Server 2000 as I don't really need new features that come with 2005 but those performance issues are killing me.

Sorry to hear you've got hardware problems.

Thanks again and good luck

Monday, February 20, 2012

MS SQL - Query Analyzer by IP range

I need to run reports from our database that stores IP addresses when a software license is used. So, I need to know how to run SQL queries to retrieve certain IP addresses from specified ranges. For example:

192.168.1.0 - 192.168.2.255

192.168.3.11 - 192.168.7.128

10.10.10.8 - 10.10.12.255

What commands would help me accomplish pulling all IP addresses in a range. I know I would have to run multiple reports for each range. I tried using the "between" function and "less than/greater than" functions, but I received mixed results. For example:

SELECT * FROM testtable WHERE ip LIKE '192.168.[1-2].[0-255]'

I also tried -

SELECT * FROM testtable WHERE ip>'192.168.1.0' and ip<'192.168.2.255'

Any help would be greatly appreciated! Thanks in advance!Parse your IP addresses into four separate columns. That would make it easy to search or filter by ranges.|||blindman, thank you for the feedback... however, the original DB configured the IP address as it's own field... at this point, there is over 10million or so records, so modifying the DB would be time consuming... any other thoughts would be greatly appreciated...|||Create a function called IP2DWord or so. Its pretty straigtforward. First octet * 256^3 + second octet * 256^2 + third octet*256 + fourth octet. If you use this function on both starting and ending address you can easily select ranges.|||Define: "time consuming"? ;)
There have been a number of questions about IP addresses in a range - I suggest you search the forum www.dbforums.com/search.php and there were some very good suggestions around and about (see above ;)).|||roac, i looked into your suggestion and it seems to be the way to go... thnx all for the feedback...|||If you parse the IP address on the fly using a function, then you can toss any indexing out the window. On 10,000,000+ records don't expect a great response time.
You asked for advice, and I gave it. You are FAR better off parsing the IP Address into four separate columns in your table. You can index them, and you only need to parse each address once. If you absolutely have to have the IP Address as a single column in your table, then keep it and create four additional columns in the table that you populate automatically using insert/update triggers.
You can do this right, or you can do this wrong. You can take shortcuts now, or you can save programming, debugging, and execution time over the life of the application.
'Nuff said.|||For reporting purposes it is required for the IP address to be in 1 column. I will look into separating each IP segment into it's on column, as I agree there's more flexibility when running reports specifically for IP ranges.|||For reporting purposes it is required for the IP address to be in 1 column.Reporting purposes do not require the data to be in one column. A view, sproc, or SQL statement can easily combine the components for reporting.