Sql Comparing two tables: Computing hash values

Hi,
In hash joins, how the hash value is computed? For example in this query:
SET SHOWPLAN_ALL ON
select c.customerid ,o.orderid, o.shipcountry from
customers c right outer join orders o
on c.customerid=o.customerid
and o.shipcountry='germany'
How the fields those appear in HASH:() predicate help to create hash values?
I think my problem is that I don't know that what the hash value is.
Thanks,
LeilaHi Leila
For your query tuning, it shouldn't matter what the actual hash values are.
If possible, you should try to build an index that will allow SQL Server to
perform a different join technique than hashing.
Microsoft does not document any details of the hash functions they use for
processing hash join operations. If you want to know more about hashing in
general, read "The Art of Computer Programming -- Volume 3: Sorting and
Searching" by Donald Knuth.
--
HTH
--
Kalen Delaney
SQL Server MVP
www.SolidQualityLearning.com
"Leila" <lelas@.hotpop.com> wrote in message
news:%23W3h0HPoEHA.324@.TK2MSFTNGP11.phx.gbl...
> Hi,
> In hash joins, how the hash value is computed? For example in this query:
> SET SHOWPLAN_ALL ON
> select c.customerid ,o.orderid, o.shipcountry from
> customers c right outer join orders o
> on c.customerid=o.customerid
> and o.shipcountry='germany'
> How the fields those appear in HASH:() predicate help to create hash
> values?
> I think my problem is that I don't know that what the hash value is.
> Thanks,
> Leila
>|||Hi Kalen,
Thanks for your suggestion.
I'm a little confused about the difference between Hash Match and Nested
Loops. As far as I learned from BOL, in Hash Match, the hash values are
moved from the base table to a new place in memory(called hash table), then
an operation like nested loop happens between hash table and another table.
In nested loops, no value is moved from the base table, instead the loop
begins (with no hash table in between) directly with other table.
It seems the only difference is the existence of hash table in between, is
that true?
Thanks again,
Leila
"Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
news:euCtqpPoEHA.3460@.TK2MSFTNGP10.phx.gbl...
> Hi Leila
> For your query tuning, it shouldn't matter what the actual hash values
are.
> If possible, you should try to build an index that will allow SQL Server
to
> perform a different join technique than hashing.
> Microsoft does not document any details of the hash functions they use for
> processing hash join operations. If you want to know more about hashing in
> general, read "The Art of Computer Programming -- Volume 3: Sorting and
> Searching" by Donald Knuth.
> --
> HTH
> --
> Kalen Delaney
> SQL Server MVP
> www.SolidQualityLearning.com
>
> "Leila" <lelas@.hotpop.com> wrote in message
> news:%23W3h0HPoEHA.324@.TK2MSFTNGP11.phx.gbl...
> > Hi,
> > In hash joins, how the hash value is computed? For example in this
query:
> >
> > SET SHOWPLAN_ALL ON
> > select c.customerid ,o.orderid, o.shipcountry from
> > customers c right outer join orders o
> > on c.customerid=o.customerid
> > and o.shipcountry='germany'
> >
> > How the fields those appear in HASH:() predicate help to create hash
> > values?
> > I think my problem is that I don't know that what the hash value is.
> > Thanks,
> > Leila
> >
> >
>|||>Leila" <lelas@.hotpop.com> wrote in message
news:%23t9lNSQoEHA.2340@.TK2MSFTNGP10.phx.gbl...
> I'm a little confused about the difference between Hash Match and Nested
> Loops. As far as I learned from BOL, in Hash Match, the hash values are
> moved from the base table to a new place in memory(called hash table),
then
> an operation like nested loop happens between hash table and another
table.
> In nested loops, no value is moved from the base table, instead the loop
> begins (with no hash table in between) directly with other table.
> It seems the only difference is the existence of hash table in between, is
> that true?
In a nested loop, the inner loop is executed once for each outer loop. In a
hash match, the top ("build") input is created, then the bottom ("probe")
input is matched against it. This means that the bottom table is only
scanned once.|||The 'only' difference is a very expensive one.
If you have an index, SQL Server can take a value from the outer table and
use the index to find matching rows in the inner table.
With a hash match, which is used because there IS no useful index, the data
in the inner table is organized into a hash table, so that SQL Server can
find matching rows using the hash table instead of an index.
Al though the inner table is scanned only once, the process of building the
hash table is resource intensive, and the hash table uses a lot of memory
for a big table.
You're better off building a good index to make the nested loops possible.
--
HTH
--
Kalen Delaney
SQL Server MVP
www.SolidQualityLearning.com
"Leila" <lelas@.hotpop.com> wrote in message
news:%23t9lNSQoEHA.2340@.TK2MSFTNGP10.phx.gbl...
> Hi Kalen,
> Thanks for your suggestion.
> I'm a little confused about the difference between Hash Match and Nested
> Loops. As far as I learned from BOL, in Hash Match, the hash values are
> moved from the base table to a new place in memory(called hash table),
> then
> an operation like nested loop happens between hash table and another
> table.
> In nested loops, no value is moved from the base table, instead the loop
> begins (with no hash table in between) directly with other table.
> It seems the only difference is the existence of hash table in between, is
> that true?
> Thanks again,
> Leila
>
> "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> news:euCtqpPoEHA.3460@.TK2MSFTNGP10.phx.gbl...
>> Hi Leila
>> For your query tuning, it shouldn't matter what the actual hash values
> are.
>> If possible, you should try to build an index that will allow SQL Server
> to
>> perform a different join technique than hashing.
>> Microsoft does not document any details of the hash functions they use
>> for
>> processing hash join operations. If you want to know more about hashing
>> in
>> general, read "The Art of Computer Programming -- Volume 3: Sorting and
>> Searching" by Donald Knuth.
>> --
>> HTH
>> --
>> Kalen Delaney
>> SQL Server MVP
>> www.SolidQualityLearning.com
>>
>> "Leila" <lelas@.hotpop.com> wrote in message
>> news:%23W3h0HPoEHA.324@.TK2MSFTNGP11.phx.gbl...
>> > Hi,
>> > In hash joins, how the hash value is computed? For example in this
> query:
>> >
>> > SET SHOWPLAN_ALL ON
>> > select c.customerid ,o.orderid, o.shipcountry from
>> > customers c right outer join orders o
>> > on c.customerid=o.customerid
>> > and o.shipcountry='germany'
>> >
>> > How the fields those appear in HASH:() predicate help to create hash
>> > values?
>> > I think my problem is that I don't know that what the hash value is.
>> > Thanks,
>> > Leila
>> >
>> >
>>
>
>|||Hi Mark,
I cannot understand that how the matching can be performed with one scan?
Maybe because yet I don't know about the real contents of hash table (hash
values).
Could you please help me.
Thanks,
Leila
"Mark Wilden" <mark@.mwilden.com> wrote in message
news:saGdncQrlOIWgc_cRVn-jA@.sti.net...
> >Leila" <lelas@.hotpop.com> wrote in message
> news:%23t9lNSQoEHA.2340@.TK2MSFTNGP10.phx.gbl...
> > I'm a little confused about the difference between Hash Match and Nested
> > Loops. As far as I learned from BOL, in Hash Match, the hash values are
> > moved from the base table to a new place in memory(called hash table),
> then
> > an operation like nested loop happens between hash table and another
> table.
> > In nested loops, no value is moved from the base table, instead the loop
> > begins (with no hash table in between) directly with other table.
> > It seems the only difference is the existence of hash table in between,
is
> > that true?
> In a nested loop, the inner loop is executed once for each outer loop. In
a
> hash match, the top ("build") input is created, then the bottom ("probe")
> input is matched against it. This means that the bottom table is only
> scanned once.
>|||Thanks Mark,
I think I got it! You mean the bottom table is scanned once (for creating
hash table) and then nested loop is needed for matching rows. Is that true?
"Mark Wilden" <mark@.mwilden.com> wrote in message
news:saGdncQrlOIWgc_cRVn-jA@.sti.net...
> >Leila" <lelas@.hotpop.com> wrote in message
> news:%23t9lNSQoEHA.2340@.TK2MSFTNGP10.phx.gbl...
> > I'm a little confused about the difference between Hash Match and Nested
> > Loops. As far as I learned from BOL, in Hash Match, the hash values are
> > moved from the base table to a new place in memory(called hash table),
> then
> > an operation like nested loop happens between hash table and another
> table.
> > In nested loops, no value is moved from the base table, instead the loop
> > begins (with no hash table in between) directly with other table.
> > It seems the only difference is the existence of hash table in between,
is
> > that true?
> In a nested loop, the inner loop is executed once for each outer loop. In
a
> hash match, the top ("build") input is created, then the bottom ("probe")
> input is matched against it. This means that the bottom table is only
> scanned once.
>|||"Leila" <lelas@.hotpop.com> wrote in message
news:%23ipgFuQoEHA.2140@.TK2MSFTNGP11.phx.gbl...
> I cannot understand that how the matching can be performed with one scan?
> Maybe because yet I don't know about the real contents of hash table (hash
> values).
Kalen is a much better person to explain this than I am. However, to
clarify, there are two scans - one scan to create the hash table in the
first place from the contents of the upper or outer table, and another scan
to match the lower or inner table against this hash table.
For example:
select id from A join B on A.something = B.somethingElse
If a hash match were used, this would look at all the A.something values and
create a hash table from them. For example, if A.something = "Mark's the
best", there might be a hash value created from it like 123. Another row
might contain "Kate's better", and that might hash to a different number,
like 342.
Having created the "build" hash table from A, B is then scanned, creating
hash values from the B.somethingElse column. Each of those values is used as
a "probe" into the original hash table to see if there is a match - i.e., if
A.something really does equal B.somethingElse.
To answer your question, it doesn't matter what hash value is generated for
"Mark's the best". Hashing is just a way of reducing a large number of
possible values to a smaller number.
I hope this didn't make it worse!|||"Leila" <lelas@.hotpop.com> wrote in message
news:uvRR81QoEHA.1160@.tk2msftngp13.phx.gbl...
> I think I got it! You mean the bottom table is scanned once (for creating
> hash table) and then nested loop is needed for matching rows. Is that
true?
We're so close - and I'm honestly looking forward to what Kalen has to say.
:)
The top table is scanned once to create the table, then the bottom table is
scanned once (not in a nested loop) and matched against the table.|||Thanks Kalen!
You mentioned 'the data in the inner table is organized into a hash table'.
I read in BOL 'the smaller of the two inputs is the build input'.
Are they different?
"Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
news:ODudYpQoEHA.260@.TK2MSFTNGP10.phx.gbl...
> The 'only' difference is a very expensive one.
> If you have an index, SQL Server can take a value from the outer table and
> use the index to find matching rows in the inner table.
> With a hash match, which is used because there IS no useful index, the
data
> in the inner table is organized into a hash table, so that SQL Server can
> find matching rows using the hash table instead of an index.
> Al though the inner table is scanned only once, the process of building
the
> hash table is resource intensive, and the hash table uses a lot of memory
> for a big table.
> You're better off building a good index to make the nested loops possible.
> --
> HTH
> --
> Kalen Delaney
> SQL Server MVP
> www.SolidQualityLearning.com
>
> "Leila" <lelas@.hotpop.com> wrote in message
> news:%23t9lNSQoEHA.2340@.TK2MSFTNGP10.phx.gbl...
> > Hi Kalen,
> > Thanks for your suggestion.
> > I'm a little confused about the difference between Hash Match and Nested
> > Loops. As far as I learned from BOL, in Hash Match, the hash values are
> > moved from the base table to a new place in memory(called hash table),
> > then
> > an operation like nested loop happens between hash table and another
> > table.
> > In nested loops, no value is moved from the base table, instead the loop
> > begins (with no hash table in between) directly with other table.
> > It seems the only difference is the existence of hash table in between,
is
> > that true?
> > Thanks again,
> > Leila
> >
> >
> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> > news:euCtqpPoEHA.3460@.TK2MSFTNGP10.phx.gbl...
> >> Hi Leila
> >>
> >> For your query tuning, it shouldn't matter what the actual hash values
> > are.
> >> If possible, you should try to build an index that will allow SQL
Server
> > to
> >> perform a different join technique than hashing.
> >>
> >> Microsoft does not document any details of the hash functions they use
> >> for
> >> processing hash join operations. If you want to know more about hashing
> >> in
> >> general, read "The Art of Computer Programming -- Volume 3: Sorting and
> >> Searching" by Donald Knuth.
> >>
> >> --
> >> HTH
> >> --
> >> Kalen Delaney
> >> SQL Server MVP
> >> www.SolidQualityLearning.com
> >>
> >>
> >> "Leila" <lelas@.hotpop.com> wrote in message
> >> news:%23W3h0HPoEHA.324@.TK2MSFTNGP11.phx.gbl...
> >> > Hi,
> >> > In hash joins, how the hash value is computed? For example in this
> > query:
> >> >
> >> > SET SHOWPLAN_ALL ON
> >> > select c.customerid ,o.orderid, o.shipcountry from
> >> > customers c right outer join orders o
> >> > on c.customerid=o.customerid
> >> > and o.shipcountry='germany'
> >> >
> >> > How the fields those appear in HASH:() predicate help to create hash
> >> > values?
> >> > I think my problem is that I don't know that what the hash value is.
> >> > Thanks,
> >> > Leila
> >> >
> >> >
> >>
> >>
> >
> >
> >
> >
>|||It really clarified the issue. Thank you very much indeed!
"Mark Wilden" <mark@.mwilden.com> wrote in message
news:yvCdnQJbXZiYtM_cRVn-jA@.sti.net...
> "Leila" <lelas@.hotpop.com> wrote in message
> news:%23ipgFuQoEHA.2140@.TK2MSFTNGP11.phx.gbl...
> > I cannot understand that how the matching can be performed with one
scan?
> > Maybe because yet I don't know about the real contents of hash table
(hash
> > values).
> Kalen is a much better person to explain this than I am. However, to
> clarify, there are two scans - one scan to create the hash table in the
> first place from the contents of the upper or outer table, and another
scan
> to match the lower or inner table against this hash table.
> For example:
> select id from A join B on A.something = B.somethingElse
> If a hash match were used, this would look at all the A.something values
and
> create a hash table from them. For example, if A.something = "Mark's the
> best", there might be a hash value created from it like 123. Another row
> might contain "Kate's better", and that might hash to a different number,
> like 342.
> Having created the "build" hash table from A, B is then scanned, creating
> hash values from the B.somethingElse column. Each of those values is used
as
> a "probe" into the original hash table to see if there is a match - i.e.,
if
> A.something really does equal B.somethingElse.
> To answer your question, it doesn't matter what hash value is generated
for
> "Mark's the best". Hashing is just a way of reducing a large number of
> possible values to a smaller number.
> I hope this didn't make it worse!
>|||The 'inner' table is whichever one is chosen by the SQL Server optimizer to
build the hash table. Typically this will be the smaller one, but not
always.
For BOL to say the smaller of the two is the build input is a bit of an
overgeneralization.
--
HTH
--
Kalen Delaney
SQL Server MVP
www.SolidQualityLearning.com
"Leila" <lelas@.hotpop.com> wrote in message
news:eejiM5QoEHA.3788@.TK2MSFTNGP10.phx.gbl...
> Thanks Kalen!
> You mentioned 'the data in the inner table is organized into a hash
> table'.
> I read in BOL 'the smaller of the two inputs is the build input'.
> Are they different?
>
>
> "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> news:ODudYpQoEHA.260@.TK2MSFTNGP10.phx.gbl...
>> The 'only' difference is a very expensive one.
>> If you have an index, SQL Server can take a value from the outer table
>> and
>> use the index to find matching rows in the inner table.
>> With a hash match, which is used because there IS no useful index, the
> data
>> in the inner table is organized into a hash table, so that SQL Server can
>> find matching rows using the hash table instead of an index.
>> Al though the inner table is scanned only once, the process of building
> the
>> hash table is resource intensive, and the hash table uses a lot of memory
>> for a big table.
>> You're better off building a good index to make the nested loops
>> possible.
>> --
>> HTH
>> --
>> Kalen Delaney
>> SQL Server MVP
>> www.SolidQualityLearning.com
>>
>> "Leila" <lelas@.hotpop.com> wrote in message
>> news:%23t9lNSQoEHA.2340@.TK2MSFTNGP10.phx.gbl...
>> > Hi Kalen,
>> > Thanks for your suggestion.
>> > I'm a little confused about the difference between Hash Match and
>> > Nested
>> > Loops. As far as I learned from BOL, in Hash Match, the hash values are
>> > moved from the base table to a new place in memory(called hash table),
>> > then
>> > an operation like nested loop happens between hash table and another
>> > table.
>> > In nested loops, no value is moved from the base table, instead the
>> > loop
>> > begins (with no hash table in between) directly with other table.
>> > It seems the only difference is the existence of hash table in between,
> is
>> > that true?
>> > Thanks again,
>> > Leila
>> >
>> >
>> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
>> > news:euCtqpPoEHA.3460@.TK2MSFTNGP10.phx.gbl...
>> >> Hi Leila
>> >>
>> >> For your query tuning, it shouldn't matter what the actual hash values
>> > are.
>> >> If possible, you should try to build an index that will allow SQL
> Server
>> > to
>> >> perform a different join technique than hashing.
>> >>
>> >> Microsoft does not document any details of the hash functions they use
>> >> for
>> >> processing hash join operations. If you want to know more about
>> >> hashing
>> >> in
>> >> general, read "The Art of Computer Programming -- Volume 3: Sorting
>> >> and
>> >> Searching" by Donald Knuth.
>> >>
>> >> --
>> >> HTH
>> >> --
>> >> Kalen Delaney
>> >> SQL Server MVP
>> >> www.SolidQualityLearning.com
>> >>
>> >>
>> >> "Leila" <lelas@.hotpop.com> wrote in message
>> >> news:%23W3h0HPoEHA.324@.TK2MSFTNGP11.phx.gbl...
>> >> > Hi,
>> >> > In hash joins, how the hash value is computed? For example in this
>> > query:
>> >> >
>> >> > SET SHOWPLAN_ALL ON
>> >> > select c.customerid ,o.orderid, o.shipcountry from
>> >> > customers c right outer join orders o
>> >> > on c.customerid=o.customerid
>> >> > and o.shipcountry='germany'
>> >> >
>> >> > How the fields those appear in HASH:() predicate help to create hash
>> >> > values?
>> >> > I think my problem is that I don't know that what the hash value is.
>> >> > Thanks,
>> >> > Leila
>> >> >
>> >> >
>> >>
>> >>
>> >
>> >
>> >
>> >
>>
>|||Kalen,
When the hash table is ready, will there be something like nested loop to
match rows? Because Mark described that the bottom table is
scanned once (not in a nested loop).
Leila
"Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
news:u26#jDRoEHA.2900@.TK2MSFTNGP12.phx.gbl...
> The 'inner' table is whichever one is chosen by the SQL Server optimizer
to
> build the hash table. Typically this will be the smaller one, but not
> always.
> For BOL to say the smaller of the two is the build input is a bit of an
> overgeneralization.
> --
> HTH
> --
> Kalen Delaney
> SQL Server MVP
> www.SolidQualityLearning.com
>
> "Leila" <lelas@.hotpop.com> wrote in message
> news:eejiM5QoEHA.3788@.TK2MSFTNGP10.phx.gbl...
> > Thanks Kalen!
> > You mentioned 'the data in the inner table is organized into a hash
> > table'.
> > I read in BOL 'the smaller of the two inputs is the build input'.
> > Are they different?
> >
> >
> >
> >
> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> > news:ODudYpQoEHA.260@.TK2MSFTNGP10.phx.gbl...
> >> The 'only' difference is a very expensive one.
> >> If you have an index, SQL Server can take a value from the outer table
> >> and
> >> use the index to find matching rows in the inner table.
> >>
> >> With a hash match, which is used because there IS no useful index, the
> > data
> >> in the inner table is organized into a hash table, so that SQL Server
can
> >> find matching rows using the hash table instead of an index.
> >> Al though the inner table is scanned only once, the process of building
> > the
> >> hash table is resource intensive, and the hash table uses a lot of
memory
> >> for a big table.
> >>
> >> You're better off building a good index to make the nested loops
> >> possible.
> >>
> >> --
> >> HTH
> >> --
> >> Kalen Delaney
> >> SQL Server MVP
> >> www.SolidQualityLearning.com
> >>
> >>
> >> "Leila" <lelas@.hotpop.com> wrote in message
> >> news:%23t9lNSQoEHA.2340@.TK2MSFTNGP10.phx.gbl...
> >> > Hi Kalen,
> >> > Thanks for your suggestion.
> >> > I'm a little confused about the difference between Hash Match and
> >> > Nested
> >> > Loops. As far as I learned from BOL, in Hash Match, the hash values
are
> >> > moved from the base table to a new place in memory(called hash
table),
> >> > then
> >> > an operation like nested loop happens between hash table and another
> >> > table.
> >> > In nested loops, no value is moved from the base table, instead the
> >> > loop
> >> > begins (with no hash table in between) directly with other table.
> >> > It seems the only difference is the existence of hash table in
between,
> > is
> >> > that true?
> >> > Thanks again,
> >> > Leila
> >> >
> >> >
> >> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> >> > news:euCtqpPoEHA.3460@.TK2MSFTNGP10.phx.gbl...
> >> >> Hi Leila
> >> >>
> >> >> For your query tuning, it shouldn't matter what the actual hash
values
> >> > are.
> >> >> If possible, you should try to build an index that will allow SQL
> > Server
> >> > to
> >> >> perform a different join technique than hashing.
> >> >>
> >> >> Microsoft does not document any details of the hash functions they
use
> >> >> for
> >> >> processing hash join operations. If you want to know more about
> >> >> hashing
> >> >> in
> >> >> general, read "The Art of Computer Programming -- Volume 3: Sorting
> >> >> and
> >> >> Searching" by Donald Knuth.
> >> >>
> >> >> --
> >> >> HTH
> >> >> --
> >> >> Kalen Delaney
> >> >> SQL Server MVP
> >> >> www.SolidQualityLearning.com
> >> >>
> >> >>
> >> >> "Leila" <lelas@.hotpop.com> wrote in message
> >> >> news:%23W3h0HPoEHA.324@.TK2MSFTNGP11.phx.gbl...
> >> >> > Hi,
> >> >> > In hash joins, how the hash value is computed? For example in this
> >> > query:
> >> >> >
> >> >> > SET SHOWPLAN_ALL ON
> >> >> > select c.customerid ,o.orderid, o.shipcountry from
> >> >> > customers c right outer join orders o
> >> >> > on c.customerid=o.customerid
> >> >> > and o.shipcountry='germany'
> >> >> >
> >> >> > How the fields those appear in HASH:() predicate help to create
hash
> >> >> > values?
> >> >> > I think my problem is that I don't know that what the hash value
is.
> >> >> > Thanks,
> >> >> > Leila
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >
> >> >
> >> >
> >> >
> >>
> >>
> >
> >
>|||A nested loop is when the inner table is processed completely for each row
of the outer table.
For hash joins the inner table is read once to build the hash table, and
then not touched again. Then each row of the outer table leads to a single
access of the hash table.
--
HTH
--
Kalen Delaney
SQL Server MVP
www.SolidQualityLearning.com
"Leila" <lelas@.hotpop.com> wrote in message
news:OfY1jORoEHA.3760@.TK2MSFTNGP12.phx.gbl...
> Kalen,
> When the hash table is ready, will there be something like nested loop to
> match rows? Because Mark described that the bottom table is
> scanned once (not in a nested loop).
> Leila
>
> "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> news:u26#jDRoEHA.2900@.TK2MSFTNGP12.phx.gbl...
>> The 'inner' table is whichever one is chosen by the SQL Server optimizer
> to
>> build the hash table. Typically this will be the smaller one, but not
>> always.
>> For BOL to say the smaller of the two is the build input is a bit of an
>> overgeneralization.
>> --
>> HTH
>> --
>> Kalen Delaney
>> SQL Server MVP
>> www.SolidQualityLearning.com
>>
>> "Leila" <lelas@.hotpop.com> wrote in message
>> news:eejiM5QoEHA.3788@.TK2MSFTNGP10.phx.gbl...
>> > Thanks Kalen!
>> > You mentioned 'the data in the inner table is organized into a hash
>> > table'.
>> > I read in BOL 'the smaller of the two inputs is the build input'.
>> > Are they different?
>> >
>> >
>> >
>> >
>> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
>> > news:ODudYpQoEHA.260@.TK2MSFTNGP10.phx.gbl...
>> >> The 'only' difference is a very expensive one.
>> >> If you have an index, SQL Server can take a value from the outer table
>> >> and
>> >> use the index to find matching rows in the inner table.
>> >>
>> >> With a hash match, which is used because there IS no useful index, the
>> > data
>> >> in the inner table is organized into a hash table, so that SQL Server
> can
>> >> find matching rows using the hash table instead of an index.
>> >> Al though the inner table is scanned only once, the process of
>> >> building
>> > the
>> >> hash table is resource intensive, and the hash table uses a lot of
> memory
>> >> for a big table.
>> >>
>> >> You're better off building a good index to make the nested loops
>> >> possible.
>> >>
>> >> --
>> >> HTH
>> >> --
>> >> Kalen Delaney
>> >> SQL Server MVP
>> >> www.SolidQualityLearning.com
>> >>
>> >>
>> >> "Leila" <lelas@.hotpop.com> wrote in message
>> >> news:%23t9lNSQoEHA.2340@.TK2MSFTNGP10.phx.gbl...
>> >> > Hi Kalen,
>> >> > Thanks for your suggestion.
>> >> > I'm a little confused about the difference between Hash Match and
>> >> > Nested
>> >> > Loops. As far as I learned from BOL, in Hash Match, the hash values
> are
>> >> > moved from the base table to a new place in memory(called hash
> table),
>> >> > then
>> >> > an operation like nested loop happens between hash table and another
>> >> > table.
>> >> > In nested loops, no value is moved from the base table, instead the
>> >> > loop
>> >> > begins (with no hash table in between) directly with other table.
>> >> > It seems the only difference is the existence of hash table in
> between,
>> > is
>> >> > that true?
>> >> > Thanks again,
>> >> > Leila
>> >> >
>> >> >
>> >> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
>> >> > news:euCtqpPoEHA.3460@.TK2MSFTNGP10.phx.gbl...
>> >> >> Hi Leila
>> >> >>
>> >> >> For your query tuning, it shouldn't matter what the actual hash
> values
>> >> > are.
>> >> >> If possible, you should try to build an index that will allow SQL
>> > Server
>> >> > to
>> >> >> perform a different join technique than hashing.
>> >> >>
>> >> >> Microsoft does not document any details of the hash functions they
> use
>> >> >> for
>> >> >> processing hash join operations. If you want to know more about
>> >> >> hashing
>> >> >> in
>> >> >> general, read "The Art of Computer Programming -- Volume 3: Sorting
>> >> >> and
>> >> >> Searching" by Donald Knuth.
>> >> >>
>> >> >> --
>> >> >> HTH
>> >> >> --
>> >> >> Kalen Delaney
>> >> >> SQL Server MVP
>> >> >> www.SolidQualityLearning.com
>> >> >>
>> >> >>
>> >> >> "Leila" <lelas@.hotpop.com> wrote in message
>> >> >> news:%23W3h0HPoEHA.324@.TK2MSFTNGP11.phx.gbl...
>> >> >> > Hi,
>> >> >> > In hash joins, how the hash value is computed? For example in
>> >> >> > this
>> >> > query:
>> >> >> >
>> >> >> > SET SHOWPLAN_ALL ON
>> >> >> > select c.customerid ,o.orderid, o.shipcountry from
>> >> >> > customers c right outer join orders o
>> >> >> > on c.customerid=o.customerid
>> >> >> > and o.shipcountry='germany'
>> >> >> >
>> >> >> > How the fields those appear in HASH:() predicate help to create
> hash
>> >> >> > values?
>> >> >> > I think my problem is that I don't know that what the hash value
> is.
>> >> >> > Thanks,
>> >> >> > Leila
>> >> >> >
>> >> >> >
>> >> >>
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> >
>> >>
>> >>
>> >
>> >
>>
>|||Does the hash table have an strucnture like index? If it doesn't, I think
nested loop is inevitable for matching rows between hash table and the probe
table.
"Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
news:eQJ6HbRoEHA.2108@.TK2MSFTNGP10.phx.gbl...
> A nested loop is when the inner table is processed completely for each
row
> of the outer table.
> For hash joins the inner table is read once to build the hash table, and
> then not touched again. Then each row of the outer table leads to a single
> access of the hash table.
> --
> HTH
> --
> Kalen Delaney
> SQL Server MVP
> www.SolidQualityLearning.com
>
> "Leila" <lelas@.hotpop.com> wrote in message
> news:OfY1jORoEHA.3760@.TK2MSFTNGP12.phx.gbl...
> > Kalen,
> > When the hash table is ready, will there be something like nested loop
to
> > match rows? Because Mark described that the bottom table is
> > scanned once (not in a nested loop).
> > Leila
> >
> >
> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> > news:u26#jDRoEHA.2900@.TK2MSFTNGP12.phx.gbl...
> >> The 'inner' table is whichever one is chosen by the SQL Server
optimizer
> > to
> >> build the hash table. Typically this will be the smaller one, but not
> >> always.
> >> For BOL to say the smaller of the two is the build input is a bit of an
> >> overgeneralization.
> >>
> >> --
> >> HTH
> >> --
> >> Kalen Delaney
> >> SQL Server MVP
> >> www.SolidQualityLearning.com
> >>
> >>
> >> "Leila" <lelas@.hotpop.com> wrote in message
> >> news:eejiM5QoEHA.3788@.TK2MSFTNGP10.phx.gbl...
> >> > Thanks Kalen!
> >> > You mentioned 'the data in the inner table is organized into a hash
> >> > table'.
> >> > I read in BOL 'the smaller of the two inputs is the build input'.
> >> > Are they different?
> >> >
> >> >
> >> >
> >> >
> >> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> >> > news:ODudYpQoEHA.260@.TK2MSFTNGP10.phx.gbl...
> >> >> The 'only' difference is a very expensive one.
> >> >> If you have an index, SQL Server can take a value from the outer
table
> >> >> and
> >> >> use the index to find matching rows in the inner table.
> >> >>
> >> >> With a hash match, which is used because there IS no useful index,
the
> >> > data
> >> >> in the inner table is organized into a hash table, so that SQL
Server
> > can
> >> >> find matching rows using the hash table instead of an index.
> >> >> Al though the inner table is scanned only once, the process of
> >> >> building
> >> > the
> >> >> hash table is resource intensive, and the hash table uses a lot of
> > memory
> >> >> for a big table.
> >> >>
> >> >> You're better off building a good index to make the nested loops
> >> >> possible.
> >> >>
> >> >> --
> >> >> HTH
> >> >> --
> >> >> Kalen Delaney
> >> >> SQL Server MVP
> >> >> www.SolidQualityLearning.com
> >> >>
> >> >>
> >> >> "Leila" <lelas@.hotpop.com> wrote in message
> >> >> news:%23t9lNSQoEHA.2340@.TK2MSFTNGP10.phx.gbl...
> >> >> > Hi Kalen,
> >> >> > Thanks for your suggestion.
> >> >> > I'm a little confused about the difference between Hash Match and
> >> >> > Nested
> >> >> > Loops. As far as I learned from BOL, in Hash Match, the hash
values
> > are
> >> >> > moved from the base table to a new place in memory(called hash
> > table),
> >> >> > then
> >> >> > an operation like nested loop happens between hash table and
another
> >> >> > table.
> >> >> > In nested loops, no value is moved from the base table, instead
the
> >> >> > loop
> >> >> > begins (with no hash table in between) directly with other table.
> >> >> > It seems the only difference is the existence of hash table in
> > between,
> >> > is
> >> >> > that true?
> >> >> > Thanks again,
> >> >> > Leila
> >> >> >
> >> >> >
> >> >> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> >> >> > news:euCtqpPoEHA.3460@.TK2MSFTNGP10.phx.gbl...
> >> >> >> Hi Leila
> >> >> >>
> >> >> >> For your query tuning, it shouldn't matter what the actual hash
> > values
> >> >> > are.
> >> >> >> If possible, you should try to build an index that will allow SQL
> >> > Server
> >> >> > to
> >> >> >> perform a different join technique than hashing.
> >> >> >>
> >> >> >> Microsoft does not document any details of the hash functions
they
> > use
> >> >> >> for
> >> >> >> processing hash join operations. If you want to know more about
> >> >> >> hashing
> >> >> >> in
> >> >> >> general, read "The Art of Computer Programming -- Volume 3:
Sorting
> >> >> >> and
> >> >> >> Searching" by Donald Knuth.
> >> >> >>
> >> >> >> --
> >> >> >> HTH
> >> >> >> --
> >> >> >> Kalen Delaney
> >> >> >> SQL Server MVP
> >> >> >> www.SolidQualityLearning.com
> >> >> >>
> >> >> >>
> >> >> >> "Leila" <lelas@.hotpop.com> wrote in message
> >> >> >> news:%23W3h0HPoEHA.324@.TK2MSFTNGP11.phx.gbl...
> >> >> >> > Hi,
> >> >> >> > In hash joins, how the hash value is computed? For example in
> >> >> >> > this
> >> >> > query:
> >> >> >> >
> >> >> >> > SET SHOWPLAN_ALL ON
> >> >> >> > select c.customerid ,o.orderid, o.shipcountry from
> >> >> >> > customers c right outer join orders o
> >> >> >> > on c.customerid=o.customerid
> >> >> >> > and o.shipcountry='germany'
> >> >> >> >
> >> >> >> > How the fields those appear in HASH:() predicate help to create
> > hash
> >> >> >> > values?
> >> >> >> > I think my problem is that I don't know that what the hash
value
> > is.
> >> >> >> > Thanks,
> >> >> >> > Leila
> >> >> >> >
> >> >> >> >
> >> >> >>
> >> >> >>
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >
> >> >
> >>
> >>
> >
> >
>|||For each row in the probe table, a hash value is calculated based on the join key. Then SQL Server
looks in the hash bucked from the build table to see if there is any match. The key (no pun
intended) here is that the build table is splitted up into a lot of buckets, and for the other
table, SQL server only have to look in a specific bucket to find if there's a match.
--
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
"Leila" <lelas@.hotpop.com> wrote in message news:uRd8FNWoEHA.3488@.TK2MSFTNGP12.phx.gbl...
> Does the hash table have an strucnture like index? If it doesn't, I think
> nested loop is inevitable for matching rows between hash table and the probe
> table.
>
>
> "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> news:eQJ6HbRoEHA.2108@.TK2MSFTNGP10.phx.gbl...
> > A nested loop is when the inner table is processed completely for each
> row
> > of the outer table.
> >
> > For hash joins the inner table is read once to build the hash table, and
> > then not touched again. Then each row of the outer table leads to a single
> > access of the hash table.
> >
> > --
> > HTH
> > --
> > Kalen Delaney
> > SQL Server MVP
> > www.SolidQualityLearning.com
> >
> >
> > "Leila" <lelas@.hotpop.com> wrote in message
> > news:OfY1jORoEHA.3760@.TK2MSFTNGP12.phx.gbl...
> > > Kalen,
> > > When the hash table is ready, will there be something like nested loop
> to
> > > match rows? Because Mark described that the bottom table is
> > > scanned once (not in a nested loop).
> > > Leila
> > >
> > >
> > > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> > > news:u26#jDRoEHA.2900@.TK2MSFTNGP12.phx.gbl...
> > >> The 'inner' table is whichever one is chosen by the SQL Server
> optimizer
> > > to
> > >> build the hash table. Typically this will be the smaller one, but not
> > >> always.
> > >> For BOL to say the smaller of the two is the build input is a bit of an
> > >> overgeneralization.
> > >>
> > >> --
> > >> HTH
> > >> --
> > >> Kalen Delaney
> > >> SQL Server MVP
> > >> www.SolidQualityLearning.com
> > >>
> > >>
> > >> "Leila" <lelas@.hotpop.com> wrote in message
> > >> news:eejiM5QoEHA.3788@.TK2MSFTNGP10.phx.gbl...
> > >> > Thanks Kalen!
> > >> > You mentioned 'the data in the inner table is organized into a hash
> > >> > table'.
> > >> > I read in BOL 'the smaller of the two inputs is the build input'.
> > >> > Are they different?
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> > >> > news:ODudYpQoEHA.260@.TK2MSFTNGP10.phx.gbl...
> > >> >> The 'only' difference is a very expensive one.
> > >> >> If you have an index, SQL Server can take a value from the outer
> table
> > >> >> and
> > >> >> use the index to find matching rows in the inner table.
> > >> >>
> > >> >> With a hash match, which is used because there IS no useful index,
> the
> > >> > data
> > >> >> in the inner table is organized into a hash table, so that SQL
> Server
> > > can
> > >> >> find matching rows using the hash table instead of an index.
> > >> >> Al though the inner table is scanned only once, the process of
> > >> >> building
> > >> > the
> > >> >> hash table is resource intensive, and the hash table uses a lot of
> > > memory
> > >> >> for a big table.
> > >> >>
> > >> >> You're better off building a good index to make the nested loops
> > >> >> possible.
> > >> >>
> > >> >> --
> > >> >> HTH
> > >> >> --
> > >> >> Kalen Delaney
> > >> >> SQL Server MVP
> > >> >> www.SolidQualityLearning.com
> > >> >>
> > >> >>
> > >> >> "Leila" <lelas@.hotpop.com> wrote in message
> > >> >> news:%23t9lNSQoEHA.2340@.TK2MSFTNGP10.phx.gbl...
> > >> >> > Hi Kalen,
> > >> >> > Thanks for your suggestion.
> > >> >> > I'm a little confused about the difference between Hash Match and
> > >> >> > Nested
> > >> >> > Loops. As far as I learned from BOL, in Hash Match, the hash
> values
> > > are
> > >> >> > moved from the base table to a new place in memory(called hash
> > > table),
> > >> >> > then
> > >> >> > an operation like nested loop happens between hash table and
> another
> > >> >> > table.
> > >> >> > In nested loops, no value is moved from the base table, instead
> the
> > >> >> > loop
> > >> >> > begins (with no hash table in between) directly with other table.
> > >> >> > It seems the only difference is the existence of hash table in
> > > between,
> > >> > is
> > >> >> > that true?
> > >> >> > Thanks again,
> > >> >> > Leila
> > >> >> >
> > >> >> >
> > >> >> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> > >> >> > news:euCtqpPoEHA.3460@.TK2MSFTNGP10.phx.gbl...
> > >> >> >> Hi Leila
> > >> >> >>
> > >> >> >> For your query tuning, it shouldn't matter what the actual hash
> > > values
> > >> >> > are.
> > >> >> >> If possible, you should try to build an index that will allow SQL
> > >> > Server
> > >> >> > to
> > >> >> >> perform a different join technique than hashing.
> > >> >> >>
> > >> >> >> Microsoft does not document any details of the hash functions
> they
> > > use
> > >> >> >> for
> > >> >> >> processing hash join operations. If you want to know more about
> > >> >> >> hashing
> > >> >> >> in
> > >> >> >> general, read "The Art of Computer Programming -- Volume 3:
> Sorting
> > >> >> >> and
> > >> >> >> Searching" by Donald Knuth.
> > >> >> >>
> > >> >> >> --
> > >> >> >> HTH
> > >> >> >> --
> > >> >> >> Kalen Delaney
> > >> >> >> SQL Server MVP
> > >> >> >> www.SolidQualityLearning.com
> > >> >> >>
> > >> >> >>
> > >> >> >> "Leila" <lelas@.hotpop.com> wrote in message
> > >> >> >> news:%23W3h0HPoEHA.324@.TK2MSFTNGP11.phx.gbl...
> > >> >> >> > Hi,
> > >> >> >> > In hash joins, how the hash value is computed? For example in
> > >> >> >> > this
> > >> >> > query:
> > >> >> >> >
> > >> >> >> > SET SHOWPLAN_ALL ON
> > >> >> >> > select c.customerid ,o.orderid, o.shipcountry from
> > >> >> >> > customers c right outer join orders o
> > >> >> >> > on c.customerid=o.customerid
> > >> >> >> > and o.shipcountry='germany'
> > >> >> >> >
> > >> >> >> > How the fields those appear in HASH:() predicate help to create
> > > hash
> > >> >> >> > values?
> > >> >> >> > I think my problem is that I don't know that what the hash
> value
> > > is.
> > >> >> >> > Thanks,
> > >> >> >> > Leila
> > >> >> >> >
> > >> >> >> >
> > >> >> >>
> > >> >> >>
> > >> >> >
> > >> >> >
> > >> >> >
> > >> >> >
> > >> >>
> > >> >>
> > >> >
> > >> >
> > >>
> > >>
> > >
> > >
> >
> >
>|||Thanks Tibor!
What I cannot understand is that what the meaning of "calculating hash value
based on join key" is.
Because join key is only the name of two fields plus an operator between
them, it doesn't have any value itself (to be calculated).
Leila
"Tibor Karaszi" <tibor_please.no.email_karaszi@.hotmail.nomail.com> wrote in
message news:eZi#TaWoEHA.1776@.TK2MSFTNGP14.phx.gbl...
> For each row in the probe table, a hash value is calculated based on the
join key. Then SQL Server
> looks in the hash bucked from the build table to see if there is any
match. The key (no pun
> intended) here is that the build table is splitted up into a lot of
buckets, and for the other
> table, SQL server only have to look in a specific bucket to find if
there's a match.
> --
> Tibor Karaszi, SQL Server MVP
> http://www.karaszi.com/sqlserver/default.asp
> http://www.solidqualitylearning.com/
>
> "Leila" <lelas@.hotpop.com> wrote in message
news:uRd8FNWoEHA.3488@.TK2MSFTNGP12.phx.gbl...
> > Does the hash table have an strucnture like index? If it doesn't, I
think
> > nested loop is inevitable for matching rows between hash table and the
probe
> > table.
> >
> >
> >
> >
> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> > news:eQJ6HbRoEHA.2108@.TK2MSFTNGP10.phx.gbl...
> > > A nested loop is when the inner table is processed completely for
each
> > row
> > > of the outer table.
> > >
> > > For hash joins the inner table is read once to build the hash table,
and
> > > then not touched again. Then each row of the outer table leads to a
single
> > > access of the hash table.
> > >
> > > --
> > > HTH
> > > --
> > > Kalen Delaney
> > > SQL Server MVP
> > > www.SolidQualityLearning.com
> > >
> > >
> > > "Leila" <lelas@.hotpop.com> wrote in message
> > > news:OfY1jORoEHA.3760@.TK2MSFTNGP12.phx.gbl...
> > > > Kalen,
> > > > When the hash table is ready, will there be something like nested
loop
> > to
> > > > match rows? Because Mark described that the bottom table is
> > > > scanned once (not in a nested loop).
> > > > Leila
> > > >
> > > >
> > > > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> > > > news:u26#jDRoEHA.2900@.TK2MSFTNGP12.phx.gbl...
> > > >> The 'inner' table is whichever one is chosen by the SQL Server
> > optimizer
> > > > to
> > > >> build the hash table. Typically this will be the smaller one, but
not
> > > >> always.
> > > >> For BOL to say the smaller of the two is the build input is a bit
of an
> > > >> overgeneralization.
> > > >>
> > > >> --
> > > >> HTH
> > > >> --
> > > >> Kalen Delaney
> > > >> SQL Server MVP
> > > >> www.SolidQualityLearning.com
> > > >>
> > > >>
> > > >> "Leila" <lelas@.hotpop.com> wrote in message
> > > >> news:eejiM5QoEHA.3788@.TK2MSFTNGP10.phx.gbl...
> > > >> > Thanks Kalen!
> > > >> > You mentioned 'the data in the inner table is organized into a
hash
> > > >> > table'.
> > > >> > I read in BOL 'the smaller of the two inputs is the build input'.
> > > >> > Are they different?
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> > > >> > news:ODudYpQoEHA.260@.TK2MSFTNGP10.phx.gbl...
> > > >> >> The 'only' difference is a very expensive one.
> > > >> >> If you have an index, SQL Server can take a value from the outer
> > table
> > > >> >> and
> > > >> >> use the index to find matching rows in the inner table.
> > > >> >>
> > > >> >> With a hash match, which is used because there IS no useful
index,
> > the
> > > >> > data
> > > >> >> in the inner table is organized into a hash table, so that SQL
> > Server
> > > > can
> > > >> >> find matching rows using the hash table instead of an index.
> > > >> >> Al though the inner table is scanned only once, the process of
> > > >> >> building
> > > >> > the
> > > >> >> hash table is resource intensive, and the hash table uses a lot
of
> > > > memory
> > > >> >> for a big table.
> > > >> >>
> > > >> >> You're better off building a good index to make the nested loops
> > > >> >> possible.
> > > >> >>
> > > >> >> --
> > > >> >> HTH
> > > >> >> --
> > > >> >> Kalen Delaney
> > > >> >> SQL Server MVP
> > > >> >> www.SolidQualityLearning.com
> > > >> >>
> > > >> >>
> > > >> >> "Leila" <lelas@.hotpop.com> wrote in message
> > > >> >> news:%23t9lNSQoEHA.2340@.TK2MSFTNGP10.phx.gbl...
> > > >> >> > Hi Kalen,
> > > >> >> > Thanks for your suggestion.
> > > >> >> > I'm a little confused about the difference between Hash Match
and
> > > >> >> > Nested
> > > >> >> > Loops. As far as I learned from BOL, in Hash Match, the hash
> > values
> > > > are
> > > >> >> > moved from the base table to a new place in memory(called hash
> > > > table),
> > > >> >> > then
> > > >> >> > an operation like nested loop happens between hash table and
> > another
> > > >> >> > table.
> > > >> >> > In nested loops, no value is moved from the base table,
instead
> > the
> > > >> >> > loop
> > > >> >> > begins (with no hash table in between) directly with other
table.
> > > >> >> > It seems the only difference is the existence of hash table in
> > > > between,
> > > >> > is
> > > >> >> > that true?
> > > >> >> > Thanks again,
> > > >> >> > Leila
> > > >> >> >
> > > >> >> >
> > > >> >> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in
message
> > > >> >> > news:euCtqpPoEHA.3460@.TK2MSFTNGP10.phx.gbl...
> > > >> >> >> Hi Leila
> > > >> >> >>
> > > >> >> >> For your query tuning, it shouldn't matter what the actual
hash
> > > > values
> > > >> >> > are.
> > > >> >> >> If possible, you should try to build an index that will allow
SQL
> > > >> > Server
> > > >> >> > to
> > > >> >> >> perform a different join technique than hashing.
> > > >> >> >>
> > > >> >> >> Microsoft does not document any details of the hash functions
> > they
> > > > use
> > > >> >> >> for
> > > >> >> >> processing hash join operations. If you want to know more
about
> > > >> >> >> hashing
> > > >> >> >> in
> > > >> >> >> general, read "The Art of Computer Programming -- Volume 3:
> > Sorting
> > > >> >> >> and
> > > >> >> >> Searching" by Donald Knuth.
> > > >> >> >>
> > > >> >> >> --
> > > >> >> >> HTH
> > > >> >> >> --
> > > >> >> >> Kalen Delaney
> > > >> >> >> SQL Server MVP
> > > >> >> >> www.SolidQualityLearning.com
> > > >> >> >>
> > > >> >> >>
> > > >> >> >> "Leila" <lelas@.hotpop.com> wrote in message
> > > >> >> >> news:%23W3h0HPoEHA.324@.TK2MSFTNGP11.phx.gbl...
> > > >> >> >> > Hi,
> > > >> >> >> > In hash joins, how the hash value is computed? For example
in
> > > >> >> >> > this
> > > >> >> > query:
> > > >> >> >> >
> > > >> >> >> > SET SHOWPLAN_ALL ON
> > > >> >> >> > select c.customerid ,o.orderid, o.shipcountry from
> > > >> >> >> > customers c right outer join orders o
> > > >> >> >> > on c.customerid=o.customerid
> > > >> >> >> > and o.shipcountry='germany'
> > > >> >> >> >
> > > >> >> >> > How the fields those appear in HASH:() predicate help to
create
> > > > hash
> > > >> >> >> > values?
> > > >> >> >> > I think my problem is that I don't know that what the hash
> > value
> > > > is.
> > > >> >> >> > Thanks,
> > > >> >> >> > Leila
> > > >> >> >> >
> > > >> >> >> >
> > > >> >> >>
> > > >> >> >>
> > > >> >> >
> > > >> >> >
> > > >> >> >
> > > >> >> >
> > > >> >>
> > > >> >>
> > > >> >
> > > >> >
> > > >>
> > > >>
> > > >
> > > >
> > >
> > >
> >
> >
>|||You don't need to understand it to tune your queries.
If you want to understand what hashing is all about, I suggest you take a
look at the reference at the beginning of the thread, or use google to
search for generic informaiton about hashing.
A join key is a column in one table that is matched with a column in another
table, Both tables then have join keys.
It sounds like you're describing a 'join expression'.
--
HTH
--
Kalen Delaney
SQL Server MVP
www.SolidQualityLearning.com
"Leila" <lelas@.hotpop.com> wrote in message
news:e6846sWoEHA.3792@.TK2MSFTNGP11.phx.gbl...
> Thanks Tibor!
> What I cannot understand is that what the meaning of "calculating hash
> value
> based on join key" is.
> Because join key is only the name of two fields plus an operator between
> them, it doesn't have any value itself (to be calculated).
> Leila
>
> "Tibor Karaszi" <tibor_please.no.email_karaszi@.hotmail.nomail.com> wrote
> in
> message news:eZi#TaWoEHA.1776@.TK2MSFTNGP14.phx.gbl...
>> For each row in the probe table, a hash value is calculated based on the
> join key. Then SQL Server
>> looks in the hash bucked from the build table to see if there is any
> match. The key (no pun
>> intended) here is that the build table is splitted up into a lot of
> buckets, and for the other
>> table, SQL server only have to look in a specific bucket to find if
> there's a match.
>> --
>> Tibor Karaszi, SQL Server MVP
>> http://www.karaszi.com/sqlserver/default.asp
>> http://www.solidqualitylearning.com/
>>
>> "Leila" <lelas@.hotpop.com> wrote in message
> news:uRd8FNWoEHA.3488@.TK2MSFTNGP12.phx.gbl...
>> > Does the hash table have an strucnture like index? If it doesn't, I
> think
>> > nested loop is inevitable for matching rows between hash table and the
> probe
>> > table.
>> >
>> >
>> >
>> >
>> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
>> > news:eQJ6HbRoEHA.2108@.TK2MSFTNGP10.phx.gbl...
>> > > A nested loop is when the inner table is processed completely for
> each
>> > row
>> > > of the outer table.
>> > >
>> > > For hash joins the inner table is read once to build the hash table,
> and
>> > > then not touched again. Then each row of the outer table leads to a
> single
>> > > access of the hash table.
>> > >
>> > > --
>> > > HTH
>> > > --
>> > > Kalen Delaney
>> > > SQL Server MVP
>> > > www.SolidQualityLearning.com
>> > >
>> > >
>> > > "Leila" <lelas@.hotpop.com> wrote in message
>> > > news:OfY1jORoEHA.3760@.TK2MSFTNGP12.phx.gbl...
>> > > > Kalen,
>> > > > When the hash table is ready, will there be something like nested
> loop
>> > to
>> > > > match rows? Because Mark described that the bottom table is
>> > > > scanned once (not in a nested loop).
>> > > > Leila
>> > > >
>> > > >
>> > > > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
>> > > > news:u26#jDRoEHA.2900@.TK2MSFTNGP12.phx.gbl...
>> > > >> The 'inner' table is whichever one is chosen by the SQL Server
>> > optimizer
>> > > > to
>> > > >> build the hash table. Typically this will be the smaller one, but
> not
>> > > >> always.
>> > > >> For BOL to say the smaller of the two is the build input is a bit
> of an
>> > > >> overgeneralization.
>> > > >>
>> > > >> --
>> > > >> HTH
>> > > >> --
>> > > >> Kalen Delaney
>> > > >> SQL Server MVP
>> > > >> www.SolidQualityLearning.com
>> > > >>
>> > > >>
>> > > >> "Leila" <lelas@.hotpop.com> wrote in message
>> > > >> news:eejiM5QoEHA.3788@.TK2MSFTNGP10.phx.gbl...
>> > > >> > Thanks Kalen!
>> > > >> > You mentioned 'the data in the inner table is organized into a
> hash
>> > > >> > table'.
>> > > >> > I read in BOL 'the smaller of the two inputs is the build
>> > > >> > input'.
>> > > >> > Are they different?
>> > > >> >
>> > > >> >
>> > > >> >
>> > > >> >
>> > > >> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
>> > > >> > news:ODudYpQoEHA.260@.TK2MSFTNGP10.phx.gbl...
>> > > >> >> The 'only' difference is a very expensive one.
>> > > >> >> If you have an index, SQL Server can take a value from the
>> > > >> >> outer
>> > table
>> > > >> >> and
>> > > >> >> use the index to find matching rows in the inner table.
>> > > >> >>
>> > > >> >> With a hash match, which is used because there IS no useful
> index,
>> > the
>> > > >> > data
>> > > >> >> in the inner table is organized into a hash table, so that SQL
>> > Server
>> > > > can
>> > > >> >> find matching rows using the hash table instead of an index.
>> > > >> >> Al though the inner table is scanned only once, the process of
>> > > >> >> building
>> > > >> > the
>> > > >> >> hash table is resource intensive, and the hash table uses a lot
> of
>> > > > memory
>> > > >> >> for a big table.
>> > > >> >>
>> > > >> >> You're better off building a good index to make the nested
>> > > >> >> loops
>> > > >> >> possible.
>> > > >> >>
>> > > >> >> --
>> > > >> >> HTH
>> > > >> >> --
>> > > >> >> Kalen Delaney
>> > > >> >> SQL Server MVP
>> > > >> >> www.SolidQualityLearning.com
>> > > >> >>
>> > > >> >>
>> > > >> >> "Leila" <lelas@.hotpop.com> wrote in message
>> > > >> >> news:%23t9lNSQoEHA.2340@.TK2MSFTNGP10.phx.gbl...
>> > > >> >> > Hi Kalen,
>> > > >> >> > Thanks for your suggestion.
>> > > >> >> > I'm a little confused about the difference between Hash Match
> and
>> > > >> >> > Nested
>> > > >> >> > Loops. As far as I learned from BOL, in Hash Match, the hash
>> > values
>> > > > are
>> > > >> >> > moved from the base table to a new place in memory(called
>> > > >> >> > hash
>> > > > table),
>> > > >> >> > then
>> > > >> >> > an operation like nested loop happens between hash table and
>> > another
>> > > >> >> > table.
>> > > >> >> > In nested loops, no value is moved from the base table,
> instead
>> > the
>> > > >> >> > loop
>> > > >> >> > begins (with no hash table in between) directly with other
> table.
>> > > >> >> > It seems the only difference is the existence of hash table
>> > > >> >> > in
>> > > > between,
>> > > >> > is
>> > > >> >> > that true?
>> > > >> >> > Thanks again,
>> > > >> >> > Leila
>> > > >> >> >
>> > > >> >> >
>> > > >> >> > "Kalen Delaney" <replies@.public_newsgroups.com> wrote in
> message
>> > > >> >> > news:euCtqpPoEHA.3460@.TK2MSFTNGP10.phx.gbl...
>> > > >> >> >> Hi Leila
>> > > >> >> >>
>> > > >> >> >> For your query tuning, it shouldn't matter what the actual
> hash
>> > > > values
>> > > >> >> > are.
>> > > >> >> >> If possible, you should try to build an index that will
>> > > >> >> >> allow
> SQL
>> > > >> > Server
>> > > >> >> > to
>> > > >> >> >> perform a different join technique than hashing.
>> > > >> >> >>
>> > > >> >> >> Microsoft does not document any details of the hash
>> > > >> >> >> functions
>> > they
>> > > > use
>> > > >> >> >> for
>> > > >> >> >> processing hash join operations. If you want to know more
> about
>> > > >> >> >> hashing
>> > > >> >> >> in
>> > > >> >> >> general, read "The Art of Computer Programming -- Volume 3:
>> > Sorting
>> > > >> >> >> and
>> > > >> >> >> Searching" by Donald Knuth.
>> > > >> >> >>
>> > > >> >> >> --
>> > > >> >> >> HTH
>> > > >> >> >> --
>> > > >> >> >> Kalen Delaney
>> > > >> >> >> SQL Server MVP
>> > > >> >> >> www.SolidQualityLearning.com
>> > > >> >> >>
>> > > >> >> >>
>> > > >> >> >> "Leila" <lelas@.hotpop.com> wrote in message
>> > > >> >> >> news:%23W3h0HPoEHA.324@.TK2MSFTNGP11.phx.gbl...
>> > > >> >> >> > Hi,
>> > > >> >> >> > In hash joins, how the hash value is computed? For example
> in
>> > > >> >> >> > this
>> > > >> >> > query:
>> > > >> >> >> >
>> > > >> >> >> > SET SHOWPLAN_ALL ON
>> > > >> >> >> > select c.customerid ,o.orderid, o.shipcountry from
>> > > >> >> >> > customers c right outer join orders o
>> > > >> >> >> > on c.customerid=o.customerid
>> > > >> >> >> > and o.shipcountry='germany'
>> > > >> >> >> >
>> > > >> >> >> > How the fields those appear in HASH:() predicate help to
> create
>> > > > hash
>> > > >> >> >> > values?
>> > > >> >> >> > I think my problem is that I don't know that what the hash
>> > value
>> > > > is.
>> > > >> >> >> > Thanks,
>> > > >> >> >> > Leila
>> > > >> >> >> >
>> > > >> >> >> >
>> > > >> >> >>
>> > > >> >> >>
>> > > >> >> >
>> > > >> >> >
>> > > >> >> >
>> > > >> >> >
>> > > >> >>
>> > > >> >>
>> > > >> >
>> > > >> >
>> > > >>
>> > > >>
>> > > >
>> > > >
>> > >
>> > >
>> >
>> >
>>
>

Tuesday, March 27, 2012

Computing hash values

No comments:

Post a Comment

Sql Comparing two tables

Blog Archive

About Me