Sql Comparing two tables: Complex Query help needed fast!

The goal is to take the low and high values out of each record, then get the
average of the remaining fields. For ex, ID1 has five values, remove 2 and
100 then add (40+20+10)/3...then do this for each record. The number of
values in the five fields can vary as shown below.
I have a table that looks like this:
ID F1 F2 F3 F4 F5 field names
1 100 40 20 2 10 Values
2 4 140 10 42
3 10 189 22
4 20 10 24 332 3
Thanks in advance for this query.One way is to use the unpivot operator. It's tested in MSSQL2005.
select id, (sum(f) - max(f) - min(f))/(count(*) - 2.0) avg
from (select id, f from test unpivot (f for t in (f1,f2,f3,f4,f5)) p) t
group by id
"KT" <ktdev@.hotmail.com> wrote in message
news:%23RBOH3BPGHA.456@.TK2MSFTNGP15.phx.gbl...
> The goal is to take the low and high values out of each record, then get
> the
> average of the remaining fields. For ex, ID1 has five values, remove 2
> and
> 100 then add (40+20+10)/3...then do this for each record. The number of
> values in the five fields can vary as shown below.
>
>
> I have a table that looks like this:
>
> ID F1 F2 F3 F4 F5 field names
> 1 100 40 20 2 10 Values
> 2 4 140 10 42
> 3 10 189 22
>
> 4 20 10 24 332 3
>
>
> Thanks in advance for this query.|||See if this helps. I'm assuming the blank column values contain NULL.
If they contain something else, you'll have to change F1 through F5 to
CASE WHEN F1 = <whatever the blank is> THEN NULL ELSE F1 END,
and so on.
select
ID, (SUM(Fval)-MAX(Fval)-MIN(Fval))/(COUNT(Fval)-2) as trimMean
from (
select
ID,
ColNum,
case ColNum
when 1 then F1
when 2 then F2
when 3 then F3
when 4 then F4
when 5 then F5
end*1.0
from yourTable
cross join (
select 1 union all select 2 union all
select 3 union all select 4 union all select 5
) as F(ColNum)
) as T(ID,ColNum,Fval)
group by ID
go
Steve Kass
Drew University
KT wrote:

>The goal is to take the low and high values out of each record, then get th
e
>average of the remaining fields. For ex, ID1 has five values, remove 2 and
>100 then add (40+20+10)/3...then do this for each record. The number of
>values in the five fields can vary as shown below.
>
>
>I have a table that looks like this:
>
>ID F1 F2 F3 F4 F5 field names
>1 100 40 20 2 10 Values
>2 4 140 10 42
>3 10 189 22
>
>4 20 10 24 332 3
>
>
>Thanks in advance for this query.
>
>|||My solutions :) For the 2005 version, I like my solution for readability as
it is pretty straightforward. If you aren't using 2005, then Kass's is
really slick. Don't get me wrong, the person named mason's solution is
slicker than mine (and shorter) but I think that my solution is probably
more understandable later in the process. Either way, one of these'll do
you :)
Note that I use integer math, while Steve's uses floats. So my answers are
rounded off , while his arent
--create the table
create table looksLikeThis
(
id int primary key,
f1 int,
f2 int,
f3 int,
f4 int,
f5 int
)
insert into looksLikeThis
select 1, 100, 40, 20, 2, 10
union all
select 2, 4, 140, 10, 42, null
union all
select 3, 10, 189, 22,null, null
union all
select 4, 20, 10, 24, 332, 3
go
---
-- In 2005
---
--so much easier to do with the partition statement and the CTE. Allows the
first two
--views to be rolled up into one query pretty easy:
with shouldLookLikeThis as
(select *, row_number() over (partition by id order by value,uniqueifier )
as ordering
from (
select id, f1 as value, 1 as uniqueifier
from looksLikeThis
union all
select id, f2, 2
from looksLikeThis
union all
select id, f3, 3
from looksLikeThis
union all
select id, f4, 4
from looksLikeThis
union all
select id, f5, 5
from looksLikeThis ) as denorm
where value is not null)
select id, avg(value) as averageValue
from shouldLookLikeThis
where ordering not in (select max(ordering)
from shouldLookLikeThis s2
where s2.id = shouldLookLikeThis.id)
and ordering not in (select min(ordering)
from shouldLookLikeThis s2
where s2.id = shouldLookLikeThis.id)
group by id
-- Using 2000 and recent versions
--
--normalize the table, including some value to make sure of uniqueness (very
important to the query
--so you don't lose rows if the min or max have multiple values
create view shouldLookLikeThis
as
select *
from (
select id, f1 as value, 1 as uniqueifier
from looksLikeThis
union all
select id, f2, 2
from looksLikeThis
union all
select id, f3, 3
from looksLikeThis
union all
select id, f4, 4
from looksLikeThis
union all
select id, f5, 5
from looksLikeThis ) as denorm
where value is not null
go
--this view adds an ordering to the output (this is what the uniqueifier was
about
create view includeOrder
as
select id, value, (select count(*)
from shouldLookLikeThis s2
where shouldLookLikeThis.id = s2.id
and (shouldLookLikeThis.value <= s2.value
or (shouldLookLikeThis.value = s2.value
and shouldLookLikeThis.uniqueifier <=
s2.uniqueifier))
) as ordering
from shouldLookLikeThis
GO
--then just exclude the min and max orderwise
select id, avg(value) as averageValue
from includeOrder
where ordering not in (select max(ordering)
from includeOrder s2
where s2.id = includeOrder.id)
and ordering not in (select min(ordering)
from includeOrder s2
where s2.id = includeOrder.id)
group by id
go
----
Louis Davidson - http://spaces.msn.com/members/drsql/
SQL Server MVP
"Arguments are to be avoided: they are always vulgar and often convincing."
(Oscar Wilde)
"KT" <ktdev@.hotmail.com> wrote in message
news:%23RBOH3BPGHA.456@.TK2MSFTNGP15.phx.gbl...
> The goal is to take the low and high values out of each record, then get
> the
> average of the remaining fields. For ex, ID1 has five values, remove 2
> and
> 100 then add (40+20+10)/3...then do this for each record. The number of
> values in the five fields can vary as shown below.
>
>
> I have a table that looks like this:
>
> ID F1 F2 F3 F4 F5 field names
> 1 100 40 20 2 10 Values
> 2 4 140 10 42
> 3 10 189 22
>
> 4 20 10 24 332 3
>
>
> Thanks in advance for this query.
>|||I'm still learning MSSQL2005 features and the pivot/unpivot operators are an
interesting implementation. The idea is to transpose the rows into a column
so that we can use aggregate functions to calculate the average. If
"unpivot" is too unconventional, we can always use a case-based or a "union
all" derived table to achieve the same effect. I should add HAVING
COUNT(*)>2 to take care of the devide-by-0 situation.
"Louis Davidson" <dr_dontspamme_sql@.hotmail.com> wrote in message
news:eLIh84CPGHA.420@.tk2msftngp13.phx.gbl...
> My solutions :) For the 2005 version, I like my solution for readability
> as it is pretty straightforward. If you aren't using 2005, then Kass's is
> really slick. Don't get me wrong, the person named mason's solution is
> slicker than mine (and shorter) but I think that my solution is probably
> more understandable later in the process. Either way, one of these'll do
> you :)
> Note that I use integer math, while Steve's uses floats. So my answers
> are rounded off , while his arent
> --create the table
> create table looksLikeThis
> (
> id int primary key,
> f1 int,
> f2 int,
> f3 int,
> f4 int,
> f5 int
> )
> insert into looksLikeThis
> select 1, 100, 40, 20, 2, 10
> union all
> select 2, 4, 140, 10, 42, null
> union all
> select 3, 10, 189, 22,null, null
> union all
> select 4, 20, 10, 24, 332, 3
> go
> ---
> -- In 2005
> ---
> --so much easier to do with the partition statement and the CTE. Allows
> the first two
> --views to be rolled up into one query pretty easy:
> with shouldLookLikeThis as
> (select *, row_number() over (partition by id order by value,uniqueifier )
> as ordering
> from (
> select id, f1 as value, 1 as uniqueifier
> from looksLikeThis
> union all
> select id, f2, 2
> from looksLikeThis
> union all
> select id, f3, 3
> from looksLikeThis
> union all
> select id, f4, 4
> from looksLikeThis
> union all
> select id, f5, 5
> from looksLikeThis ) as denorm
> where value is not null)
> select id, avg(value) as averageValue
> from shouldLookLikeThis
> where ordering not in (select max(ordering)
> from shouldLookLikeThis s2
> where s2.id = shouldLookLikeThis.id)
> and ordering not in (select min(ordering)
> from shouldLookLikeThis s2
> where s2.id = shouldLookLikeThis.id)
> group by id
> --
> -- Using 2000 and recent versions
> --
> --normalize the table, including some value to make sure of uniqueness
> (very important to the query
> --so you don't lose rows if the min or max have multiple values
> create view shouldLookLikeThis
> as
> select *
> from (
> select id, f1 as value, 1 as uniqueifier
> from looksLikeThis
> union all
> select id, f2, 2
> from looksLikeThis
> union all
> select id, f3, 3
> from looksLikeThis
> union all
> select id, f4, 4
> from looksLikeThis
> union all
> select id, f5, 5
> from looksLikeThis ) as denorm
> where value is not null
> go
> --this view adds an ordering to the output (this is what the uniqueifier
> was about
> create view includeOrder
> as
> select id, value, (select count(*)
> from shouldLookLikeThis s2
> where shouldLookLikeThis.id = s2.id
> and (shouldLookLikeThis.value <= s2.value
> or (shouldLookLikeThis.value = s2.value
> and shouldLookLikeThis.uniqueifier <=
> s2.uniqueifier))
> ) as ordering
> from shouldLookLikeThis
> GO
> --then just exclude the min and max orderwise
> select id, avg(value) as averageValue
> from includeOrder
> where ordering not in (select max(ordering)
> from includeOrder s2
> where s2.id = includeOrder.id)
> and ordering not in (select min(ordering)
> from includeOrder s2
> where s2.id = includeOrder.id)
> group by id
> go
>
> --
> ----
--
> Louis Davidson - http://spaces.msn.com/members/drsql/
> SQL Server MVP
> "Arguments are to be avoided: they are always vulgar and often
> convincing."
> (Oscar Wilde)
> "KT" <ktdev@.hotmail.com> wrote in message
> news:%23RBOH3BPGHA.456@.TK2MSFTNGP15.phx.gbl...|||No, the unpivot thing isn't what is unconventional. It is the:
sum(f) - max(f) - min(f))/(count(*) - 2.0)
part that makes the brain work harder than I cared to think about last
night. This is actually a more elagant solution too because it handles
duplicates easier.

> The idea is to transpose the rows into a column so that we can use
> aggregate functions to calculate the average
This is because SQL works well with rows, not vectors, particularly not of
variable length like this. This is part of what the Basically all we are
doing is rotating the set to be a SQL table in first normal form which then
makes it natural.
I know might go with mason's solution over mine, but that is so not easy to
admit :)
----
Louis Davidson - http://spaces.msn.com/members/drsql/
SQL Server MVP
"Arguments are to be avoided: they are always vulgar and often convincing."
(Oscar Wilde)
"mason" <masonliu@.msn.com> wrote in message
news:et31s%23GPGHA.3460@.TK2MSFTNGP15.phx.gbl...
> I'm still learning MSSQL2005 features and the pivot/unpivot operators are
> an interesting implementation. The idea is to transpose the rows into a
> column so that we can use aggregate functions to calculate the average. If
> "unpivot" is too unconventional, we can always use a case-based or a
> "union all" derived table to achieve the same effect. I should add HAVING
> COUNT(*)>2 to take care of the devide-by-0 situation.
>
> "Louis Davidson" <dr_dontspamme_sql@.hotmail.com> wrote in message
> news:eLIh84CPGHA.420@.tk2msftngp13.phx.gbl...
>|||And 2.0 rather than 2 floats the whole thing. :p
In reality, it's often quantity over quality in people's eyes, and you are
right that readability may go with quantity in many cases. :p
"Louis Davidson" <dr_dontspamme_sql@.hotmail.com> wrote in message
news:%23$CSpDJPGHA.3360@.TK2MSFTNGP09.phx.gbl...
> No, the unpivot thing isn't what is unconventional. It is the:
> sum(f) - max(f) - min(f))/(count(*) - 2.0)
> part that makes the brain work harder than I cared to think about last
> night. This is actually a more elagant solution too because it handles
> duplicates easier.
>
> This is because SQL works well with rows, not vectors, particularly not of
> variable length like this. This is part of what the Basically all we are
> doing is rotating the set to be a SQL table in first normal form which
> then makes it natural.
> I know might go with mason's solution over mine, but that is so not easy
> to admit :)
> --
> ----
--
> Louis Davidson - http://spaces.msn.com/members/drsql/
> SQL Server MVP
> "Arguments are to be avoided: they are always vulgar and often
> convincing."
> (Oscar Wilde)
> "mason" <masonliu@.msn.com> wrote in message
> news:et31s%23GPGHA.3460@.TK2MSFTNGP15.phx.gbl...|||Until performance gets involved, it is almost always the case :)
----
Louis Davidson - http://spaces.msn.com/members/drsql/
SQL Server MVP
"Arguments are to be avoided: they are always vulgar and often convincing."
(Oscar Wilde)
"mason" <masonliu@.msn.com> wrote in message
news:uJEVfUJPGHA.3144@.TK2MSFTNGP11.phx.gbl...
> And 2.0 rather than 2 floats the whole thing. :p
> In reality, it's often quantity over quality in people's eyes, and you are
> right that readability may go with quantity in many cases. :p
>
> "Louis Davidson" <dr_dontspamme_sql@.hotmail.com> wrote in message
> news:%23$CSpDJPGHA.3360@.TK2MSFTNGP09.phx.gbl...
>|||This one worked perfectly...could you elaborate on what it is actually
doing.
Thanks Steve.
"Steve Kass" <skass@.drew.edu> wrote in message
news:Oz3EseCPGHA.2704@.TK2MSFTNGP15.phx.gbl...
> See if this helps. I'm assuming the blank column values contain NULL.
> If they contain something else, you'll have to change F1 through F5 to
> CASE WHEN F1 = <whatever the blank is> THEN NULL ELSE F1 END,
> and so on.
> select
> ID, (SUM(Fval)-MAX(Fval)-MIN(Fval))/(COUNT(Fval)-2) as trimMean
> from (
> select
> ID,
> ColNum,
> case ColNum
> when 1 then F1
> when 2 then F2
> when 3 then F3
> when 4 then F4
> when 5 then F5
> end*1.0
> from yourTable
> cross join (
> select 1 union all select 2 union all
> select 3 union all select 4 union all select 5
> ) as F(ColNum)
> ) as T(ID,ColNum,Fval)
> group by ID
> go
> Steve Kass
> Drew University
> KT wrote:
>
the
and|||The cross join turns each single row such as
ID F1 F2 F3 F4 F5
--
101 13 24 35 46 NULL
into five separate rows like this:
ID, ColNum, Fval
--
101 1 13
101 2 24
101 3 35
101 4 46
101 5 NULL
Then it groups over ID values, finding the sum of
the Fval values, the number of those values that are
not blank (count(Fval)), and the largest and smallest
of the non-blank values. It gets the average you
need by adding the non-blank values, subtracting the
largest and smallest, and dividing by two less than
the number of non-blank values, all of this for each ID.
To understand it better, evaluate these queries separately
(there may be typos, but the idea is to look at it step by
step)
-- 1
select * from (
select 1 union all select 2 union all
select 3 union all select 4 union all select 5
) as F(ColNum)
-- 2
select * from (
select
ID,
ColNum,
case ColNum
when 1 then F1
when 2 then F2
when 3 then F3
when 4 then F4
when 5 then F5
end*1.0
from yourTable
cross join (
select 1 union all select 2 union all
select 3 union all select 4 union all select 5
) as F(ColNum)
) as T(ID,ColNum,Fval)
order by ID, ColNum, Fval
-- 3
select
ID
COUNT(Fval) as count_fval,
SUM(Fval) as sum_fval,
MAX(Fval) as max_fval,
MIN(Fval) as min_fval
from (
select
ID,
ColNum,
case ColNum
when 1 then F1
when 2 then F2
when 3 then F3
when 4 then F4
when 5 then F5
end*1.0
from yourTable
cross join (
select 1 union all select 2 union all
select 3 union all select 4 union all select 5
) as F(ColNum)
) as T(ID,ColNum,Fval)
group by ID
-SK
KT wrote:

>This one worked perfectly...could you elaborate on what it is actually
>doing.
>Thanks Steve.
>"Steve Kass" <skass@.drew.edu> wrote in message
>news:Oz3EseCPGHA.2704@.TK2MSFTNGP15.phx.gbl...
>
>the
>
>and
>
>
>

Saturday, February 25, 2012

Complex Query help needed fast!

No comments:

Post a Comment

Sql Comparing two tables

Blog Archive

About Me