Sql Comparing two tables: Complicated Update query based on existing data

OK folks, may have a tough one or perhaps just not thinking it through
well. I need to create a SQL query or queries that updates two columns
based on some business rules. Here's an example of the data:
GroupID complete_num first_num second_num InUse biggest
965423 1.0 1 0
965423 2.0 2 0
965423 3.0 3 0 X
965423 3.1 3 1
965423 3.2 3 2 X
324554 1.0 1 0
324554 2.0 2 0 X X
123456 0.1 0 1
123456 0.2 0 2 X X
Hopefully the above even vaguely lines up for you. The last two
columns are currently blank. The representation above is how I would
like them to look after the queries run. So for the above data, you
have a group id that links all records for one set together. I need to
have the "InUse" box updated with a value of "X" for the highest number
that has zero in the second_num column for a group. I also need the
biggest column set to "X" for the largest number in a particular group,
so 3.1 is larger than 3.0. Keep in mind, I have separated the
complete_num into two columns as there could be a "decimal" value of
"10" which is higher than "1". They are not the same as complete_num
is not really a decimal numeric representation. It's used
programatically for other things. Seperating into two separate columns
allows better sorting of data as complete_num is varchar and first and
second num are integer. You will also see the case, 123456,where there
is no zero in the second_num column. In this case, the highest
second_num value will have both set to "X". Any help would be
appreciated and SQL queries are prefered over any procedures/functions
as this is a one time query I need to run against the database. If you
need further clarification or more examples, please let me know.
Thanks.
JRAlso note that there is of course a unique incremental column in the
table. For arguements sake we can say U_ID. Also these records could
be in any order by default in the database. Not necessarily grouped
together as shown above.|||JR (jriker1@.yahoo.com) writes:
> OK folks, may have a tough one or perhaps just not thinking it through
> well. I need to create a SQL query or queries that updates two columns
> based on some business rules. Here's an example of the data:
> GroupID complete_num first_num second_num InUse biggest
> 965423 1.0 1 0
> 965423 2.0 2 0
> 965423 3.0 3 0 X
> 965423 3.1 3 1
> 965423 3.2 3 2 X
> 324554 1.0 1 0
> 324554 2.0 2 0 X X
> 123456 0.1 0 1
> 123456 0.2 0 2 X X
> Hopefully the above even vaguely lines up for you. The last two
> columns are currently blank. The representation above is how I would
> like them to look after the queries run. So for the above data, you
> have a group id that links all records for one set together. I need to
> have the "InUse" box updated with a value of "X" for the highest number
> that has zero in the second_num column for a group. I also need the
> biggest column set to "X" for the largest number in a particular group,
> so 3.1 is larger than 3.0.
It is always a good idea for this sort of question to include CREATE
TABLE statements for the table, and the sample data as INSERT statements.
That makes it easy to copy-and-paste into a query tool, to develop a
tested solution.
Thus, this is an untested solution:
BEGIN TRANSACTION
UPDATE tbl
SET biggest = 0,
inuse = 0
UPDATE tbl
SET biggest = 1
FROM tbl a
WHERE EXISTS (SELECT *
FROM (SELECT GroupID,
biggest = MAX(100000 * first_num + second_num)
FROM tbl
GROUP BY GroupID) AS big
WHERE a.big = big.GroupID
AND a.first_num * 1000000 + a.second_num = big.biggest)
UPDATE tbl
SET inuse = 1
FROM tbl a
JOIN (SELECT GroupID, first_num = MAX(first_num)
FROM tbl
WHERE second_num = 0
GROUP BY GroupID) AS inuse ON a.GroupID = inuse.GroupID
AND a.first_num = inuse.first_num
AND a.second_num = 0
UPDATE tbl
SET inuse = 1
FROM tbl a
WHERE a.biggest = 1
AND NOT EXISTS (SELECT *
FROM tbl b
WHERE a.GroupID = b.GroupID
AND a.insue = 1)
COMMIT TRANSACTION
--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx|||Thanks Erland. SQL Query Analyser is complaining about a syntax
problem with the '=' on "biggest = MAX(100000 * first_num +
second_num)" in the second update and "JOIN (SELECT GroupID,
first_num = MAX(first_num)" in the ghird.|||JR,
I think you can say this more simply: Update InUse to X
for the largest second_num value in the group *counting 0
as the largest*, and the largest first_num value in the case of
a tie. Update biggest to X for the largest first_num value
in the group, and the largest second_num value in the case
of a tie.
There is a more compact solution than Erland's in SQL Server 2005:
with T2 as (
select
*,
rank() over (partition by GroupID order by first_num desc,
case when second_num = 0 then 2147364827 else second_num end desc)
as rk1,
rank() over (partition by GroupID order by second_num desc,
first_num desc) as rk2
from #T
)
update T2 set
InUse = case when rk1 = 1 then 'X' else InUse end,
biggest = case when rk2 = 1 then 'X' else biggest end
where rk1 = 1 or rk2 = 1
Steve Kass
Drew University
JR wrote:

>OK folks, may have a tough one or perhaps just not thinking it through
>well. I need to create a SQL query or queries that updates two columns
>based on some business rules. Here's an example of the data:
>GroupID complete_num first_num second_num InUse biggest
>965423 1.0 1 0
>965423 2.0 2 0
>965423 3.0 3 0 X
>965423 3.1 3 1
>965423 3.2 3 2 X
>324554 1.0 1 0
>324554 2.0 2 0 X X
>123456 0.1 0 1
>123456 0.2 0 2 X X
>
>Hopefully the above even vaguely lines up for you. The last two
>columns are currently blank. The representation above is how I would
>like them to look after the queries run. So for the above data, you
>have a group id that links all records for one set together. I need to
>have the "InUse" box updated with a value of "X" for the highest number
>that has zero in the second_num column for a group. I also need the
>biggest column set to "X" for the largest number in a particular group,
>so 3.1 is larger than 3.0. Keep in mind, I have separated the
>complete_num into two columns as there could be a "decimal" value of
>"10" which is higher than "1". They are not the same as complete_num
>is not really a decimal numeric representation. It's used
>programatically for other things. Seperating into two separate columns
>allows better sorting of data as complete_num is varchar and first and
>second num are integer. You will also see the case, 123456,where there
>is no zero in the second_num column. In this case, the highest
>second_num value will have both set to "X". Any help would be
>appreciated and SQL queries are prefered over any procedures/functions
>as this is a one time query I need to run against the database. If you
>need further clarification or more examples, please let me know.
>Thanks.
>JR
>
>|||On 9 Apr 2006 07:36:13 -0700, JR wrote:

>OK folks, may have a tough one or perhaps just not thinking it through
>well. I need to create a SQL query or queries that updates two columns
>based on some business rules. Here's an example of the data:
(snip)
Hi JR,
If these columns should be calculated based on the data, why store them
at all? I would only recommend that if updates are very infrequent,
queries of the data are very frequent and the table is large, or if
these columns need to capture a moment in time and stay unchanged after
that, even if the underlying data does change.
In one of the latter cases, use Erland's suggestion. Otherwise, drop the
columns from the table and set up a view instead:
CREATE VIEW ChooseGoodName
AS
SELECT a.GroupID, a.complete_num, a.first_num, a.second_num,
CASE WHEN a.first_num = b.first_num
AND a.second_num = 0
THEN 'X' -- Latest X.0
WHEN a.first_num = 0
AND a.second_num = b.second_num
THEN 'X' -- Latest 0.Y
ELSE ''
END AS InUse,
CASE WHEN b.first_num = a.first_num
AND b.second_num = a.second_num
THEN 'X'
ELSE ''
END AS biggest
FROM YourTable AS a
INNER JOIN (SELECT GroupID, complete_num, first_num, second_num
FROM YourTable AS c
WHERE NOT EXISTS (SELECT *
FROM YourTable AS d
WHERE d.GroupID = c.GroupID
AND ( d.first_num > c.first_num
OR ( d.first_num = c.first_num
AND d.second_num > c.second_num))
) ) AS b
ON b.GroupID = a.GroupID
(Untested - see www.aspfaq.com/5006 if you prefer a tested reply)
Hugo Kornelis, SQL Server MVP|||JR (jriker1@.yahoo.com) writes:
> Thanks Erland. SQL Query Analyser is complaining about a syntax
> problem with the '=' on "biggest = MAX(100000 * first_num +
> second_num)" in the second update and "JOIN (SELECT GroupID,
> first_num = MAX(first_num)" in the ghird.
Yes, as I said the code was untested for reasons I explained. I assume
that you are able to weed out trivial syntax errors on your own. If not,
please include CREATE TABLE statements and the sample data in INSERT
statements, to make testing easy.
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx|||As requested, here is a table creation script and some minimal data to
limit the length of the message. Also Hugo's view is great however
since this data is for one time use, would imagine updates to the
existing data would be preferable over introducing a new view into the
mix. If not updating the data directly based on the view would now be
a simple matter when you introduce the Id from the original table into
the view.
CREATE TABLE [abcd].[dbo].[TBL1
(Id,GroupId,complete_num,first_num,secon
d_num)] (
[Id] int NOT NULL,
[GroupID] nvarchar (60) NULL,
[complete_num] varchar (255) NULL,
[first_num] integer,
[second_num] integer,
[InUse] varchar (2) NULL,
[biggest] varchar (2) NULL,
)
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(91510,ABC1235,2.0,2,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(89377,ABC1235,2.1,2,1);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(89371,ABC1235,2.2,2,2);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(1310,M123456,1.0,1,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(1309,M123456,2.0,2,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(1311,M123456,3.0,3,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(1312,M123456,4.0,4,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(1315,M123456,5.0,5,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(1318,M123456,6.0,6,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(1319,M123456,7.0,7,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(1317,M123456,8.0,8,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(5342,M123456,9.0,9,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(5346,M123456,10.0,10,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(5756,M123456,11.0,11,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(6315,M123456,12.0,12,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(6604,M123456,13.0,13,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(6920,M123456,14.0,14,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(1002,M123456,15.0,15,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES
(4023,ARDFO32,0.1,0,1);|||Your programs will be total nightmares and crap until you learn how to
design a schema.
Look at the DDL; do you really have a NCHAR(60) grpoup identifier? In
Chinese' Wel;l;, since you allowed it, you will get one! Why do you
have redundant split attributes (i.e. first_num || second_num =
complete_num)? Let's spit on normalization!! I also love the clear,
meaningful names of the data elements. Tell us what a thing is, not
its sequential order inside another column. Logical not physical
descriptions.
There are no keys, no constraints. This is not a table at all! And
you invented your own syntax for CREATE TABLE.
Your idea of updatind computed columns is a way to mimic punch cards.
Back in the 1950-60's we had to store those things in the physical
card, like you are doing now.
Making a guess, if you normalized your schema, had a key and followed
the baisc data modeling rules, would this nightmare look more like
this? Better names that show subordination (well, Foobar is a dummy
name, but that is all you gave us)
CREATE TABLE Foobar
(group_id CHAR(6) NOT NULL
CHECK (group_id LIKE '[0-9][0-9][0-9][0-9][0-9][0-9]')
section_nbr INTEGER DEFAULT 1 NOT NULL
CHECK (section_num >= 0),
subsection_nbr INTEGER DEFAULT 0 NOT NULL
CHECK (subsection_num >= 0),
PRIMARY KEY (group_id, section_nbr, subsection_nbr));
Do you need a constaint to assure that the subsections are in sequence?
Is there a check digit rule in the group_id? 90% of the work in RDBMS
is done in the DDL!!
SQL does not have links; it has REFERENCES and grouping. Totally
different concepts, based on sets and not pre-RDBMS file and pointer
systems. Rows are nothing whatsoever like records.
Now, the answer to your question is a VIEW, not a "punch cards and bit
flags" solution via updates.
CREATE VIEW InUseFoobar (group_id, section_nbr, subsection_nbr)
AS
SELECT group_id, MAX(section_nbr), 0
FROM Foobar
WHERE subsection = 0
GROUP BY group_id) ;
CREATE VIEW MaxFoobar (group_id, section_nbr, subsection_nbr)
AS
SELECT group_id, section_nbr, MAX(subsection_nbr)
FROM Foobar AS F1, InUseFoobar AS U1
WHERE F1.group_id = .U1.group_id
AND F1.section_nbr = .U1.section_nbr
GROUP BY group_id, section_nbr ;
<<untested>>|||JR (jriker1@.yahoo.com) writes:
> As requested, here is a table creation script and some minimal data to
> limit the length of the message. Also Hugo's view is great however
> since this data is for one time use, would imagine updates to the
> existing data would be preferable over introducing a new view into the
> mix. If not updating the data directly based on the view would now be
> a simple matter when you introduce the Id from the original table into
> the view.
Below is a tested version of my script. I'm not sure that I understand
the syntax errors you mentioned; I did not get these. I include
your original script, with small changes. Note that I've made inuse
and biggest into bit columns; I did this as I used 0 and 1 in my script.
Beware that things get wrapped in news transport!
CREATE TABLE [dbo].[TBL1]
(
[Id] int NOT NULL,
[GroupId] nvarchar (60) NULL,
[complete_num] varchar (255) NULL,
[first_num] integer,
[second_num] integer,
[inuse] bit NULL,
[biggest] bit NULL,
)
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (9151
0,'ABC1235',2.0,2,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (8937
7,'ABC1235',2.1,2,1);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (8937
1,'ABC1235',2.2,2,2);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (1310
,'M123456',1.0,1,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (1309
,'M123456',2.0,2,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (1311
,'M123456',3.0,3,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (1312
,'M123456',4.0,4,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (1315
,'M123456',5.0,5,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (1318
,'M123456',6.0,6,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (1319
,'M123456',7.0,7,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (1317
,'M123456',8.0,8,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (5342
,'M123456',9.0,9,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (5346
,'M123456',10.0,10,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (5756
,'M123456',11.0,11,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (6315
,'M123456',12.0,12,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (6604
,'M123456',13.0,13,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (6920
,'M123456',14.0,14,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (1002
,'M123456',15.0,15,0);
INSERT INTO TBL1 (Id,GroupId,complete_num,first_num,secon
d_num) VALUES (4023
,'ARDFO32',0.1,0,1);
go
BEGIN TRANSACTION
UPDATE TBL1
SET biggest = 0,
inuse = 0
UPDATE TBL1
SET biggest = 1
FROM TBL1 a
WHERE EXISTS (SELECT *
FROM (SELECT GroupId,
biggest = MAX(100000 * first_num + second_num)
FROM TBL1
GROUP BY GroupId) AS big
WHERE a.GroupId = big.GroupId
AND a.first_num * 100000 + a.second_num = big.biggest)
UPDATE TBL1
SET inuse = 1
FROM TBL1 a
JOIN (SELECT GroupId, first_num = MAX(first_num)
FROM TBL1
WHERE second_num = 0
GROUP BY GroupId) AS inuse ON a.GroupId = inuse.GroupId
AND a.first_num = inuse.first_num
AND a.second_num = 0
UPDATE TBL1
SET inuse = 1
FROM TBL1 a
WHERE a.biggest = 1
AND NOT EXISTS (SELECT *
FROM TBL1 b
WHERE a.GroupId = b.GroupId
AND b.inuse = 1)
COMMIT TRANSACTION
SELECT * FROM TBL1
go
drop table TBL1
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx

Sunday, March 11, 2012

Complicated Update query based on existing data

No comments:

Post a Comment

Sql Comparing two tables

Blog Archive

About Me