Sql Comparing two tables: complicated update (for me)

hello everyone,
just having trouble with an update statement. so i started off with a
table like so:
create table foo (
id uniqueidentifier not null primary key nonclustered,
name varchar(50) not null,
type int not null,
datecreated datetime not null default getdate(),
constraint foo1 unique (name)
)
go
then i added a new column, like so:
alter table foo add column rank int null
go
and here's where the tricky update comes in. my goal is to populate
this new "rank" column with incremental values for each distinct "type"
value.
for example, i the original table looks like so:
select name, type from foo
go
name type
-- --
john 1
jane 1
jim 1
jake 2
jeff 2
kyle 2
keli 2
kim 2
i would want the update statement to populate the table so that AFTER
the update, it looks like so:
select name, type, rank from foo
name type rank
-- -- --
john 1 1
jane 1 2
jim 1 3
jake 2 1
jeff 2 2
kyle 2 3
keli 2 4
kim 2 5
it's important to note that the original "rank" really doesn't matter,
it is just something that needs to be kept track of moving forward, so
i don't even particularly have to rank them alphabetically, or any
other way. just need an update statement that can put in incremental
values, but that sort of "resets" the incrementation for each changing
value of another field.
if that's possible.
thanks in advance for any help!
jasonYou have a few problems with this. First, a uniqueidentifier cannot be
a relational key, so this is not a table by definition. Where does it
occur in the reality of the data model? Next, Rows are not records;
fields are not columns; tables are not files. Totally differerent
concepts. You are building a sequential file in SQL
DEFAULT comes after the data type in STANDARD SQL, and you can use
CURRENT_TIMESTAMP instead of the proprietary getdate(). But you should
not put audit trail information in the table (ask your accountant about
proper procedures).
There is no sequential access or ordering in an RDBMS, so "first",
"next" and "last" are totally meaningless. So what are you using to
assign the rank values? The data model should have some rule so you
can validate the data.
Granted this is a sample table, but all of the column names are
incomplete (type of what? Name of what? Etc.) Your sample design
should look more like this:
CREATE TABLE NewFoo
(foo_name VARCHAR(50) NOT NULL PRIMARY KEY,
foo_type INTEGER NOT NULL,
bar_rank INTEGER NOT NULL,
UNIQUE (foo_type, bar_rank)); -- is this the PK?
Now, let's kill the old one and get things into an RDBMS:
INSERT INTO NewFoo (foo_name, foo_type, bar_rank)
SELECT name, type,
(SELECT COUNT(F1.*)
FROM Foo AS F1
WHERE F1.type = Foo.type
AND F1.name = Foo.name)
FROM Foo;
Then drop Foo and rename NewFoo.|||> First, a uniqueidentifier cannot be a relational key, so this
> is not a table by definition. Where does it occur in the
> reality of the data model?
i call bs on this one, CELKO. a uniqueidentifier is a perfectly valid
artificial key for any record. what makes you think it can't be a
relational key?

> But you should not put audit trail information in the table
i call bs on this one too, i'm afraid. row-level auditing is a
perfectly valid practice for a number of reasons.
as for the rest, yeah, i was lazy, i didn't actually try the create
table statement before posting :) and i'll try to be more descriptive
in future examples.
as for the rank, it's definitely not an attempt to make the table a
sequential file, so you guessed wrong there i'm afraid. i use clustered
indexes if order is important for something, which has specific,
uncommon purposes in my opinion.
the rank is a meaningful value that a subscriber application needs to
know. you can think of it like your standing in a contest, you're
either in first, second, third, etc. place. the rank can't actually be
determined by any values in the database, it is determined by the
providers of the data, and we just store the rank. it's meaningful, not
physically sequential, etc.
and lastly, regarding the insert statement: THANK YOU. exactly what i
was looking for. i look forward to trying it out when i get back to the
office tomorrow.
thankful as always,
jason|||>> call bs on this one, CELKO. a uniqueidentifier is a perfectly valid artificial
key for any record [sic]. what makes you think it can't be a relational key? <<
The very definition of a relationall key, the stuff Dr. Codd wrote and
basic data model concepts. A relational key is subset of the
attributes of an entity that is unique. A uniqueidentifier is derived
from physical storage and has nothing whatsoever to do with the entity.
Newbies who do not know that a row (logical construct) is nothing like
a record (physical storage) constantly make this mistake and build file
systems in SQL.
An artificial key has to have validation and verification rules, and a
uniqueidentifier does not.
That is fine, but putting the audit trail into the table that is being
audited is not a proper accounting practice. The changes need to be
caught outside of the table. Talk to the accounting department or the
SOX guy for your company. This is like letting developers do their own
QA.|||assuming (name, type) is unique
update foo set rank = (select count(*) from foo f1 where f1.type =
foo.type and f1.name<foo.name)+1
if (name, type) is NOT unique, it it still doable but more complex|||> A uniqueidentifier is derived from physical storage
well, I guess whoever says this at a job interview is less likely to
get hired ;)|||> A relational key is subset of the
> attributes of an entity that is unique.
actually, if i'm reading you correctly, that's a called NATURAL key.
relational keys do not have the necessary condition of being natural
elements of an entity. the only necessary condition of a relational key
is that it be a column or columns whose values are gauranteed to be
unique across all occurrences in a given table. that's it.
by this definition, a relational key can be natural OR artificial. what
you're describing is a natural relational key, and good for you, that's
totally fine. and so are artificial relational keys.

> An artificial key has to have validation and verification rules, and a
> uniqueidentifier does not.
the only "validation and verification" rule required to act as a
relational key is that it be UNIQUE. and uniqueidentifiers, when
properly used, are certainly that.
i presume that you would have just as many objections about using an
identity integer as a relational key? if that's true, then you're
grossly misrepresenting your argument. you're not arguing the
definition of a relational key, you're arguing the validity of
artificial versus natural keys AS relational keys. totally different
argument.

> That is fine, but putting the audit trail into the table that is being
> audited is not a proper accounting practice.
i don't see your logic here. what is the difference between attaching
such a column to the entity it is auditing and putting it in another
table, and relating it to the entity it is auditing? the only
difference i can think of is that you could apply different user
permissions to each table. that's fine and well, but there are plenty
of other places to handle security, and other considerations, such as
performance.|||this worked like a charm, thank you very much!
jason|||yeah, i wasn't sure where that was coming from either. aren't they
derived from like a bunch of crazy variables? datetime, cpu serial
number, mac address, your mother's maiden name, the position of the
every valence electron in your body ...|||>> what is the difference between attaching such a column to the entity it i
s auditing and putting it in another table, and relating it to the entity it
is auditing? <<
You do not have to put the audit information in the schema at all. It
can be in an external file system or other RDBMS.
Separation is a basic accounting principle, like double entry
bookkeeping. For example, when I submit an article to a publisher, I
send the editor one copy of the invoice and another copy to Accounts
Payable. The A/P clerk has to match both copies of the invoice before
they issue a check to me.
My editor deletes their copies of my invoice. The A/P clerk now has an
invoice without a mate at the end of the payment cycle, so they know to
start calling people.
The A/P department deletes their copies of my invoice. The editor now
has an invoice which was no paid at the end of the payment cycle, so
they know to call the A/P department.
Both editor and A/P delete their copies of my invoice. Accounting sees
a missing invoice number at the end of the payment cycle, because they
designed an invoice number that can be validated and verified rather
than a meaningless, hardware generated number. Accounting makes life
hell for everyone until they can trace that missing invoice number.
.

Sunday, March 11, 2012

complicated update (for me)

No comments:

Post a Comment

Sql Comparing two tables

Blog Archive

About Me