Thursday, March 8, 2012

complex(?) query

Hello experts.

I'm a novice sql writer and need some help in writing a query to
extract applicable data from the following table (titled EMPLOYEE):

--
ID_NUMBERCODEDATE
------ --- ---
12 VO20060914
12 XD20060913
12 AD20060912
12 WR20060911
12 AT20060910
45 VO20060914
45 XR20060913
45 AT20060912
45 AD20060911
45 AT20060910
78 AD20060914
78 AT20060913
78 VO20060912
78 AD20060911
78 AT20060910

I need to select ID_NUMBER
from EMPLOYEE
where CODE = 'VO'

caveat: I only want the ID_NUMBER(s) where the CODE = 'VO'
and the previous CODE (by DATE) = 'AD'
or the previous CODE (by DATE) = 'AD' with any CODE in between
except 'AT';

E.g., in the above example, the appropriate code should select
ID_NUMBER(s) 12 and 78 because
1. a VO code exists
2. an AD code (by DATE) precedes it
3. although 'AD' does not come immediately before 'VO' (in the
case of ID_NUMBER 12) 'AT' cannot be found in between

I hope I haven't confused anyone. Any help would be appreciated.Alex,

You seem to be almost there and you just have to put it all together to
come up with the answer.
To help you along, let's talk syntax and "NOT EXISTS"

1. Are you familiar with the syntax of a join that allows you to
select a subset of data before using it in the join.
As this applies to your case:

SELECT col1, col2, col3
FROM ( SELECT col1, col2, col3
FROM codes
WHERE code = '?' ) <tbl_alias>
INNER JOIN (SELECT col1, col2, col3
FROM codes
WHERE code = '?' ) <tbl_alias2>
ON tbl_alias1.col? = tbl_alias2.col?
< Add more join conditions >

2. NOT EXISTS is useful in your example. What do you want to NOT
EXIST?

Quote:

Originally Posted by

3. although 'AD' does not come immediately before 'VO'


(in the

Quote:

Originally Posted by

case of ID_NUMBER 12) 'AT' cannot be found in between


AND NOT EXISTS ( SELECT 1 FROM codes WHERE ? )
Remember in the N/E sub_select you will be comparing values in a
correlated sub_query.

I'll check back and post the answer if you are still having difficulty
but only after you post some sql that you have been trying to make
work.

alex wrote:

Quote:

Originally Posted by

Hello experts.
>
I'm a novice sql writer and need some help in writing a query to
extract applicable data from the following table (titled EMPLOYEE):
>
--
ID_NUMBERCODEDATE
------ --- ---
12 VO20060914
12 XD20060913
12 AD20060912
12 WR20060911
12 AT20060910
45 VO20060914
45 XR20060913
45 AT20060912
45 AD20060911
45 AT20060910
78 AD20060914
78 AT20060913
78 VO20060912
78 AD20060911
78 AT20060910
>
I need to select ID_NUMBER
from EMPLOYEE
where CODE = 'VO'
>
caveat: I only want the ID_NUMBER(s) where the CODE = 'VO'
and the previous CODE (by DATE) = 'AD'
or the previous CODE (by DATE) = 'AD' with any CODE in between
except 'AT';
>
E.g., in the above example, the appropriate code should select
ID_NUMBER(s) 12 and 78 because
1. a VO code exists
2. an AD code (by DATE) precedes it
3. although 'AD' does not come immediately before 'VO' (in the
case of ID_NUMBER 12) 'AT' cannot be found in between
>
I hope I haven't confused anyone. Any help would be appreciated.

|||Alex,

Let me update my previous post.
The subselects within the join are not necessary.
It is the NOT EXISTS that you should be interested in.
You will still have to join the table with itself to filter out the
rows for VO and AD within two separate aliased tables.

alex wrote:

Quote:

Originally Posted by

Hello experts.
>
I'm a novice sql writer and need some help in writing a query to
extract applicable data from the following table (titled EMPLOYEE):
>
--
ID_NUMBERCODEDATE
------ --- ---
12 VO20060914
12 XD20060913
12 AD20060912
12 WR20060911
12 AT20060910
45 VO20060914
45 XR20060913
45 AT20060912
45 AD20060911
45 AT20060910
78 AD20060914
78 AT20060913
78 VO20060912
78 AD20060911
78 AT20060910
>
I need to select ID_NUMBER
from EMPLOYEE
where CODE = 'VO'
>
caveat: I only want the ID_NUMBER(s) where the CODE = 'VO'
and the previous CODE (by DATE) = 'AD'
or the previous CODE (by DATE) = 'AD' with any CODE in between
except 'AT';
>
E.g., in the above example, the appropriate code should select
ID_NUMBER(s) 12 and 78 because
1. a VO code exists
2. an AD code (by DATE) precedes it
3. although 'AD' does not come immediately before 'VO' (in the
case of ID_NUMBER 12) 'AT' cannot be found in between
>
I hope I haven't confused anyone. Any help would be appreciated.

|||>>I'm a novice sql writer and need some help in writing a query to extract applicable data from the following table (titled EMPLOYEE): <<

First, let's clean up your missing DDL. The table name should tell us
what set of entities is modeled in the table; do you really have one
employee? Small firm! Try Personnel -- the collective name of the set
or something that tells us what the set is. Code is too vague --
postal code? Date is both too vague *and* a reserved word. A name
like "id_number" is also uselessly general; emp_id would be a better
choice. Since you did not post DDL, we have to guess at constaints and
keys. A skeleton of what you need is something like this:

CREATE TABLE PersonnelActions
(emp_id INTEGER NOT NULL,
action_date action_dateTIME DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (emp_id, foobar_date),
action_code CHAR(2) NOT NULL
CHECK (foobar_code IN ('VO', 'XD'))
);

You need to read a book on data modeling and ISO-11179 rules for names.
I would also look up the use of UPPERCASE for names -- it is the worst
way to code, being about 8-12% harder to detect misspellings. That is
why books and newspapers use lowercase.

Quote:

Originally Posted by

Quote:

Originally Posted by

>I only want the emp_id(s) where the action_code = 'VO'


and the previous action_code (by action_date) = 'AD'
or the previous action_code (by action_date) = 'AD' with any
action_code in between
except 'AT'; <<

SELECT DISTINCT emp_id
FROM PersonnelAction AS PVO,
PersonnelAction AS PAD
WHERE PVO.emp_id = PAD.emp_id
AND PVO.action_code = 'VO'
AND PAD.action_code = 'AD'
AND PAD.action_date < PVO.action_date
AND NOT EXISTS
(SELECT *
FROM PersonnelAction AS PAT
WHERE PAT.action_code = 'AT'
AND PAT.emp_id = PVO.emp_id
AND PAT_action_date BETWEEN PAD.action_date AND
PVO.action_date);|||>>I'm a novice sql writer and need some help in writing a query to extract applicable data from the following table (titled EMPLOYEE): <<

First, let's clean up your missing DDL. The table name should tell us
what set of entities is modeled in the table; do you really have one
employee? Small firm! Try Personnel -- the collective name of the set
or something that tells us what the set is. Code is too vague --
postal code? Date is both too vague *and* a reserved word. A name
like "id_number" is also uselessly general; emp_id would be a better
choice. Since you did not post DDL, we have to guess at constaints and
keys. A skeleton of what you need is something like this:

CREATE TABLE PersonnelActions
(emp_id INTEGER NOT NULL,
action_date action_dateTIME DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (emp_id, foobar_date),
action_code CHAR(2) NOT NULL
CHECK (foobar_code IN ('VO', 'XD'))
);

You need to read a book on data modeling and ISO-11179 rules for names.
I would also look up the use of UPPERCASE for names -- it is the worst
way to code, being about 8-12% harder to detect misspellings. That is
why books and newspapers use lowercase.

Quote:

Originally Posted by

Quote:

Originally Posted by

>I only want the emp_id(s) where the action_code = 'VO'


and the previous action_code (by action_date) = 'AD'
or the previous action_code (by action_date) = 'AD' with any
action_code in between
except 'AT'; <<

SELECT DISTINCT emp_id
FROM PersonnelAction AS PVO,
PersonnelAction AS PAD
WHERE PVO.emp_id = PAD.emp_id
AND PVO.action_code = 'VO'
AND PAD.action_code = 'AD'
AND PAD.action_date < PVO.action_date
AND NOT EXISTS
(SELECT *
FROM PersonnelAction AS PAT
WHERE PAT.action_code = 'AT'
AND PAT.emp_id = PVO.emp_id
AND PAT_action_date BETWEEN PAD.action_date AND
PVO.action_date);|||Alex,

It looks like CELKO gave you the answer along with a lot more
"advanced" advice. I hope you appreciate that I was trying to give you
a nudge to let you figure it out.

If you are still having trouble, post your SQL and we can make any
corrections.

Bill

alex wrote:

Quote:

Originally Posted by

Hello experts.
>
I'm a novice sql writer and need some help in writing a query to
extract applicable data from the following table (titled EMPLOYEE):
>
--
ID_NUMBERCODEDATE
------ --- ---
12 VO20060914
12 XD20060913
12 AD20060912
12 WR20060911
12 AT20060910
45 VO20060914
45 XR20060913
45 AT20060912
45 AD20060911
45 AT20060910
78 AD20060914
78 AT20060913
78 VO20060912
78 AD20060911
78 AT20060910
>
I need to select ID_NUMBER
from EMPLOYEE
where CODE = 'VO'
>
caveat: I only want the ID_NUMBER(s) where the CODE = 'VO'
and the previous CODE (by DATE) = 'AD'
or the previous CODE (by DATE) = 'AD' with any CODE in between
except 'AT';
>
E.g., in the above example, the appropriate code should select
ID_NUMBER(s) 12 and 78 because
1. a VO code exists
2. an AD code (by DATE) precedes it
3. although 'AD' does not come immediately before 'VO' (in the
case of ID_NUMBER 12) 'AT' cannot be found in between
>
I hope I haven't confused anyone. Any help would be appreciated.

|||--CELKO-- wrote:

Quote:

Originally Posted by

First, let's clean up your missing DDL. The table name should tell us
what set of entities is modeled in the table; do you really have one
employee? Small firm! Try Personnel -- the collective name of the set
or something that tells us what the set is.


What about "Employees"? But this is mostly grammatical pedantry; any
reasonable person will understand the implicit plural.

Quote:

Originally Posted by

Since you did not post DDL, we have to guess at constaints and
keys. A skeleton of what you need is something like this:
>
CREATE TABLE PersonnelActions
(emp_id INTEGER NOT NULL,
action_date action_dateTIME DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (emp_id, foobar_date),
action_code CHAR(2) NOT NULL
CHECK (foobar_code IN ('VO', 'XD'))
);


"something like", indeed. That penultimate line should be more like
CHECK (action_code IN ('VO', 'XD', 'AD', 'WR', 'AT', 'XR'))
or should refer to an ActionCodes table, or should simply be omitted.

Quote:

Originally Posted by

SELECT DISTINCT emp_id


Need to specify whether it comes from PVO or PAD.

Quote:

Originally Posted by

FROM PersonnelAction AS PVO,
PersonnelAction AS PAD
WHERE PVO.emp_id = PAD.emp_id
AND PVO.action_code = 'VO'
AND PAD.action_code = 'AD'
AND PAD.action_date < PVO.action_date


The following is clearer IMO:

FROM PersonnelAction AS PEarlier
JOIN PersonnelAction AS PLater ON PEarlier.emp_id = PLater.emp_id
AND PEarlier.action_date < PLater.action_date
WHERE PEarlier.action_code = 'AD'
AND PLater.action_code = 'VO'

Quote:

Originally Posted by

AND NOT EXISTS
(SELECT *
FROM PersonnelAction AS PAT
WHERE PAT.action_code = 'AT'
AND PAT.emp_id = PVO.emp_id
AND PAT_action_date BETWEEN PAD.action_date AND
PVO.action_date);

No comments:

Post a Comment