Saturday, February 25, 2012

Complex Joins

Hi,
I am unsure about the best approach to the following :
I have a database where I have to store about 60000 different Objects, and
each Object has a number of Properties attached to it (the number of
properties attached and the type of properties can vary per object).
Thus far my table structure looks like this :
Table Objects (
Id BigInt,
Descr varchar(250),
...
)
Table Properties (
Id BigInt,
Descr varchar(50)
)
Table PropertyValues (
Id BigInt,
ObjectId BigInt,
PropertyId BigInt,
PropertyValue varchar(50)
)
Now my application lets the user select specific rows from the Objects table
using selections of different combinations of propertyvalues. This leads to
rather complex queries that sometimes contain between 1 to 15 joins of the
same table (PropertyValues).
Something like this:
Select Id, Descr from Object INNER JOIN
PropertyValues A on A.PropertyId = 1 AND A.PropertyValue = 'ABC' INNER JOIN
PropertyValues B on B.PropertyId = 2 AND B.PropertyValue = 'DEF' INNER JOIN
...
PropertyValues K on K.PropertyId = 11 AND K.PropertyValue = 'AnotherValue'
I would like to know if this is the right way to create this functionality
or should I think of another table structure.
Thanks
HermanHerman
I think you designed your database well. That enforce many-many relationship
between objects and property tables.
Now regarding to queries.
I'd start to write the query with follow structure.
Select Id, Descr from Object O INNER JOIN PropertyValues PV
ON O.ObjectId=PV.ObjectId INNER JOIN Properties P
ON PV.PropertyId =P.PropertyId
WHERE PV.PropertyId IN ('DEF' .....) AND
PV.ObjectId IN (1,2....)
Note: I don't know exactly what are your business requirements ,therfore if
you something wrong try modify it for
your needs.
"Herman Lammers" <hlammers_REMOVE_THIS@.allround-ipDOTnl> wrote in message
news:uApSPyhnDHA.1676@.TK2MSFTNGP09.phx.gbl...
> Hi,
> I am unsure about the best approach to the following :
> I have a database where I have to store about 60000 different Objects, and
> each Object has a number of Properties attached to it (the number of
> properties attached and the type of properties can vary per object).
> Thus far my table structure looks like this :
> Table Objects (
> Id BigInt,
> Descr varchar(250),
> ...
> )
> Table Properties (
> Id BigInt,
> Descr varchar(50)
> )
> Table PropertyValues (
> Id BigInt,
> ObjectId BigInt,
> PropertyId BigInt,
> PropertyValue varchar(50)
> )
> Now my application lets the user select specific rows from the Objects
table
> using selections of different combinations of propertyvalues. This leads
to
> rather complex queries that sometimes contain between 1 to 15 joins of the
> same table (PropertyValues).
> Something like this:
> Select Id, Descr from Object INNER JOIN
> PropertyValues A on A.PropertyId = 1 AND A.PropertyValue = 'ABC' INNER
JOIN
> PropertyValues B on B.PropertyId = 2 AND B.PropertyValue = 'DEF' INNER
JOIN
> ...
> PropertyValues K on K.PropertyId = 11 AND K.PropertyValue = 'AnotherValue'
> I would like to know if this is the right way to create this functionality
> or should I think of another table structure.
> Thanks
> Herman
>
>|||Although it is possible to implement a generic solution for any
application with an OO design like this, don't go there. Performance
will be horrendous and it will be difficult to enforce data integrity
because all user data will be stored in the single varchar(50)
PropertyValue column. This is problematic for many reasons.
Note that I'm not saying that one should never implement this type of
design. OO design may be appropriate for infrequently accessed
configuration data, such as meta-data loaded at application startup but
should generally be avoided in a relational database. It is better to
stick with the relational model in a relational database like SQL
Server.
Consider the schema of the Northwind Orders table below and ask yourself
how you would perform the following with your schema:
- ensure OrderID is unique (primary key)
- ensure data is valid for the appropriate data type (e.g. int and
datetime)
- insert an order with a 60 character ShipAddress or unicode data
- enforce referential integrity (e.g. an order must have a customer)
- query data (e.g. print shipping labels)
- delete data for a given order
CREATE TABLE [Orders] (
[OrderID] [int] IDENTITY (1, 1) NOT NULL ,
[CustomerID] [nchar] (5) NULL ,
[EmployeeID] [int] NULL ,
[OrderDate] [datetime] NULL ,
[RequiredDate] [datetime] NULL ,
[ShippedDate] [datetime] NULL ,
[ShipVia] [int] NULL ,
[Freight] [money] NULL CONSTRAINT [DF_Orders_Freight] DEFAULT (0),
[ShipName] [nvarchar] (40) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
,
[ShipAddress] [nvarchar] (60) COLLATE SQL_Latin1_General_CP1_CI_AS
NULL ,
[ShipCity] [nvarchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
,
[ShipRegion] [nvarchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS
NULL ,
[ShipPostalCode] [nvarchar] (10) COLLATE
SQL_Latin1_General_CP1_CI_AS NULL ,
[ShipCountry] [nvarchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS
NULL ,
CONSTRAINT [PK_Orders] PRIMARY KEY CLUSTERED
(
[OrderID]
),
CONSTRAINT [FK_Orders_Customers] FOREIGN KEY
(
[CustomerID]
) REFERENCES [Customers] (
[CustomerID]
),
CONSTRAINT [FK_Orders_Employees] FOREIGN KEY
(
[EmployeeID]
) REFERENCES [Employees] (
[EmployeeID]
),
CONSTRAINT [FK_Orders_Shippers] FOREIGN KEY
(
[ShipVia]
) REFERENCES [Shippers] (
[ShipperID]
)
)
--
Hope this helps.
Dan Guzman
SQL Server MVP
--
SQL FAQ links (courtesy Neil Pike):
http://www.ntfaq.com/Articles/Index.cfm?DepartmentID=800
http://www.sqlserverfaq.com
http://www.mssqlserver.com/faq
--
"Herman Lammers" <hlammers_REMOVE_THIS@.allround-ipDOTnl> wrote in
message news:uApSPyhnDHA.1676@.TK2MSFTNGP09.phx.gbl...
> Hi,
> I am unsure about the best approach to the following :
> I have a database where I have to store about 60000 different Objects,
and
> each Object has a number of Properties attached to it (the number of
> properties attached and the type of properties can vary per object).
> Thus far my table structure looks like this :
> Table Objects (
> Id BigInt,
> Descr varchar(250),
> ...
> )
> Table Properties (
> Id BigInt,
> Descr varchar(50)
> )
> Table PropertyValues (
> Id BigInt,
> ObjectId BigInt,
> PropertyId BigInt,
> PropertyValue varchar(50)
> )
> Now my application lets the user select specific rows from the Objects
table
> using selections of different combinations of propertyvalues. This
leads to
> rather complex queries that sometimes contain between 1 to 15 joins of
the
> same table (PropertyValues).
> Something like this:
> Select Id, Descr from Object INNER JOIN
> PropertyValues A on A.PropertyId = 1 AND A.PropertyValue = 'ABC' INNER
JOIN
> PropertyValues B on B.PropertyId = 2 AND B.PropertyValue = 'DEF' INNER
JOIN
> ...
> PropertyValues K on K.PropertyId = 11 AND K.PropertyValue ='AnotherValue'
> I would like to know if this is the right way to create this
functionality
> or should I think of another table structure.
> Thanks
> Herman
>
>|||Hi Dan,
thanks for your reply. Creating this generic solution was not exactly what I
had in mind.
The model I described is just a (small) part of traditional relational
model.
The reason I came up with this model is that I wanted to give my customer
the possibility of
assigning extra fields (Properties) to (groups of) products (Objects). This
structure is then being used for product selection only.
My customer sells fasteners which can be divided into several product groups
(washers, nuts, bolts)
Apart from product descr, Product price etc. there are numerous (more then I
ever thought) other properties which describe the product, and these depend
on the type of product. (washers have an inner and outer diameter, and a
thickness, bolts have a product length, headtype etc). This variety brought
me to this model.
Any values critical to the rest of the application would be taken out of the
structure, and stored in the product table.
Thanks anyway
Herman
"Dan Guzman" <danguzman@.nospam-earthlink.net> schreef in bericht
news:ePCz1VinDHA.2080@.TK2MSFTNGP10.phx.gbl...
> Although it is possible to implement a generic solution for any
> application with an OO design like this, don't go there. Performance
> will be horrendous and it will be difficult to enforce data integrity
> because all user data will be stored in the single varchar(50)
> PropertyValue column. This is problematic for many reasons.
> Note that I'm not saying that one should never implement this type of
> design. OO design may be appropriate for infrequently accessed
> configuration data, such as meta-data loaded at application startup but
> should generally be avoided in a relational database. It is better to
> stick with the relational model in a relational database like SQL
> Server.
> Consider the schema of the Northwind Orders table below and ask yourself
> how you would perform the following with your schema:
> - ensure OrderID is unique (primary key)
> - ensure data is valid for the appropriate data type (e.g. int and
> datetime)
> - insert an order with a 60 character ShipAddress or unicode data
> - enforce referential integrity (e.g. an order must have a customer)
> - query data (e.g. print shipping labels)
> - delete data for a given order
>
> CREATE TABLE [Orders] (
> [OrderID] [int] IDENTITY (1, 1) NOT NULL ,
> [CustomerID] [nchar] (5) NULL ,
> [EmployeeID] [int] NULL ,
> [OrderDate] [datetime] NULL ,
> [RequiredDate] [datetime] NULL ,
> [ShippedDate] [datetime] NULL ,
> [ShipVia] [int] NULL ,
> [Freight] [money] NULL CONSTRAINT [DF_Orders_Freight] DEFAULT (0),
> [ShipName] [nvarchar] (40) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
> ,
> [ShipAddress] [nvarchar] (60) COLLATE SQL_Latin1_General_CP1_CI_AS
> NULL ,
> [ShipCity] [nvarchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
> ,
> [ShipRegion] [nvarchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS
> NULL ,
> [ShipPostalCode] [nvarchar] (10) COLLATE
> SQL_Latin1_General_CP1_CI_AS NULL ,
> [ShipCountry] [nvarchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS
> NULL ,
> CONSTRAINT [PK_Orders] PRIMARY KEY CLUSTERED
> (
> [OrderID]
> ),
> CONSTRAINT [FK_Orders_Customers] FOREIGN KEY
> (
> [CustomerID]
> ) REFERENCES [Customers] (
> [CustomerID]
> ),
> CONSTRAINT [FK_Orders_Employees] FOREIGN KEY
> (
> [EmployeeID]
> ) REFERENCES [Employees] (
> [EmployeeID]
> ),
> CONSTRAINT [FK_Orders_Shippers] FOREIGN KEY
> (
> [ShipVia]
> ) REFERENCES [Shippers] (
> [ShipperID]
> )
> )
> --
> Hope this helps.
> Dan Guzman
> SQL Server MVP
> --
> SQL FAQ links (courtesy Neil Pike):
> http://www.ntfaq.com/Articles/Index.cfm?DepartmentID=800
> http://www.sqlserverfaq.com
> http://www.mssqlserver.com/faq
> --
> "Herman Lammers" <hlammers_REMOVE_THIS@.allround-ipDOTnl> wrote in
> message news:uApSPyhnDHA.1676@.TK2MSFTNGP09.phx.gbl...
> > Hi,
> >
> > I am unsure about the best approach to the following :
> >
> > I have a database where I have to store about 60000 different Objects,
> and
> > each Object has a number of Properties attached to it (the number of
> > properties attached and the type of properties can vary per object).
> >
> > Thus far my table structure looks like this :
> >
> > Table Objects (
> > Id BigInt,
> > Descr varchar(250),
> > ...
> > )
> >
> > Table Properties (
> > Id BigInt,
> > Descr varchar(50)
> > )
> >
> > Table PropertyValues (
> > Id BigInt,
> > ObjectId BigInt,
> > PropertyId BigInt,
> > PropertyValue varchar(50)
> > )
> >
> > Now my application lets the user select specific rows from the Objects
> table
> > using selections of different combinations of propertyvalues. This
> leads to
> > rather complex queries that sometimes contain between 1 to 15 joins of
> the
> > same table (PropertyValues).
> >
> > Something like this:
> >
> > Select Id, Descr from Object INNER JOIN
> > PropertyValues A on A.PropertyId = 1 AND A.PropertyValue = 'ABC' INNER
> JOIN
> > PropertyValues B on B.PropertyId = 2 AND B.PropertyValue = 'DEF' INNER
> JOIN
> > ...
> > PropertyValues K on K.PropertyId = 11 AND K.PropertyValue => 'AnotherValue'
> >
> > I would like to know if this is the right way to create this
> functionality
> > or should I think of another table structure.
> >
> > Thanks
> >
> > Herman
> >
> >
> >
>|||On Wed, 29 Oct 2003 16:57:44 +0100, "Herman Lammers"
<hlammers_REMOVE_THIS@.allround-ipDOTnl> wrote:
>Hi Dan,
>thanks for your reply. Creating this generic solution was not exactly what I
>had in mind.
>The model I described is just a (small) part of traditional relational
>model.
>The reason I came up with this model is that I wanted to give my customer
>the possibility of
>assigning extra fields (Properties) to (groups of) products (Objects). This
>structure is then being used for product selection only.
>My customer sells fasteners which can be divided into several product groups
>(washers, nuts, bolts)
>Apart from product descr, Product price etc. there are numerous (more then I
>ever thought) other properties which describe the product, and these depend
>on the type of product. (washers have an inner and outer diameter, and a
>thickness, bolts have a product length, headtype etc). This variety brought
>me to this model.
Well, it's the proper theoretical normalization of the table, but it
seems, to me, like you might want groupings... especially because I
think you'll end up having a hundred different records that all say
"1/4 inch diameter"
Maybe a bolt table, a washers table, etc., all with a foreign "item
number" key - foreign, because you could autonumber it on another
table that has two columns, "item number", "item type" - and each
table would have all of the necessary columns to list the necessary
properties. If you need additional properties for select items, then
you can create another linked table.
So, basically, you have
Item Listing:
"ItemNumber", "ItemType"
Washer Properties:
"Item Number" "InnerDiameter", "OuterDiameter", "Material", "Shape",
"Thickness"
I think that you want the "ItemType" to be a direct link to the
relevant table, but I don't know how to do that (or whether it *can*
be done).
This will cut down on the number of joins and make your table
structure a lot more understandable.
(Warning: I'm a DB administrator, not a developer, and your advice is
worth half of what you paid me for it.)

No comments:

Post a Comment