Friday, February 24, 2012

Complex DB Search Forms (Store Proc vs. Complex Where)

I have web forms with about 10-15 optional search parameters (fields) for a give table. Each item (textbox) in the form is treated as an AND condition.

Right now I build complex WHERE clauses based on wheather data is present in a textbox and AND each one in the clause. Also, if a particular field is "match any word", i get a ANDed set of OR's. As you can imagine, the WHERE clause gets quite large.

I build clauses like this (i.e., 4 fields shown):

SELECT * from tableName WHERE (aaa like '%data') AND (bbb = 'data') AND (ccc like 'data%') AND ( (xxx like '%data') OR (yyy like '%data%') )

My question is, are stored procedures better for building such dynamic SQL clauses? I may have one field or all fifteen. I've written generic code for building the clauses, but I don't know much about stored procedures and am wondering if I'm making this more difficult on myself.You can achieve the exact same result either way you do it. It is more object oriented to pass the parameters into the SP and then build your WHERE clause. Are you good at TSQL? If so then why not put them in a SP. If you have never written an SP before then stay with what you know and write the code in your page.

I have done it both ways many many many times, it all depends on the situation. You will have to write the same code in the SP than where it is now, just different syntax.

HTH|||Ok, that makes me feel better. I know it's generally prefered to use SP, but I've never written one. If I did make one, I guess it'd be like a function with 15 parameters?

I'm doing enough learning with ASP.NET, so I'll hold off on SP's for now.

I'll have to find me a good SP book. Performance isn't an issue now, but hopefully it will be!|||You are not really giving up that much performance. My applications generally use very large databases (11 million+ records) with very complex and dynamic Where clauses. If all tables are indexed properly then the performance difference is not very evident.

Use your current constraints as your guide (time, etc.).|||If you are curious how this issue was resolved at this ASP.NET Forum (which we are all posting at now), I am posting below the stored procedure used to search the fourms. You can see that a parameter @.SearchTerms is used to hold the whole where clause. I like this way because it is easier to build dynamically the where clause in c# or vb.net and then make use of the better performance that sp provides. Furthermore this sp below gives you the ability to display only a given number of results (very convenient to use with a datagrid with paging)

CREATE PROCEDURE forums_GetSearchResults
(
@.SearchTerms nvarchar(500),
@.Page int,
@.RecsPerPage int,
@.UserName nvarchar(50)
)
AS
CREATE TABLE #tmp
(
ID int IDENTITY,
PostID int
)
DECLARE @.sql nvarchar(1000)
SET NOCOUNT ON
SELECT @.sql = 'INSERT INTO #tmp(PostID) SELECT PostID ' +
'FROM Posts P (nolock) INNER JOIN Forums F (nolock) ON F.ForumID = P.ForumID ' +
@.SearchTerms + ' ORDER BY ThreadDate DESC'
EXEC(@.sql)
-- ok, all of the rows are inserted into the table.
-- now, select the correct subset
DECLARE @.FirstRec int, @.LastRec int
SELECT @.FirstRec = (@.Page - 1) * @.RecsPerPage
SELECT @.LastRec = (@.Page * @.RecsPerPage + 1)
DECLARE @.MoreRecords int
SELECT @.MoreRecords = COUNT(*) FROM #tmp -- WHERE ID >= @.LastRec

-- Select the data out of the temporary table
IF @.UserName IS NOT NULL
SELECT
T.PostID,
P.ParentID,
P.ThreadID,
P.PostLevel,
P.SortOrder,
P.UserName,
P.Subject,
P.PostDate,
P.ThreadDate,
P.Approved,
P.ForumID,
F.Name As ForumName,
MoreRecords = @.MoreRecords,
Replies = (SELECT COUNT(*) FROM Posts P2 (nolock) WHERE P2.ParentID = P.PostID AND P2.PostLevel != 1),
P.Body,
P.TotalViews,
P.IsLocked,
HasRead = 0 -- not used
FROM
#tmp T
INNER JOIN Posts P (nolock) ON
P.PostID = T.PostID
INNER JOIN Forums F (nolock) ON
F.ForumID = P.ForumID
WHERE
T.ID > @.FirstRec AND ID < @.LastRec AND
(P.ForumID NOT IN (SELECT ForumID from PrivateForums) OR
P.ForumID IN (SELECT ForumID FROM PrivateForums WHERE RoleName IN (SELECT RoleName from UsersInRoles WHERE username = @.UserName)))
ELSE
SELECT
T.PostID,
P.ParentID,
P.ThreadID,
P.PostLevel,
P.SortOrder,
P.UserName,
P.Subject,
P.PostDate,
P.ThreadDate,
P.Approved,
P.ForumID,
F.Name As ForumName,
MoreRecords = @.MoreRecords,
Replies = (SELECT COUNT(*) FROM Posts P2 (nolock) WHERE P2.ParentID = P.PostID AND P2.PostLevel != 1),
P.Body,
P.TotalViews,
P.IsLocked,
HasRead = 0 -- not used
FROM
#tmp T
INNER JOIN Posts P (nolock) ON
P.PostID = T.PostID
INNER JOIN Forums F (nolock) ON
F.ForumID = P.ForumID
WHERE
T.ID > @.FirstRec AND ID < @.LastRec AND
P.ForumID NOT IN (SELECT ForumID from PrivateForums)

SET NOCOUNT OFF|||::You can achieve the exact same result either way you do it.

No, not EXACTLY the same result.

::It is more object oriented to pass the parameters into the SP and then build your WHERE
::clause.

Hmpf. Please dont tell me you believe this.

I mean, frankly, in how far is it more OBJECT ORIENTED to actually write a non-objectoriented METHOD (an SP is nothing more)?

More strucutre, maybe, better practice, maybe, but not more "object oriented".

Some negatives:

* Be sure to have automatic recompile switched on for this SP, as otherwise you just say goodbye to performance for your type of query.

Let see the last post:

::Furthermore this sp below gives you the ability to display only a given number of results
::(very convenient to use with a datagrid with paging)

Yea, righ. It is not exactly like you could not do it right in dynamic SQL either. Frankly, the correct way to ask for only X result is "SELECT TOP X". I somehow miss this on the SP.

* YOu dont need a temp tabl. Use query materialization to use actually use the optimizer. The use of a temp table is about the last way I would have tried to solve this.
* Frankly, the approach of selecting ALL posts into a temp table, just to then only select a subset of this sounds horrible performance wise.
* The "IF" makes not really a lot of sense - the two cases can be put into one pretty easily.|||>>No, not EXACTLY the same result.

Is the end result not a Query ?|||thona,

perhaps you are right about the IF statement. I think this stored procedure together with the ASP.NET Forums are written by a microsoft team (so it says in the agreement that comes with the free download of the forums).

Perhaps there is a better way. could you please give example of "Use query materialization to use actually use the optimizer" because I am not very familiar with this technique?

SELECT TOP ... will only work the first time you select from the database. but how would you handle displaying next result sets?
That is why the proc takes @.page and @.recsperpage parameter in order to determine from which to which result to return. If the user sees 10 results per page and he is requesting the second result set then the proc will calculate @.FirstRec and @.LastRec. But you cannot use these two in a querry against the whole database. you can apply them in a second select only against the relevant results.

If I am wrong, I will be grateful if you help me understand my mistake as efficient searching is very important indeed

No comments:

Post a Comment