Hi everyone,
I have some problems on composite nonclustered indexes. I could not exatly understand their logic.
In my opininon, suppose that we have a table called Order and we create a composite nonclustered index on this table for OrderID column and OrderDate column. So I am using this query;
SELECT * FROM Order WHERE OrderID > 12 ORDER BY OrderDate
So in here, I think our first research is based on OrderID and ten after ordering our data pointer according to the OrderID and then our index is converted to an index which is based on OrderDate while performing ordering. So is this correct ?
Would you please explain this ?
Thanks
You can see all this by highlight the query (in Query Analyzer) and press Ctrl+L.|||
i dont know what exactly it is that you don't understand.
lets start with the clustered and non clustered index. you can only create one clustered index per table. however there can be queries that may run against the clustered index so non clustered index was introduced ...
Clustered Indexes
There can be only one clustered index on a table or view, because the clustered index key physically sorts the table or view. This type of index is particularly efficient for queries, because data records—also known as data pages—are stored in the leaf level of the B-tree. The sort order and storage location of a clustered index is analogous to a dictionary in that the words in a dictionary are sorted alphabetically and definitions appear next to the words.
When you create a primary key constraint in a table that does not contain a clustered index, SQL Server will use the primary key column for the clustered index key. If a clustered index already exists in a table, a nonclustered index is created on the column defined with a primary key constraint. A column defined as the PRIMARY key is a useful index because the column values are guaranteed to be unique. Unique values create smaller B-trees than redundant values and thus make more efficient lookup structures.
Nonclustered Indexes
You can create up to 250 nonclustered indexes (or 249 nonclustered indexes and one clustered index) on a table or view. You must first create a unique clustered index on a view before you can create nonclustered indexes. This restriction does not apply to tables, however. A nonclustered index is analogous to an index in the back of a book. You can use a book's index to locate pages that match an index entry. The database uses a nonclustered index to locate matching records in the database.
If a clustered index does not exist on a table, the table is unsorted and is called a heap. A nonclustered index created on a heap contains pointers to table rows. Each entry in an index page contains a row ID (RID). The RID is a pointer to a table row in a heap, and it consists of a page number, a file number, and a slot number. If a clustered index exists on a table, the index pages of a nonclustered index contain clustered index keys rather than RIDs. An index pointer, whether it is a RID or an index key, is called a bookmark. .
|||
Hi,
First of all, thanks for your replies.
Infact, I ask that what is the difference between using two different nonclustered index for each column and using a composite nonclustered index for the above example(in my first reply)
Thanks
There is huge difference merit.
If you have a composite index on (fCOL1, fCOL2) and a query
select * from T where fCOL2 = @.Value
Then the optimizer has no way to use your index. If you have 2 separate indexes, the optimizer will most likely use the index on fCOL2.
I'd reccomend that you refer to BOL for datailed coverage of indexes.
Hi,
Thanks for your reply but it came to me a little bit confusing.
By saying this, "Then the optimizer has no way to use your index"; I could not understand whay you meant completely. Would you please make it clear ?
And besides, I read the BOL and this is the question that I could not find the answer.
Thanks
Infact, What is the logic of composite nonclustered indexes ?
How can I imagine its algorithm in my mind ?
Thanks
Imagine an encyclopedia. Encyclopedia is sorted by topics, in alphabetic(lexicographic) order. This is the phisical or clustered index. It corresponds volume number and page number to each topic in it. E.g. if you want to search for topic "Database", what should you do? You look at the volumes. Each volume has the names of first and last topics written in hardcover. For example, the 3-rd volume of encylopedia might have topics "Cat" as first and "Easter" as last on the cover. As "Cat"<"Database"<"Easter" in ordinal lexicographical order, you can safely assume that the article "Database" is located in 3-rd volume.
This is analogous to clustered index root level lookup.
What do you do next? You open the volume somewhere in the middle, look at the topic of article. If it is, say, "Dog", then you have to look for "Database" in the first half of the book. If it is "Dao", then you should consider only the second part if the volume. Using this technique you can easily navigate to "Databases" in quite a few steps.
Now, how can we imagine a non-clustered index?
You surely know that each article in encyclopedia has an author. In the encylopedias I have been lucky to look at, each author is pointed at the end of the article.
Now, what if I want to navigate to all topics that author "merit" has written? Well, at the end of most books, there is a list called "index", which shows correspondence between <specific-names> and pages on which that <specific-name> was mentioned. For example, the philosophy book I have just finished, has a whole page of index for "Plato". It's like Plato-2,3, 5,6,10,20... That is, if I want to look for an article by "merit" in encyclopedia I have to just open the end-book index, and look at the pages where "merit" has articles.
To this point, there is a nuance that makes a difference between most of book indexes and non-clustered indexes in database. While indexes in books point to actual pages, database non-clustered index points to clustered index entries. This means that if merit has authored the article "Database" in encylopedia, which is located in page 253 of volume 3, the encyclopedia index contains the entry "merit-volume 3, page 253", while a non-clustered index contains the entry 'merit-Database'. So, navigating in non-clustered index structure is like navigating the authors list of encyclopedia. As soon as you have found the author, you get all the articles he has authored, and, using the clustered index, easily navigate to the desired article.
Now assume that the authors index is organized in format <Name-Surname>. In that case, it will contain entry Andranik Khachatryan-"Indexes pseudo-explained using Plato" ;).
Now you see why you cannot use the index to navigate to authors whose surname is "Khachatryan"? Because the authors index structure is sorted by Firstname, so, if you wish to look for a specific surname, you must look up all entries! In language of databases, this is always much less effective than just scanning the encylopedia ... oops ... database for Khachatryan.
The analogy I described here is not full and contains some inaccuracies. It works only at first appoximation.|||
REAL WORLD SCENARIO
lets say i have a junction table for student and subject
which is a many to many relation
and i call it table "studentsgrades"
here's how the table look like
recno | studentno | subjectid | prelim | midterm |final
studentgrade has "recordno" as primary key
now i want to impose a unique index on studentno and subjectid
where studentno is a foreign key from students table and subjectid is a foreign key
from subjects table. the solution requires that for every subject there could only be one instance of "student 1" though different students such as student 2 and 3 maybe in the same subject.
now all i have to do to achive this solution is to create a unique composite index on students and subjects.
"recno" is clustered pk because of many inserts and that recno is used to relate to table "adviserscomments" which has recno as fk.
now there are two requirements which are to create a report that will print the "students grades" for all of his subject and the "class record" which will display the grades of every student in the class
the "student grades" uses the query
select studentno,subjectid,prelim,midterm, finals where studentno = @.studentno
the class record uses this query
select subjectid,studentno,prelim,midterm,final where subjectid = @.subjectid
these query will not make use of the pk so to make the solution work fast an index must be made.
to make it realy fast i make a covering index A on studentno,subjectid,prelim,midterm,final
only "student grades" requirement will benefit from this index so
i have to make covering index B on
subjectid,studentno,prelim,midterm,final
A covering index, which is a form of a composite index, includes all of the columns referenced in SELECT, JOIN, and WHERE clauses of a query. Because of this, the index contains the data you are looking for and SQL server doesn't have to look up the actual data in the table, reducing logical and/or physical I/O, and boosting performance.
to summarize composite non clustered index can be used to
1. implement uniqueness of more than 2 columns
2. implement covering index
for more reading on composite index pls consult this link
http://www.sql-server-performance.com/composite_indexes.asp
|||Hi,
Indeed
thanks very much to both of you. Both of your post's are indeed
beneficial for me however, something still make me confused which I
would like to share with you.
I learned that while using composite
indexes, we must give importance to the order of the indexes. So if we
use where clause , we must put the column leftmost which we use in the
where clause. So, I am confused that what is the usage of other column in the index part in here ?
Thanks
thats the index order of selectivity and sort-priority order starting from the left
A composite index is an index on two or more columns of a table. You should consider performance when creating a composite index because the order of columns in the index has a measurable effect on data retrieval speed. Generally, the most restrictive value should be placed first for optimum performance. However, the columns that will always be specified should be placed first. |
About selectivity,Sometimes two or more columns, each with poor selectivity, can be combined to form a composite index with good selectivity.
|||Hi,
Thanks for this reply again.
In here,
SELECT * FROM Order WHERE OrderID > 12 ORDER BY OrderDate
we create composite nonclustered index for both OrderID and OrderDate column.
Leftmost is OrderID.
So when our first research is happening(Where clause), the only nonclustered index which is used is OrderID index. And then, when we pass through the second search(ORDER BY clause), OrderDate index become activated.
So we can say that the seleection of indexes in composite indexes is determined according to the situation of the query at that time.
Hence, is this all correct ?
Thanks again for all clarifications.
Best wishes,
Mert
Also,
we should not forget that first member of the composite index must be the most selective column for increasion of the performance.
No comments:
Post a Comment