Creating an index on a table with no primary key

Category : Blogs Published : July 6, 2012 User Rating : 3.5 Stars      Views : 3 Stars
One Click Feedback
Please help us to improve the site by rating the quality of this article by clicking a button below.




Recently a developer came up to me and started asking questions about a problem he was experiencing with a stored procedure he’d written to query a large table. He’d been looking at the execution plan and noticed that the stored procedure was table scanning one of the tables that it was querying.

The query was fairly straightforward with two columns specified in the WHERE predicate, but the table was large. He had quite correctly decided to create an index on those two columns. Surprisingly this did not seem to resolve the situation as the stored procedure continued to table scan the table. At this point he asked me to look at the problem.

The first thing I did was to simplify the query as much as possible to make it easier to identify the problem. Here is the table design and the simplified version of the query in the stored procedure that caused the table scanning :

Table Design

Query :
SELECT Title,Forename,Surname
FROM   Member
WHERE  DateAppointed<'23 Aug 2008' AND DateLeft>'23 Aug 2008'
The index that he had tried was :
CREATE NONCLUSTERED INDEX idx1 ON dbo.Member
(     DateLeft ASC,
      DateAppointed ASC )
Switching on the ‘Include Actual Execution Plan’, running the query and looking at the graphical output in the ‘Execution Plan’ tab in SSMS confirmed that the query is table scanning :

Table Scan

On the face of there was no obvious reason why the query should not have made use of this index. After trying a few things and not succeeding in getting the query to use the index I decided to see what the Database Engine Tuning Advisor (DETA) made of the query. At this point I suddenly realised what the problem was – there was no primary key on the table.

The problem was that although the optimiser could use the index to identify the correct rows in the table to meet the where condition, it would then have to go back to the table to get the columns specified in the select statement, but it required a primary key on the table to be able to do that look up.

If this was a database we owned then we could have added a primary key, either by identifying an existing column to use or to add an extra column and specifying it as an identity column. Unfortunately it was a third party supplied database and it wasn’t practical to make structural changes.

The solution was to specify an index which had include columns covering the columns specified in the select statement :
CREATE NONCLUSTERED INDEX idx_Member_1 ON dbo.Member
(     DateLeft ASC,
      DateAppointed ASC )
INCLUDE (Title,Forename,Surname)
This is the index specified by the DETA and resolves the problem because all the information is contained in the index. This is because the INCLUDE clause contains all the columns in the SELECT part of the query – no further look up is required on the underlying table and the execution plan confirms that the query does indeed do an index seek using the new index :
Index Seek

All in all a neat solution to the problem in this instance, though I can’t help feeling that it would still have been better to have had a primary key on the table in the first place.

Related Articles

The following articles may also be of interest :

Link back to this article : https://www.sqlmatters.com/Articles/Creating an index on a table with no primary key.aspx

Keywords

Blog,index,primary key


Comments
Post by Dave Internet on Fri 21 Feb 2020 17:45. Report Inappropriate Post

You should have actually created two indexes, one for DateLeft and one for DateAppointed. The way you have it in your example you made an index on both dateleft and dateappointed, meaning the order of dateleft will be completely dependant on dateappointed

So the first WHERE clause would be dealt with efficiently, the optimiser can make best use of your index to find all rows which match the WHERE DateAppointed<'23 Aug 2008' condition.

For your second where clause, AND DateLeft>'23 Aug 2008', you are at the mercy of full table scans again, albeit this time you're doing full table scans on your index. The index first and foremost puts the rows in DateAppointed order, then after that, it sorts by DateLeft, so if you want to scan all dateleft values (which your second where clause would require), the data is not ordered by DateLeft in your index, so you need to do multiple scans.

Because you already reduced the scope of your data with your first where clause, in reality the query is probably quite quick, but on a very very large dataset this would still take hours.
Post by Luis on Sat 02 Apr 2022 02:13. Report Inappropriate Post

I don't believe this is an issue here because this index is being built for the particular query. The index key columns are in the same order as they are in the query so that inefficiency does not crop up (at least not with MS SQL). I think the execution plan would look different that what was shown if the query was doing what you describe.

Post a comment   No login required !

 
Name :   Email :   Website :  
Will be displayed alongside your comment
Not displayed Optional, but displayed if entered