Skip to main content

Clustered Index in SQL Server


What is a clustered index?

The clustered index is a special index which sorts and stores the data in the table or view based on their key values. There can be only one clustered index per table because the data rows themselves can be stored in only one order. If we talk in term of B-Tree design, the leaf nodes of the index store the data for the rest of the columns in the table so when a lookup is performed on this type of index there are no other structures that need to be referenced.

Why create clustered indexes?

Having clustered index in the table is beneficial; first, when querying the table, there is only required IO operations occurred except extra lookup to get any/all of the non-key column data. Second, it gives us a way to recognize the table data. If there is no clustered index on the table then data stored disorganized in a heap structure. When our data becomes defragmented due to DML operations then the only one way to fix this fragmentation is to reload all the data into a new table. With a clustered index in the table, we can run index reorganized or rebuild to address the fragmentation.

When clustered indexes should be created?

Generally, we use an index on those columns which heavily used in WHERE clause portion in the query. As we know query performs huge range scans on those columns in the WHERE clause can be beneficial from having a clustered index on these columns. In both cases, we will use fewer disk IO and in some cases (depending on the ORDER BY clause) we can save ourselves memory and CPU by not having to perform a sort since data is already ordered.

How to create a clustered index

A clustered index can be created on the table either through a primary key constraint or simply using create index statement. In both cases, we simply have to add the CLUSTERED keyword to either the ALTER TABLE or CREATE INDEX statement as shown in the below TSQL examples.


-- When we create a primary key
ALTER TABLE Sales.BookInventory ADD CONSTRAINT PK_BookInventory_BookID
PRIMARY KEY CLUSTERED (BookID ASC);

-- Without constraints
CREATE CLUSTERED INDEX IX_BookInventory_BookID ON Sales.BookInventory (BookID ASC);


Here I will highlight of some important thing on the clustered index and will analysis of myth about the clustered index.

1-    It is important to know; only one clustered index can be defined on each table in your database.
2-    Changing the clustered index form one column to another column may be a bit complicated and does require some downtime. Since all non-clustered indexes on table reference the row data in the clustered index if we simply drop the current clustered index and create a new one then all of the non-clustered indexes will end up being updated twice.
3-    When clustered index will be dropped, all data of the columns will move into heap it will be updating seconds time when new the new clustered index is created and all the column data move into the structure.
4-    Dropped and create new index may probably increase the fragmentation which is poison for index. So we have to rebuild all the indexes.
5-    Many people or new DBA think, we can create clustered index only on primary key column(s), this is not true, you can create clustered index on another column too which probably used more in WHERE clause. Suppose you have created a table and create an index on primary key which used rarely in WHERE clause while DateTime columns use mostly in the query, so it would be better if you create clustered index on DATETIME column.
6-    Remember, it would be better if the clustered index column has unique data and sort in length.

Query to change the clustered index

I above points I mentioned, changing the clustered index in tables is may be dangerous but following query may help you.

Suggestion: If you are not DBA then don’t take the risk, contact to your DBA he will help you.



DROP INDEX AK_BookOrder_rowguid ON Sales.BookOrder;
DROP INDEX IX_BookOrder_ProductID ON Sales.BookOrder;
ALTER TABLE Sales.BookOrder
   DROP CONSTRAINT PK_BookOrder_BookOrderID_BookOrderID;
CREATE CLUSTERED INDEX IX_BookOrder_ProductID
   ON Sales.BookOrder (ProductID ASC);
ALTER TABLE Sales.BookOrder
   ADD CONSTRAINT PK_BookOrder_BookOrderID_BookOrderID
   PRIMARY KEY NONCLUSTERED (BookOrderID ASC, BookOrderID ASC);
CREATE UNIQUE NONCLUSTERED INDEX AK_BookOrder_rowguid
   ON Sales.BookOrder (rowguid ASC);


Popular posts from this blog

Add day to ISODate in MongoDB

We can use $add operator to add days in ISODate in mongodb, $add is the Arithmetic Aggregation Operator which adds number and date in mongodb.
Syntax:

{ $add: [ <expression1>, <expression2>, ... ] }

Note:  If one of the argument is date $add operator treats to other arguments as milliseconds to add to the date.
Example: Suppose we have a Test collection as below.

{"Title" : "Add day to ISODate in MongoBD","CreatedDate" : ISODate("2016-07-07T08:00:00.000Z")}

Query to add 2 days in CreatedDate

db.Test.aggregate([      { $project: { Title: 1, AddedDate: { $add: [ "$CreatedDate", 2*24*60*60000 ] } } }    ])

Result:

{ "_id" : ObjectId("579a1567ac1b3f3732483de0"), "Title" : "Add day to ISODate in MongoBD", "AddedDate" : ISODate("2016-07-09T08:00:00.000Z") }

Note: As mentioned in above note we have to convert days in millisecond because $add operator treat to other arg…

What is difference between UNION and UNION ALL in SQL Server

We use UNION and UNION ALL operator to combine multiple results set into one result set.
UNION operator is used to combining multiple results set into one result set but removes any duplicate rows. Basically, UNION is used to performing a DISTINCT operation across all columns in the result set. UNION operator has the extra overhead of removing duplicate rows and sorting result.
UNION ALL operator use to combine multiple results set into one result set but it does not remove any duplicate result. Actually, this does not remove duplicate rows so it is faster than the UNION operator. If you want to combine multiple results and without duplicate records then use UNION otherwise UNION ALL is better.
Following some rules for using UNION/UNION ALL operator
1.The number of the column should be the same in the query's when you want to combine them. 2.The column should be of the same data type. 3.ORDER BY clause can be applied to the overall result set not within each result set.
4.Column name of …