Events
Mar 31, 11 PM - Apr 2, 11 PM
The ultimate SQL, Power BI, Fabric, and AI community-led event. March 31 - April 2. Use code MSCUST for a $150 discount.
Register todayThis browser is no longer supported.
Upgrade to Microsoft Edge to take advantage of the latest features, security updates, and technical support.
Applies to:
SQL Server
Azure SQL Database
Azure SQL Managed Instance
Analytics Platform System (PDW)
SQL database in Microsoft Fabric
This article helps you decide when and how to perform index maintenance. It covers concepts such as index fragmentation and page density, and their impact on query performance and resource consumption. It describes index maintenance methods, reorganizing an index and rebuilding an index, and suggests an index maintenance strategy that balances potential performance improvements against resource consumption required for maintenance.
Note
This article does not apply to a dedicated SQL pool in Azure Synapse Analytics. For information on index maintenance for a dedicated SQL pool in Azure Synapse Analytics, see Indexing dedicated SQL pool tables in Azure Synapse Analytics.
What is index fragmentation and how it impacts performance:
In B-tree (rowstore) indexes, fragmentation exists when indexes have pages in which the logical ordering within the index, based on the key values of the index, does not match the physical ordering of index pages.
Note
Documentation uses the term B-tree generally in reference to indexes. In rowstore indexes, the Database Engine implements a B+ tree. This does not apply to columnstore indexes or indexes on memory-optimized tables. For more information, see the SQL Server and Azure SQL index architecture and design guide.
The Database Engine automatically modifies indexes whenever insert, update, or delete operations are made to the underlying data. For example, the addition of rows in a table can cause existing pages in rowstore indexes to split, making room for the insertion of new rows. Over time these modifications can cause the data in the index to become scattered in the database (fragmented).
For queries that read many pages using full or range index scans, heavily fragmented indexes can degrade query performance when additional I/O is required to read the data. Instead of a small number of large I/O requests, the query would require a larger number of small I/O requests to read the same amount of data.
When the storage subsystem provides better sequential I/O performance than random I/O performance, index fragmentation can degrade performance because more random I/O is required to read fragmented indexes.
What is page density (also known as page fullness) and how it impacts performance:
Tip
In many workloads, increasing page density results in a greater positive performance impact than reducing fragmentation.
To avoid lowering page density unnecessarily, Microsoft does not recommend setting fill factor to values other than 100 or 0, except in certain cases for indexes experiencing a high number of page splits, for example frequently modified indexes with leading columns containing non-sequential GUID values.
Both fragmentation and page density are among the factors to consider when deciding whether to perform index maintenance, and which maintenance method to use.
Fragmentation is defined differently for rowstore and columnstore indexes. For rowstore indexes, sys.dm_db_index_physical_stats() lets you determine fragmentation and page density in a specific index, all indexes on a table or indexed view, all indexes in a database, or all indexes in all databases. For partitioned indexes, sys.dm_db_index_physical_stats()
provides this information for each partition.
The result set returned by sys.dm_db_index_physical_stats
includes the following columns:
Column | Description |
---|---|
avg_fragmentation_in_percent |
Logical fragmentation (out-of-order pages in the index). |
avg_page_space_used_in_percent |
Average page density. |
For compressed row groups in columnstore indexes, fragmentation is defined as the ratio of deleted rows to total rows, expressed as a percentage. sys.dm_db_column_store_row_group_physical_stats lets you determine the number of total and deleted rows per row group in a specific index, all indexes on a table, or all indexes in a database.
The result set returned by sys.dm_db_column_store_row_group_physical_stats
includes the following columns:
Column | Description |
---|---|
total_rows |
Number of rows physically stored in the row group. For compressed row groups, this includes the rows that are marked as deleted. |
deleted_rows |
Number of rows physically stored in a compressed row group that are marked for deletion. 0 for row groups that are in delta store. |
Compressed row group fragmentation in a columnstore index can be computed using this formula:
100.0*(ISNULL(deleted_rows,0))/NULLIF(total_rows,0)
Tip
For both rowstore and columnstore indexes, review index or heap fragmentation and page density after a large number of rows has been deleted or updated. For heaps, if there are frequent updates, review fragmentation periodically to avoid proliferation of forwarding records. For more information about heaps, see Heaps (Tables without Clustered Indexes).
See Examples for sample queries to determine fragmentation and page density.
You can reduce index fragmentation and increase page density by using one of the following methods:
Note
For partitioned indexes, you can use either of the following methods on all partitions or a single partition of an index.
Reorganizing an index is less resource intensive than rebuilding an index. For that reason it should be your preferred index maintenance method, unless there is a specific reason to use index rebuild. Reorganize is always an online operation. This means long-term object-level locks are not held and queries or updates to the underlying table can continue during the ALTER INDEX ... REORGANIZE
operation.
Note
Starting with SQL Server 2019 (15.x), Azure SQL Database, and Azure SQL Managed Instance, the tuple-mover is helped by a background merge task that automatically compresses smaller open delta rowgroups that have existed for some time as determined by an internal threshold, or merges compressed rowgroups from where a large number of rows has been deleted. This improves the columnstore index quality over time. For most cases this dismisses the need for issuing ALTER INDEX ... REORGANIZE
commands.
Tip
If you cancel a reorganize operation, or if it is otherwise interrupted, the progress it made to that point is persisted in the database. To reorganize large indexes, the operation can be started and stopped multiple times until it completes.
Rebuilding an index drops and re-creates the index. Depending on the type of index and the Database Engine version, a rebuild operation can be done offline or online. An offline index rebuild usually takes less time than an online rebuild, but it holds object-level locks for the duration of the rebuild operation, blocking queries from accessing the table or view.
An online index rebuild does not require object-level locks until the end of the operation, when a lock must be held for a short duration to complete the rebuild. Depending on the version of the Database Engine, an online index rebuild can be started as a resumable operation. A resumable index rebuild can be paused, keeping the progress made to that point. A resumable rebuild operation can be resumed after having been paused or interrupted, or aborted if completing the rebuild becomes unnecessary.
For Transact-SQL syntax, see ALTER INDEX REBUILD. For more information about online index rebuilds, see Perform Index Operations Online.
Note
While an index is being rebuilt online, every modification of data in indexed columns must update an additional copy of the index. This can result in a minor performance degradation of data modification statements during online rebuild.
If an online resumable index operation is paused, this performance impact persists until the resumable operation either completes or is aborted. If you do not intend to complete a resumable index operation, abort it instead of pausing it.
Tip
Depending on available resources and workload patterns, specifying a higher than the default MAXDOP
value in the ALTER INDEX REBUILD statement can shorten the duration of rebuild at the expense of higher CPU utilization.
For rowstore indexes, rebuilding removes fragmentation in all levels of the index, and compacts pages based on the specified or current fill factor. When ALL
is specified, all indexes on the table are dropped and rebuilt in a single operation. When indexes with 128 or more extents are rebuilt, the Database Engine defers page deallocations and acquiring the associated locks until after the rebuild completes. For syntax examples, see Examples - Rowstore rebuild.
For columnstore indexes, rebuilding removes fragmentation, moves any delta store rows into columnstore, and physically deletes rows that have been marked for deletion. For syntax examples, see Examples - Columnstore rebuild.
Tip
Starting with SQL Server 2016 (13.x), rebuilding the columnstore index is usually not needed since REORGANIZE
performs the essentials of a rebuild as an online operation.
Prior to SQL Server 2008 (10.0.x), you could sometimes rebuild a rowstore nonclustered index to correct inconsistencies due to data corruption in the index.
You can still repair such inconsistencies in the nonclustered index by rebuilding a nonclustered index offline. However, you cannot repair nonclustered index inconsistencies by rebuilding the index online, because the online rebuild mechanism uses the existing nonclustered index as the basis for the rebuild and thus carries over the inconsistency. Rebuilding the index offline can sometimes force a scan of the clustered index (or heap) and so replace the inconsistent data in the nonclustered index with the data from the clustered index or heap.
To ensure that the clustered index or heap is used as the source of data, drop and recreate the nonclustered index instead of rebuilding it. As with earlier versions, you can recover from inconsistencies by restoring the affected data from a backup. However, you might be able to repair nonclustered index inconsistencies by rebuilding it offline or recreating it. For more information, see DBCC CHECKDB (Transact-SQL).
Use solutions such as Adaptive Index Defrag to automatically manage index fragmentation and statistics updates for one or more databases. This procedure automatically chooses whether to rebuild or reorganize an index according to its fragmentation level, among other parameters, and update statistics with a linear threshold.
The following scenarios cause all rowstore nonclustered indexes on a table to be automatically rebuilt:
CREATE CLUSTERED INDEX ... WITH (DROP_EXISTING = ON)
The following scenarios do not automatically rebuild all rowstore nonclustered indexes on the same table:
Important
An index cannot be reorganized or rebuilt if the filegroup on which it is located is offline or read-only. When the keyword ALL is specified and one or more indexes are on an offline or read-only filegroup, the statement fails.
While an index rebuild occurs, the physical media must have enough space to store two copies of the index. When the rebuild is finished, the Database Engine deletes the original index.
When ALL
is specified with the ALTER INDEX ... REORGANIZE
statement, clustered, nonclustered, and XML indexes on the table are reorganized.
Rebuilding or reorganizing small rowstore indexes usually does not reduce fragmentation. Up to, and including, SQL Server 2014 (12.x), the SQL Server Database Engine allocates space using mixed extents. Therefore, pages of small indexes are sometimes stored on mixed extents, which implicitly makes such indexes fragmented. Mixed extents are shared by up to eight objects, so the fragmentation in a small index might not be reduced after reorganizing or rebuilding it.
When rebuilding a columnstore index, the Database Engine reads all data from the original columnstore index, including the delta store. It combines data into new row groups, and compresses all row groups into columnstore. The Database Engine defragments the columnstore by physically deleting rows that have been marked as deleted.
Note
Starting with SQL Server 2019 (15.x), the tuple mover is helped by a background merge task that automatically compresses smaller open delta store row groups that have existed for some time as determined by an internal threshold, or merges compressed row groups where a large number of rows has been deleted. This improves columnstore index quality over time. For more information about columnstore terms and concepts, see Columnstore indexes: Overview.
Rebuilding the entire table takes a long time if the index is large, and requires enough disk space to store an additional copy of the entire index during the rebuild.
For partitioned tables, you do not need to rebuild the entire columnstore index if fragmentation is present only in some partitions, for example in partitions where UPDATE
, DELETE
, or MERGE
statements have affected a large number of rows.
Rebuilding a partition after loading or modifying data ensures all data is stored in compressed row groups in columnstore. When the data load process inserts data into a partition using batches smaller than 102,400 rows, the partition can end up with multiple open row groups in delta store. Rebuilding moves all delta store rows into compressed row groups in columnstore.
When reorganizing a columnstore index, the Database Engine compresses each closed row group in delta store into columnstore as a compressed row group. Starting with SQL Server 2016 (13.x) and in Azure SQL Database, the REORGANIZE
command performs the following additional defragmentation optimizations online:
After performing data loads, you can have multiple small row groups in the delta store. You can use ALTER INDEX REORGANIZE
to force these row groups into columnstore, and then combine smaller compressed row groups into larger compressed row groups. The reorganize operation will also remove rows that have been marked as deleted from the columnstore.
Note
Reorganizing a columnstore index using Management Studio combines compressed row groups together, but does not force all row groups to be compressed into the columnstore. Closed row groups will be compressed, but open row groups will not be compressed into columnstore.
To forcibly compress all row groups, use the Transact-SQL example that includes COMPRESS_ALL_ROW_GROUPS = ON
.
Index maintenance, performed by either reorganizing or rebuilding an index, is resource-intensive. It causes a significant increase in CPU utilization, memory used, and storage I/O. However, depending on the database workload and other factors, the benefits it provides range from vitally important to minuscule.
To avoid unnecessary resource utilization that, avoid performing index maintenance indiscriminately. Instead, performance benefits from index maintenance should be determined empirically for each workload using the recommended strategy, and weighed against resource costs and workload impact needed to achieve these benefits.
The likelihood of seeing performance benefits from reorganizing or rebuilding an index is higher when the index is heavily fragmented, or when its page density is low. However, these are not the only things to consider. Factors such as query patterns (transaction processing vs. analytics and reporting), storage subsystem behavior, available memory, and database engine improvements over time all play a role.
Important
Index maintenance decisions should be made after considering multiple factors in the specific context of each workload, including the resource cost of maintenance. They should not be based on fixed fragmentation or page density thresholds alone.
Customers often observe performance improvements after rebuilding indexes. However, in many cases these improvements are unrelated to reducing fragmentation or increasing page density.
An index rebuild has an important benefit: it updates statistics on key columns of the index by scanning all rows in the index. This is the equivalent of executing UPDATE STATISTICS ... WITH FULLSCAN
, which makes statistics current and sometimes improves their quality compared to the default sampled statistics update. When statistics are updated, query plans that reference them are recompiled. If the previous plan for a query was not optimal because of stale statistics, insufficient statistics sampling ratio, or for other reasons, the recompiled plan will often perform better.
Customers often incorrectly attribute this improvement to the index rebuild itself, taking it to be result of reduced fragmentation and increased page density. In reality, the same benefit can often be achieved at much cheaper resource cost by updating statistics instead of rebuilding indexes.
Tip
Resource cost of updating statistics is minor compared to index rebuild, and the operation often completes in minutes. Index rebuilds can take hours.
Microsoft recommends that customers consider and adopt the following index maintenance strategy:
WITH SAMPLE ... PERCENT
or WITH FULLSCAN
clauses (this is not common).In addition to the above considerations and strategy, in Azure SQL Database and Azure SQL Managed Instance it is particularly important to consider the costs and benefits of index maintenance. Customers should perform it only when there is a demonstrated need, and taking into account the following points.
There are specific but uncommon scenarios when one-time or periodic index maintenance may be needed in Azure SQL Database and Azure SQL Managed Instance:
Tip
If you have determined that index maintenance is necessary for your Azure SQL Database and Azure SQL Managed Instance workloads, you should either reorganize indexes, or use online index rebuild. This lets query workloads access tables while indexes are being rebuilt.
Additionally, making the operation resumable lets you avoid restarting it from the beginning if it gets interrupted by a planned or unplanned database failover. Using resumable index operations is particularly important when indexes are large.
Tip
Offline index operations typically complete faster than online operations. They should be used when tables will not be accessed by queries during the operation, for example after loading data into staging tables as part of a sequential ETL process.
Rowstore indexes with more than 128 extents are rebuilt in two separate phases: logical and physical. In the logical phase, the existing allocation units used by the index are marked for deallocation, the data rows are copied and sorted, then moved to new allocation units created to store the rebuilt index. In the physical phase, the allocation units previously marked for deallocation are physically dropped in short transactions that happen in the background, and do not require many locks. For more information about allocation units, see Pages and Extents Architecture Guide.
The ALTER INDEX REORGANIZE
statement requires the data file containing the index to have space available, because the operation can only allocate temporary work pages in the same file, not in another file within the same filegroup. Even though the filegroup has free space available, the user can still encounter error 1105: Could not allocate space for object '###' in database '###' because the '###' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup
during the reorganize operation if a data file is out of space.
An index cannot be reorganized when ALLOW_PAGE_LOCKS
is set to OFF.
Up to SQL Server 2017 (14.x), rebuilding a clustered columnstore index is an offline operation. The Database Engine has to acquire an exclusive lock on the table or partition while the rebuild occurs. The data is offline and unavailable during the rebuild even when using NOLOCK
, read-committed snapshot isolation (RCSI), or snapshot isolation. Starting with SQL Server 2019 (15.x), a clustered columnstore index can be rebuilt using the ONLINE = ON
option.
Warning
Creating and rebuilding nonaligned indexes on a table with more than 1,000 partitions is possible, but is not supported. This can cause degraded performance or excessive memory consumption during these operations. Microsoft recommends using only aligned indexes when the number of partitions exceeds 1,000.
FULLSCAN
clause in CREATE STATISTICS
or UPDATE STATISTICS
. However, starting with SQL Server 2012 (11.x), when a partitioned index is created or rebuilt, statistics are not created or updated by scanning all rows in the table. Instead, the default sampling ratio is used. To create or update statistics on partitioned indexes by scanning all rows in the table, use CREATE STATISTICS or UPDATE STATISTICS with the FULLSCAN
clause.PERSIST_SAMPLE_PERCENT
clause set to ON
, resumable index operations use the persisted sampling ratio to create or update statistics.The following example determines the average fragmentation and page density for all rowstore indexes in the current database. It uses the SAMPLED
mode to return actionable results quickly. For more accurate results, use the DETAILED
mode. This requires scanning all index pages, and can take a long time.
SELECT OBJECT_SCHEMA_NAME(ips.object_id) AS schema_name,
OBJECT_NAME(ips.object_id) AS object_name,
i.name AS index_name,
i.type_desc AS index_type,
ips.avg_fragmentation_in_percent,
ips.avg_page_space_used_in_percent,
ips.page_count,
ips.alloc_unit_type_desc
FROM sys.dm_db_index_physical_stats(DB_ID(), default, default, default, 'SAMPLED') AS ips
INNER JOIN sys.indexes AS i
ON ips.object_id = i.object_id
AND
ips.index_id = i.index_id
ORDER BY page_count DESC;
The previous statement returns a result set similar to the following:
schema_name object_name index_name index_type avg_fragmentation_in_percent avg_page_space_used_in_percent page_count alloc_unit_type_desc
------------ --------------------- ---------------------------------------- ------------- ---------------------------- ------------------------------ ----------- --------------------
dbo FactProductInventory PK_FactProductInventory CLUSTERED 0.390015600624025 99.7244625648629 3846 IN_ROW_DATA
dbo DimProduct PK_DimProduct_ProductKey CLUSTERED 0 89.6839757845318 497 LOB_DATA
dbo DimProduct PK_DimProduct_ProductKey CLUSTERED 0 80.7132814430442 251 IN_ROW_DATA
dbo FactFinance NULL HEAP 0 99.7982456140351 239 IN_ROW_DATA
dbo ProspectiveBuyer PK_ProspectiveBuyer_ProspectiveBuyerKey CLUSTERED 0 98.1086236718557 79 IN_ROW_DATA
dbo DimCustomer IX_DimCustomer_CustomerAlternateKey NONCLUSTERED 0 99.5197553743514 78 IN_ROW_DATA
For more information, see sys.dm_db_index_physical_stats.
The following example determines the average fragmentation for all columnstore indexes with compressed row groups in the current database.
SELECT OBJECT_SCHEMA_NAME(i.object_id) AS schema_name,
OBJECT_NAME(i.object_id) AS object_name,
i.name AS index_name,
i.type_desc AS index_type,
100.0 * (ISNULL(SUM(rgs.deleted_rows), 0)) / NULLIF(SUM(rgs.total_rows), 0) AS avg_fragmentation_in_percent
FROM sys.indexes AS i
INNER JOIN sys.dm_db_column_store_row_group_physical_stats AS rgs
ON i.object_id = rgs.object_id
AND
i.index_id = rgs.index_id
WHERE rgs.state_desc = 'COMPRESSED'
GROUP BY i.object_id, i.index_id, i.name, i.type_desc
ORDER BY schema_name, object_name, index_name, index_type;
The previous statement returns a result set similar to the following:
schema_name object_name index_name index_type avg_fragmentation_in_percent
------------ ---------------------- ------------------------------------ ------------------------- ----------------------------
Sales InvoiceLines NCCX_Sales_InvoiceLines NONCLUSTERED COLUMNSTORE 0.000000000000000
Sales OrderLines NCCX_Sales_OrderLines NONCLUSTERED COLUMNSTORE 0.000000000000000
Warehouse StockItemTransactions CCX_Warehouse_StockItemTransactions CLUSTERED COLUMNSTORE 4.225346161484279
Note
For more examples about using Transact-SQL to rebuild or reorganize indexes, see ALTER INDEX Examples - Rowstore Indexes and ALTER INDEX Examples - Columnstore Indexes.
The following example reorganizes the IX_Employee_OrganizationalLevel_OrganizationalNode
index on the HumanResources.Employee
table in the AdventureWorks2022
database.
ALTER INDEX IX_Employee_OrganizationalLevel_OrganizationalNode
ON HumanResources.Employee
REORGANIZE;
The following example reorganizes the IndFactResellerSalesXL_CCI
columnstore index on the dbo.FactResellerSalesXL_CCI
table in the AdventureWorksDW2022
database. This command forces all closed and open row groups into columnstore.
-- This command forces all closed and open row groups into columnstore.
ALTER INDEX IndFactResellerSalesXL_CCI
ON FactResellerSalesXL_CCI
REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);
The following example reorganizes all indexes on the HumanResources.Employee
table in the AdventureWorks2022
database.
ALTER INDEX ALL ON HumanResources.Employee
REORGANIZE;
The following example rebuilds a single index on the Employee
table in the AdventureWorks2022
database.
ALTER INDEX PK_Employee_BusinessEntityID ON HumanResources.Employee
REBUILD
;
The following example rebuilds all indexes associated with the table in the AdventureWorks2022
database using the ALL
keyword. Three options are specified.
ALTER INDEX ALL ON Production.Product
REBUILD WITH (FILLFACTOR = 80, SORT_IN_TEMPDB = ON,
STATISTICS_NORECOMPUTE = ON)
;
For more information, see ALTER INDEX.
Events
Mar 31, 11 PM - Apr 2, 11 PM
The ultimate SQL, Power BI, Fabric, and AI community-led event. March 31 - April 2. Use code MSCUST for a $150 discount.
Register todayTraining
Learning path
Use advance techniques in canvas apps to perform custom updates and optimization - Training
Use advance techniques in canvas apps to perform custom updates and optimization