I/O Architecture

05/26/2006

The primary purpose of a database is to store and retrieve data. Therefore, performing lots of disk reads and writes is one of the intrinsic characteristics of a Database Engine. Disk I/O operations consume many resources and take a relatively long time to finish. Much of the logic in relational database software focuses on making the pattern of I/O use highly efficient.

SQL Server 2005 allocates much of its virtual memory to a buffer cache and uses the cache to reduce physical I/O. Each instance of SQL Server has its own buffer cache. Data is read from the database disk files into the buffer cache. Multiple logical reads of the data can be satisfied without requiring that the data be physically read again. The data remains in the cache until it has not been referenced for some time and the database needs the buffer area to read in more data. Data is written back to disk only if it is modified. Data can be changed multiple times by logical writes before a physical write transfers the new data back to disk.

The data in a SQL Server database is stored in 8-KB pages. Each group of eight contiguous pages is a 64-KB extent. The buffer cache is also divided into 8-KB pages. For more information, see Pages and Extents.

The I/O from an instance of SQL Server is divided into logical and physical I/O. A logical read occurs every time the Database Engine requests a page from the buffer cache. If the page is not currently in the buffer cache, a physical read is then performed to read the page into the buffer cache. If the page is currently in the cache, no physical read is generated. In this situation, the buffer cache just uses the page already in memory. A logical write occurs when data is modified in a page in memory. A physical write occurs when the page is written to disk. It is possible for a page to remain in memory long enough to have more than one logical write made before it is physically written to disk.

One of the basic performance optimization tasks for an instance of SQL Server involves sizing the SQL Server memory. The goal is to make the buffer cache large enough to maximize the ratio of logical reads to physical reads, but not so large that excessive memory swapping starts generating physical I/O to the pagefile. Instances of SQL Server do this automatically under the default configuration settings.

By maintaining a relatively large buffer cache in virtual memory, an instance of SQL Server can significantly reduce the number of physical disk reads it requires. After a frequently referenced page has been read into the buffer cache, it is likely to stay there. This eliminates additional reads.

SQL Server 2005 uses two features that are used in Microsoft Windows 2000, Windows Server 2003, and Windows XP to improve its disk I/O performance. These include the following:

Scatter-gather I/O
Before scatter-gather I/O was introduced in Windows NT version 4.0 Service Pack 2, all the data for a disk read or write on Windows NT had to be in a contiguous area of memory. If a read transferred in 64 KB of data, the read request had to specify the address of a contiguous area of 64 KB of memory. Scatter-gather I/O lets a read or write transfer data in to or out of noncontiguous areas of memory. Windows 2000, Windows Server 2003, and Windows XP support scatter-gather I/O.
If an instance of SQL Server reads in a 64 KB extent, it does not have to allocate a single 64 KB area and then copy the individual pages to buffer cache pages. It can locate eight buffer pages and then do a single scatter-gather I/O specifying the address of the eight buffer pages. The operating system puts the eight pages directly into the buffer pages. This eliminates the need for the instance of SQL Server to do a separate memory copy.
Asynchronous I/O
In an asynchronous I/O, an application requests a read or write operation from Microsoft Windows 2000,Windows 2003, or Windows XP. The operating system immediately returns control to the application. The application can then perform additional work and later test to see if the read or write was completed. By contrast, in a synchronous I/O, the operating system does not return control to the application until the read or write has been completed. Using asynchronous I/O lets instances of SQL Server maximize the work performed by individual threads while they are processing a batch.

SQL Server supports multiple concurrent asynchronous I/O operations against each file. The Database Engine dynamically determines the maximum number of I/O operations an instance can issue for any file.

Share via

See Also

Concepts

Help and Information

Ask Learn

Share via

I/O Architecture

See Also

Concepts

Help and Information

Additional resources