NTFS File System [Files]

The NTFS file system is the preferred file system on Windows Server 2003, Windows XP, Windows 2000, and Windows NT. It is designed to address the requirements of high-performance file servers and server networks as well as desktop computers, and in doing so, address many of the limitations of the earlier FAT16 and FAT32 file systems. The most important of these requirements are the following:

  • Data recoverability. The NTFS file system limits the possibility of data corruption by organizing I/O operations into transactions. Transactions are atomic, which means that either the entire I/O operation completes or none of it does. If anything interrupts a transaction in progress, such as loss of power to the computer or a cancellation of the I/O operation, the NTFS file system does everything possible to guarantee that any changes made to the file system as part of the I/O operation are undone, or rolled back, returning the file system to its condition before the I/O operation began.

    Also, the NTFS file system is a fully recoverable file system. It is designed to restore consistency to a disk after a CPU failure, system crash, or I/O error. The NTFS file system allows the operating system to recover without your having to use disk-checking utilities. However, the NTFS file system provides some disk utilities in case recovery fails or corruption occurs outside the control of the file system.
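
The rollback behavior described above can be illustrated with a small in-memory model. This is only a sketch of the general undo-log idea, not the NTFS implementation; the names ToyVolume, transaction, and demo are hypothetical and exist only for this example.

```python
class ToyVolume:
    """In-memory stand-in for file system state (hypothetical, not NTFS)."""

    def __init__(self):
        self.blocks = {}

    def transaction(self):
        return _Transaction(self)


class _Transaction:
    _MISSING = object()  # marks keys that did not exist before the write

    def __init__(self, volume):
        self.volume = volume
        self.undo_log = []  # (key, previous value) pairs, oldest first

    def __enter__(self):
        return self

    def write(self, key, value):
        # Record the old value before changing anything, so it can be restored.
        old = self.volume.blocks.get(key, self._MISSING)
        self.undo_log.append((key, old))
        self.volume.blocks[key] = value

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            # Interrupted: undo every change, newest first.
            for key, old in reversed(self.undo_log):
                if old is self._MISSING:
                    self.volume.blocks.pop(key, None)
                else:
                    self.volume.blocks[key] = old
        return False  # let the original exception propagate


def demo():
    volume = ToyVolume()
    with volume.transaction() as tx:
        tx.write("a", 1)  # completes, so the change is kept
    try:
        with volume.transaction() as tx:
            tx.write("a", 2)
            tx.write("b", 3)
            raise OSError("simulated power loss")
    except OSError:
        pass
    return volume.blocks


print(demo())  # prints {'a': 1}
```

The second transaction is interrupted, so both of its writes are rolled back and the state matches what it was before that transaction began.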

  • Storage fault tolerance. Data-redundant storage methods can be used with the NTFS file system to ensure that if data is corrupted on one physical disk, an intact copy can be retrieved from the disk mirror. The NTFS file system always uses data redundancy to protect internal data structures containing metadata vital to the integrity of the file system.

  • Data security. The NTFS file system implements files and directories as securable objects according to the Windows object security architecture. Access to file and directory objects in the NTFS file system can be restricted to specific users and groups under this architecture. For more information on this, see File Security and Access Rights. Data security features for files and directories are not included in the FAT file systems.

Other advanced features provided by the NTFS file system are the following:

  • Multiple data streams. As mentioned in the Files section, NTFS file system files can consist of more than one stream. The additional streams may contain any kind of data, although typically it is data describing the file, or metadata. For more information about multiple data streams in NTFS file system files, see the Files section.

  • Unicode names. Unicode is the standard character set used in the NTFS file system and replaces the older single-byte ASCII character set. File names are stored as 16-bit (double-byte) Unicode values, and the characters used in every major natural language can be represented in this encoding.

    For more information about Unicode, see About Unicode and Character Sets.
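
As a rough illustration of what 16-bit Unicode storage means for names, the following sketch counts the 16-bit units needed to encode a few sample names. It assumes the names are encoded as UTF-16, in which characters outside the Basic Multilingual Plane take two 16-bit units. The helper utf16_units and the sample names are hypothetical.

```python
def utf16_units(name):
    """Return the number of 16-bit code units needed to encode name in UTF-16."""
    return len(name.encode("utf-16-le")) // 2


samples = [
    "report.txt",            # ASCII only
    "r\u00e9sum\u00e9.doc",  # accented Latin letters
    "\u65e5\u672c\u8a9e.txt",  # Japanese
    "\U0001F600.png",        # a character outside the Basic Multilingual Plane
]

for name in samples:
    # ascii() keeps the output printable on any console encoding.
    print(ascii(name), "characters:", len(name), "16-bit units:", utf16_units(name))
```

For the first three names the two counts are equal. For the last one the unit count is one higher than the character count, because the first character needs a pair of 16-bit units.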

  • Improved file attribute indexing. The NTFS file system includes the ability to index file attributes as a means of locating and sorting multiple files that share similar data quickly. You can index file names in FAT32 and FAT16 file systems, but not attributes. Also, the file system does not have the functionality to sort the indexed FAT32 and FAT16 file system names.

  • Dynamic bad-cluster remapping. When a read operation on an NTFS file system volume that is not fault tolerant encounters corrupted data on a cluster of sectors, each sector in the cluster is flagged as bad, and subsequent attempts to perform read operations on that sector result in the error being returned. In the same scenario on FAT file systems, the file system itself does not flag bad sectors—the user must run the Chkdsk.exe utility to do this.

    When this scenario occurs on fault-tolerant NTFS file system volumes, the file system does the following for each bad sector that it encounters:

    1. Recover the uncorrupted data from a secondary source on the volume.
    2. Locate a good sector and write the recovered data to it.
    3. Remap the bad sector to the new good sector so that all subsequent attempts to perform I/O operations on the bad sector are automatically redirected to the new one.
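
The three steps above can be sketched with a toy model of a mirrored volume. This is an assumption-laden illustration, not how NTFS or the fault-tolerant disk driver is implemented; the class MirroredVolume, its spare-sector pool, and its remap table are hypothetical.

```python
class MirroredVolume:
    """Toy fault-tolerant volume: a primary copy, a mirror, and a remap table."""

    def __init__(self, sector_count, spare_count):
        self.primary = {n: b"" for n in range(sector_count)}
        self.mirror = {n: b"" for n in range(sector_count)}
        # Spare sectors are numbered past the end of the normal address range.
        self.spares = list(range(sector_count, sector_count + spare_count))
        self.bad = set()   # physical sectors known to be bad
        self.remap = {}    # bad sector number -> replacement sector number

    def write(self, sector, data):
        target = self.remap.get(sector, sector)
        self.primary[target] = data
        self.mirror[sector] = data

    def read(self, sector):
        target = self.remap.get(sector, sector)
        if target in self.bad:
            data = self.mirror[sector]   # 1. recover from the secondary copy
            new = self.spares.pop(0)     # 2. locate a good sector ...
            self.primary[new] = data     #    ... and write the recovered data
            self.remap[sector] = new     # 3. redirect all later I/O
            target = new
        return self.primary[target]


volume = MirroredVolume(sector_count=4, spare_count=2)
volume.write(1, b"hello")
volume.bad.add(1)          # simulate sector 1 going bad
print(volume.read(1))      # prints b'hello'
print(volume.remap)        # prints {1: 4}
```

After the first failed read, every read or write addressed to sector 1 lands on the replacement sector. The sketch raises IndexError when the spare pool is empty; a real system would need a policy for that case.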

  • Hard links and junctions. Hard links and junctions are two ways that storage objects can be linked under the NTFS file system. For more information on both, see Hard Links and Junctions.

  • Compression and sparse file support. NTFS file system volumes support file compression on an individual file basis. The file compression algorithm used by the NTFS file system is Lempel-Ziv compression. This is a lossless compression algorithm, which means that no data is lost when compressing and decompressing the file, as opposed to lossy compression algorithms such as JPEG, where some data is lost each time data compression and decompression occur.

    For more information, see File Compression and Decompression.
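
The lossless property can be checked with a round trip. The sketch below uses Python's zlib module (DEFLATE), which belongs to the Lempel-Ziv family but is not the algorithm NTFS uses; it stands in only to show that decompressing returns the original bytes exactly. The function round_trip is a hypothetical helper.

```python
import zlib


def round_trip(data):
    """Compress and decompress data; report sizes and whether bytes match."""
    compressed = zlib.compress(data)
    restored = zlib.decompress(compressed)
    return len(data), len(compressed), restored == data


original = b"NTFS file system " * 1000
size, compressed_size, identical = round_trip(original)
print("original bytes:", size)
print("compressed bytes:", compressed_size)
print("restored data identical:", identical)
```

Highly repetitive input like this compresses well, and the restored bytes match the input exactly, which is what distinguishes lossless from lossy compression.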

    A file in which most of the data is zeros is called a sparse file. The NTFS file system implements a form of file compression specifically for sparse files in which only nonzero data is written to the file and the file system provides the correct amount of zero data to applications at need. For more information, see Sparse Files.
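
The idea of storing only nonzero data and supplying zeros on demand can be modeled in a few lines. This is a conceptual sketch only: it tracks individual bytes, whereas a real file system tracks allocated ranges, and the class ToySparseFile is hypothetical.

```python
class ToySparseFile:
    """Stores only nonzero bytes; every other offset reads back as zero."""

    def __init__(self, size):
        self.size = size
        self.nonzero = {}  # offset -> nonzero byte value

    def write(self, offset, data):
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside the file")
        for i, byte in enumerate(data):
            if byte:
                self.nonzero[offset + i] = byte
            else:
                # Writing a zero releases any storage held for that offset.
                self.nonzero.pop(offset + i, None)

    def read(self, offset, length):
        end = min(offset + length, self.size)
        # Offsets with no stored byte are supplied as zeros.
        return bytes(self.nonzero.get(i, 0) for i in range(offset, end))

    def stored_bytes(self):
        return len(self.nonzero)


f = ToySparseFile(size=1_000_000)
f.write(999_990, b"tail data\x00")
print(f.read(0, 4))          # prints b'\x00\x00\x00\x00'
print(f.read(999_990, 10))   # prints b'tail data\x00'
print(f.stored_bytes())      # prints 9
```

The file reports a logical size of one million bytes, but only the nine nonzero bytes occupy storage in the model.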

  • Change journals. The NTFS file system creates and maintains change journals for each volume, to track all changes made to it. For more information, see Change Journals.

  • Distributed link tracking. The Windows shell allows a user to create files on their desktop that link to applications that reside in another location in the volume. The Start menu, which the user can configure, contains many instances of this kind of link. Also, the object linking and embedding, or OLE, technology allows applications to embed links to external files within the files they create and maintain. The components of the Office 2000 suite—Word, PowerPoint, and Excel—are examples of applications that use OLE technology.

    A problem arises in the previous examples when the file being linked to (the link source) is moved, making it inaccessible through the link (also referred to as the link client). Distributed link tracking was first introduced in the version of the NTFS file system that shipped with Windows 2000 to enable client applications to track link sources that have been moved. As a result, applications and users that create links do not have to maintain the linkage themselves when the link source is moved.

    For more information on distributed link tracking, see Distributed Link Tracking and Object Identifiers.

  • Encryption. The NTFS file system provides the Encrypted File System, or EFS, for cryptographic protection of files and directories. For more information on EFS, see File Encryption. For more information on cryptography in general, see Cryptography Reference.

  • POSIX support. The following POSIX functionality is introduced in Windows 2000:

    1. Files can be accessed in NTFS file systems according to POSIX naming conventions. POSIX conventions allow file names that have trailing spaces, file names that have trailing periods (.), and file names that are identical except for the case of the characters.
    2. Traversal permissions, where the security attributes of each parent directory in the path of a file or directory are used in determining whether a specific user has access to it.
    3. "File change time" time stamps.
    4. POSIX-style hard links. For a code example that shows how to back up and restore this type of link, see Backing Up and Restoring POSIX File Links.
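
To make the naming rules in item 1 concrete, the following sketch groups names that are distinct under POSIX conventions but would be treated as the same name by a lookup that ignores case, trailing spaces, and trailing periods. The normalization in win32_style_key is a simplified assumption for illustration, not the exact rule Windows applies, and both function names are hypothetical.

```python
def win32_style_key(name):
    """Approximate a non-POSIX lookup key: drop trailing spaces and periods, ignore case."""
    return name.rstrip(" .").casefold()


def collisions(names):
    """Group names that differ under POSIX rules but share a non-POSIX lookup key."""
    groups = {}
    for name in names:
        groups.setdefault(win32_style_key(name), []).append(name)
    return {key: group for key, group in groups.items() if len(group) > 1}


names = ["Readme", "README", "readme.", "notes ", "notes", "data.txt"]
for key, group in collisions(names).items():
    print(key, "->", group)
```

With these sample names, "Readme", "README", and "readme." fall into one group and "notes " and "notes" into another, while "data.txt" stands alone.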

  • Defragmentation API. A file is stored on a disk drive and other storage media in one or more clusters. Clusters are the atomic unit of data allocation, made up of one or more sectors. Sectors are physical storage units.

    As a file is written to the disk, the file may not be written in contiguous clusters. Noncontiguous clusters slow down the process of reading and writing the file. The farther apart on the disk the noncontiguous clusters are, the slower the process because of the increased time it takes to move the hard drive's read/write head. A file with noncontiguous clusters is said to be fragmented. To optimize files for fast access, a volume may be defragmented.

    Defragmentation is the process of moving a file's clusters on the disk to make them contiguous. The NTFS file system does not perform defragmentation, but with version 5.0 it provides the ability for applications to perform defragmentation by calling an API. This API consists of functions that allow applications to obtain a map of the clusters that are in use and clusters that are not being used, obtain a map of how a file is using its clusters, and move a file.

    For more information, see Defragmenting Files.
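
Fragmentation can be pictured by collapsing a file's cluster list into contiguous runs. The sketch below is a toy model and does not call the defragmentation API; the functions extents, is_fragmented, and defragment are hypothetical, and defragment assumes the target region is free.

```python
def extents(clusters):
    """Collapse a file's cluster numbers (in file order) into (start, length) runs."""
    runs = []
    for cluster in clusters:
        if runs and runs[-1][0] + runs[-1][1] == cluster:
            # This cluster directly follows the current run, so extend it.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((cluster, 1))
    return runs


def is_fragmented(clusters):
    """A file is fragmented when it occupies more than one contiguous run."""
    return len(extents(clusters)) > 1


def defragment(clusters, free_start):
    """Return a contiguous layout at free_start (assumed to have enough free clusters)."""
    return list(range(free_start, free_start + len(clusters)))


layout = [10, 11, 12, 40, 41, 7]
print(extents(layout))                    # prints [(10, 3), (40, 2), (7, 1)]
print(is_fragmented(layout))              # prints True
moved = defragment(layout, free_start=100)
print(extents(moved))                     # prints [(100, 6)]
```

The run list plays the role of the per-file cluster map mentioned above: three runs before the move, one run after it.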

  • Reparse points. Under the NTFS file system, a file or directory can contain a reparse point, which is a collection of user-defined data. For more information on reparse points, see Reparse Points.

  • Directories as volume mount points. Volume mount points are directories on a volume that an application can use to "mount" a different volume, that is, to set it up for use at the location a user specifies. In other words, you can use a volume mount point as a gateway to a volume. When a volume is mounted at a volume mount point, users and applications can see the mounted volume by the path of the volume mount point or a drive letter. For example, with a volume mount point set, the user might see drive D as "C:\mnt\Ddrive" as well as "D:".

    Using volume mount points, you can unify into one logical file system disparate file systems such as the NTFS file system, a 16-bit FAT file system, an ISO-9660 file system on a CD-ROM drive, and so on. Neither users nor applications need information about the volume on which a specific file resides. All the information they need to locate a specified file is a complete path. Volumes can be rearranged, substituted, or subdivided into many volumes without users or applications needing to change settings.

    For more information, see Volume Mount Points.
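
The path-based view of mounted volumes can be sketched as a prefix lookup. This models only the naming idea, using the "C:\mnt\Ddrive" example from above; it does not query or change real mount points, and the function resolve and its mount table argument are hypothetical.

```python
from pathlib import PureWindowsPath


def resolve(path, mount_points):
    """Map a path to (volume, path on that volume), preferring the deepest mount point."""
    path = PureWindowsPath(path)
    # Check deeper mount points first so that nested mounts take precedence.
    by_depth = sorted(
        mount_points,
        key=lambda mount: len(PureWindowsPath(mount).parts),
        reverse=True,
    )
    for mount in by_depth:
        try:
            rest = path.relative_to(PureWindowsPath(mount))
        except ValueError:
            continue  # the path is not under this mount point
        return mount_points[mount], str(rest)
    # No mount point applies: the path stays on its own drive.
    return path.drive, str(path.relative_to(path.anchor))


mounts = {r"C:\mnt\Ddrive": "D:"}
print(resolve(r"C:\mnt\Ddrive\docs\a.txt", mounts))
print(resolve(r"C:\Windows\notepad.exe", mounts))
```

The first call resolves to volume "D:" with the relative path docs\a.txt, and the second stays on "C:". The caller supplies only a complete path and never needs to know which volume holds the file.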

Default Cluster Sizes

The NTFS file system uses 64-bit cluster indexes. This capacity gives the NTFS file system the ability to address volumes of up to 16 exabytes (16 billion GB); however, Windows XP and Windows Server 2003 limit the size of an NTFS file system volume to that addressable with 32-bit clusters, which is 256 TB (using 64-KB clusters). The following table identifies the default volume and cluster sizes for NTFS file system volumes.

Volume size          Default cluster size
0 to 512 MB          512 bytes
513 MB to 1 GB       1 KB
1025 MB to 2 GB      2 KB
2 GB and greater     4 KB

Note  Volumes of 2 TB and greater require a disk that does not use a master boot record (MBR); that is, the disk must be dynamic.

You can override these defaults when you format an NTFS file system volume.
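
The default cluster sizes and the 32-bit addressing limit can be expressed as a short calculation. This is a sketch based on one reading of the table above: a volume of exactly 2 GB is assumed to fall in the "2 GB and greater" row. The function names are hypothetical.

```python
KB, MB, GB, TB = 2**10, 2**20, 2**30, 2**40


def default_cluster_size(volume_bytes):
    """Return the default cluster size in bytes for a volume of the given size."""
    if volume_bytes <= 512 * MB:
        return 512
    if volume_bytes <= 1 * GB:
        return 1 * KB
    if volume_bytes < 2 * GB:   # assumption: exactly 2 GB uses the 4-KB row
        return 2 * KB
    return 4 * KB


def max_volume_bytes(cluster_index_bits, cluster_bytes):
    """Largest volume addressable with the given index width and cluster size."""
    return 2**cluster_index_bits * cluster_bytes


print(default_cluster_size(800 * MB))          # prints 1024
print(default_cluster_size(10 * GB))           # prints 4096
print(max_volume_bytes(32, 64 * KB) // TB)     # prints 256
```

The last line shows the arithmetic behind the 32-bit limit: 2^32 clusters of 64 KB each come to 2^48 bytes.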

NTFS file system file names can be any practical length (up to 255 characters). There is no requirement that NTFS file system file names have extensions; however, many applications still create and use them. For more information, see Naming a File.