Windows Collation Sorting Styles

During SQL Server Setup, you will be prompted to select either Windows collations or binary collations. Your choice of collations affects the data comparison and sort order behaviors of your instance of Microsoft SQL Server. This topic provides considerations for choosing Microsoft Windows collations or binary collations for your SQL Server implementation.

Note

For Windows collations, the nchar, nvarchar, and ntext Unicode data types have the same sorting behavior as char, varchar, and text non-Unicode data types.

Windows collations define rules for storing character data based on an associated Windows locale. The base Windows collation rules specify which alphabet or language is used when dictionary sorting is applied, as well as the code page used to store non-Unicode character data. Binary collations sort data based on the sequence of coded values defined by the locale and data type. A binary collation in SQL Server defines the locale and the ANSI code page to be used, enforcing a binary sort order. Binary collations are useful in achieving improved application performance due to their relative simplicity. For non-Unicode data types, data comparisons are based on the code points defined in the ANSI code page. For Unicode data types, data comparisons are based on the Unicode code points. For binary collations on Unicode data types, the locale is not considered in data sorts. For example, Latin_1_General_BIN and Japanese_BIN yield identical sorting results when used on Unicode data.

Previous binary collations in SQL Server performed an incomplete code-point-to-code-point comparison for Unicode data, in that older SQL Server binary collations compared the first character as WCHAR, followed by a byte-by-byte comparison. For backward compatibility reasons, existing binary collation semantics will not be changed.

Binary collations in this release of SQL Server also include a set of pure code-point comparison collations. Customers can choose to migrate to the binary collations to take advantage of true code-point comparisons, and they should utilize the binary collations for development of new applications. The BIN2 suffix identifies collation names that implement the code-point collation semantics. There is also a comparison flag that corresponds to BIN2 for the new binary sort. For more information, see Guidelines for Using BIN and BIN2 Collations.

The following table describes Windows collation sort order options for SQL Server.

Sort order (suffix)

Sort order description

Binary (_BIN)1

Sorts and compares data in SQL Server tables based on the bit patterns defined for each character. Binary sort order is case-sensitive and accent-sensitive. Binary is also the fastest sorting order. For more information, see Guidelines for Using BIN and BIN2 Collations.

If this option is not selected, SQL Server follows sorting and comparison rules as defined in dictionaries for the associated language or alphabet.

Binary-code point (_BIN2)1

Sorts and compares data in SQL Server tables based on Unicode code points for Unicode data. For non-Unicode data, Binary-code point will use comparisons identical to binary sorts.

The advantage of using a Binary-code point sort order is that no data resorting is required in applications that compare sorted SQL Server data. As a result, a Binary-code point sort order provides simpler application development and possible performance increases. For more information, see Guidelines for Using BIN and BIN2 Collations.

Case-sensitive (_CS)

Distinguishes between uppercase and lowercase letters. If selected, lowercase letters sort ahead of their uppercase versions.

If this option is not selected, SQL Server considers the uppercase and lowercase versions of letters to be identical for sorting purposes.

Accent-sensitive (_AS)

Distinguishes between accented and unaccented characters. For example, 'a' is not equal to 'ấ'.

If this option is not selected, SQL Server considers the accented and unaccented versions of letters to be identical for sorting purposes.

Kana-sensitive (_KS)

Distinguishes between the two types of Japanese kana characters: Hiragana and Katakana.

If this option is not selected, SQL Server considers Hiragana and Katakana characters to be equal for sorting purposes.

Width-sensitive (_WS)

Distinguishes between a single-byte character and the same character when represented as a double-byte character.

If this option is not selected, SQL Server considers the single-byte and double-byte representation of the same character to be identical for sorting purposes.

1 If selected, the case-sensitive, accent-sensitive, kana-sensitive, and width-sensitive options are not available.

Examples of Windows Collations

Each Windows collation is combined as a series of suffixes to define case, accent, width, or kana sensitivity. The following examples describe sort order behavior for various combinations of suffixes.

Windows collation suffix

Sort order description

_BIN1

Binary sort.

_BIN21

Binary-code point sort order.

_CI_AI

Case-insensitive, accent-insensitive, kana-insensitive, width-insensitive.

_CI_AI_KS

Case-insensitive, accent-insensitive, kana-sensitive, width-insensitive

_CI_AI_KS_WS

Case-insensitive, accent-insensitive, kana-sensitive, width-sensitive

_CI_AI_WS

Case-insensitive, accent-insensitive, kana-insensitive, width-sensitive

_CI_AS

Case-insensitive, accent-sensitive, kana-insensitive, width-insensitive

_CI_AS_KS

Case-insensitive, accent-sensitive, kana-sensitive, width-insensitive

_CI_AS_KS_WS

Case-insensitive, accent-sensitive, kana-sensitive, width-sensitive

_CI_AS_WS

Case-insensitive, accent-sensitive, kana-insensitive, width-sensitive

_CS_AI

Case-sensitive, accent-insensitive, kana-insensitive, width-insensitive

_CS_AI_KS

Case-sensitive, accent-insensitive, kana-sensitive, width-insensitive

_CS_AI_KS_WS

Case-sensitive, accent-insensitive, kana-sensitive, width-sensitive

_CS_AI_WS

Case-sensitive, accent-insensitive, kana-insensitive, width-sensitive

_CS_AS

Case-sensitive, accent-sensitive, kana-insensitive, width-insensitive

_CS_AS_KS

Case-sensitive, accent-sensitive, kana-sensitive, width-insensitive

_CS_AS_KS_WS

Case-sensitive, accent-sensitive, kana-sensitive, width-sensitive

_CS_AS_WS

Case-sensitive, accent-sensitive, kana-insensitive, width-sensitive

1 If BIN or Binary-code point is selected, the Case-sensitive, Accent-sensitive, Kana-sensitive, and Width-sensitive options are not available.