Schema Syntax for XML Format Files

This section describes the syntax of XML format files. To see how the syntax corresponds to actual XML format files, see Sample XML Format Files. This section also considers how bulk import uses the <ROW> and <COLUMN> elements and how to put the xsi:type value of an element into a data set.

You can use an XML format file with a bcp command, BULK INSERT statement, or INSERT ... SELECT * FROM OPENROWSET(BULK...) statement.

Note

   You can modify a format file to let you bulk import from a data file in which the number and/or order of the fields differ from the number and/or order of table columns. For more information, see Using a Format File to Map Fields to Columns During Bulk Import.

Basic Syntax of the XML Schema

This syntax statements show only the elements (<BCPFORMAT>, <RECORD>, <FIELD>, <ROW>, and <COLUMN>) and their basic attributes.

<BCPFORMAT ...>

   <RECORD>

      <FIELD ID = "fieldID" xsi:type = "fieldType" [...]

      />

   </RECORD>

   <ROW>

      <COLUMN SOURCE = "fieldID" NAME = "columnName" xsi:type = "columnType" [...]

      />

   </ROW>

</BCPFORMAT>

Note

Additional attributes that are associated with the value of the xsi:type in a <FIELD> or <COLUMN> element are described later in this topic.

Description of the Schema Elements

This section summarizes the purpose of each element that the XML schema defines for XML format files. The attributes are described in separate sections later in this topic.

  • <BCPFORMAT>
    Is the format-file element that defines the record structure of a given data file and its correspondence to the columns of a table row in the table.

  • <RECORD .../>
    Defines a complex element containing one or more <FIELD> elements. The order in which the fields are declared in the format file is the order in which those fields appear in the data file.

  • <FIELD .../>
    Defines a field in data file, which contains data.

    The attributes of this element are discussed in "Attributes of the <FIELD> Element," later in this topic.

  • <ROW .../>
    Defines a complex element containing one or more <COLUMN> elements. The order of the <COLUMN> elements is independent of the order of <FIELD> elements in a RECORD definition. Rather, the order of the <COLUMN> elements in a format file determines the column order of the resultant rowset. Data fields are loaded in the order in which the corresponding <COLUMN> elements are declared in the <COLUMN> element.

    For more information, see "How Bulk Import Uses the <ROW> Element," later in this topic.

  • <COLUMN>
    Defines a column as an element (<COLUMN>). Each <COLUMN> element corresponds to a <FIELD> element (whose ID is specified in the SOURCE attribute of the <COLUMN> element).

    The attributes of this element are discussed in "Attributes of the <COLUMN> Element," later in this topic. Also see, "How Bulk Import Uses the <COLUMN> Element," later in this topic.

  • </BCPFORMAT>
    Required to end the format file.

Attributes of the <FIELD> Element

This section describes the attributes of the <FIELD> element, which are summarized in the following schema syntax:

<FIELD

   ID ="fieldID"

   xsi**:**type ="fieldType"

   [ LENGTH ="n" ]

   [ PREFIX_LENGTH ="p" ]

   [ MAX_LENGTH ="m" ]

   [ COLLATION ="collationName" ]

   [ TERMINATOR ="terminator" ]

/>

Each <FIELD> element is independent of the others. A field is described in terms of the following attributes:

FIELD Attribute

Description

Optional /

Required

ID ="fieldID"

Specifies the logical name of the field in the data file. The ID of a field is the key used to refer to the field.

<FIELD ID="fieldID"/> maps to <COLUMN SOURCE="fieldID"/>

Required

xsi:type ="fieldType"

This is an XML construct (used like an attribute) that identifies the type of the instance of the element. The value of fieldType determines which of the optional attributes (below) you need in a given instance.

Required (depending on the data type)

LENGTH ="n"

This attribute defines the length for an instance of a fixed-length data type.

The value of n must be a positive integer.

Optional unless required by the xsi:type value

PREFIX_LENGTH ="p"

This attribute defines the prefix length for a binary data representation. The PREFIX_LENGTH value, p, must be one of the following: 1, 2, 4, or 8.

Optional unless required by the xsi:type value

MAX_LENGTH ="m"

This attribute is the maximum number of bytes that can be stored in a given field. Without a target table, the column max-length is not known. The MAX_LENGTH attribute restricts the maximum length of an output character column, limiting the storage allocated for the column value. This is especially convenient when using the OPENROWSET function's BULK option in a SELECT FROM clause.

The value of m must be a positive integer. By default, the maximum length is 8000 characters for a char column and 4000 characters for an nchar column.

Optional

COLLATION ="collationName"

COLLATION is only allowed for character fields. For a list of the SQL collation names, see SQL Server Collation Name (Transact-SQL).

Optional

TERMINATOR = "terminator"

This attribute specifies the terminator of a data field. The terminator can be any character. The terminator must be a unique character that is not part of the data.

By default, the field terminator is the tab character (represented as \t). To represent a paragraph mark, use \r\n.

Used only with an xsi:type of character data, which requires this attribute

Xsi:type values of the <FIELD> Element

The xsi:type value is an XML construct (used like an attribute) that identifies the data type of an instance of an element. For information on using the "Putting the xsi:type Value into a Data Set," later in this section.

The xsi:type value of the <FIELD> element supports the following data types.

<FIELD> xsi:type values

Required XML Attribute(s)

for Data Type

Optional XML Attribute(s)

for Data Type

NativeFixed

LENGTH

None.

NativePrefix

PREFIX_LENGTH

MAX_LENGTH

CharFixed

LENGTH

COLLATION

NCharFixed

LENGTH

COLLATION

CharPrefix

PREFIX_LENGTH

MAX_LENGTH, COLLATION

NCharPrefix

PREFIX_LENGTH

MAX_LENGTH, COLLATION

CharTerm

TERMINATOR

MAX_LENGTH, COLLATION

NCharTerm

TERMINATOR

MAX_LENGTH, COLLATION

For more information about Microsoft SQL Server data types, see Data Types (Transact-SQL).

Attributes of the <COLUMN> Element

This section describes the attributes of the <COLUMN> element, which are summarized in the following schema syntax:

<COLUMN

   SOURCE = "fieldID"

   NAME = "columnName"

   xsi:type = "columnType"

   [ LENGTH = "n" ]

   [ PRECISION = "n" ]

   [ SCALE = "value" ]

   [ NULLABLE = { "YES"

"NO" } ]

/>

A field is mapped to a column in the target table using the following attributes:

COLUMN Attribute

Description

Optional /

Required

SOURCE ="fieldID"

Specifies the ID of the field being mapped to the column.

<COLUMN SOURCE="fieldID"/> maps to <FIELD ID="fieldID"/>

Required

NAME = "columnName"

Specifies the name of the column in the row set represented by the format file. This column name is used to identify the column in the result set, and it need not correspond to the column name used in the target table.

Required

xsi:type ="ColumnType"

This is an XML construct (used like an attribute) that identifies the data type of the instance of the element. The value of ColumnType determines which of the optional attributes (below) you need in a given instance.

NoteNote
The possible values of ColumnType and their associated attributes are listed in the next table.

Optional

LENGTH ="n"

Defines the length for an instance of a fixed-length data type. LENGTH is used only when the xsi:type is a string data type.

The value of n must be a positive integer.

Optional (available only if the xsi:type is a string data type)

PRECISION ="n"

Indicates the number of digits in a number. For example, the number 123.45 has a precision of 5.

The value must be a positive integer.

Optional (available only if the xsi:type is a variable-number data type)

SCALE ="int"

Indicates the number of digits to the right of the decimal point in a number. For example, the number 123.45 has a scale of 2.

The value must be an integer.

Optional (available only if the xsi:type is a variable-number data type)

NULLABLE = { "YES"

"NO" }

Indicates whether a column can assume NULL values. This attribute is completely independent of FIELDS. However, if a column is not NULLABLE and field specifies NULL (by not specifying any value), a run-time error results.

The NULLABLE attribute is used only if you do a plain SELECT FROM OPENROWSET(BULK...) statement.

Optional (available for any data type)

Xsi:type values of the <COLUMN> Element

The xsi:type value is an XML construct (used like an attribute) that identifies the data type of an instance of an element. For information on using the "Putting the xsi:type Value into a Data Set," later in this section.

The <COLUMN> element supports native SQL data types, as follows:

Type Category

<COLUMN> Data Types

Required XML Attribute(s)

for Data Type

Optional XML Attribute(s)

for Data Type

Fixed

SQLBIT, SQLTINYINT, SQLSMALLINT, SQLINT, SQLBIGINT, SQLFLT4, SQLFLT8, SQLDATETIME, SQLDATETIM4, SQLDATETIM8, SQLMONEY, SQLMONEY4, SQLVARIANT, and SQLUNIQUEID

None.

NULLABLE

Variable Number

SQLDECIMAL and SQLNUMERIC

None.

NULLABLE, PRECISION, SCALE

LOB

SQLIMAGE, CharLOB, SQLTEXT, and SQLUDT

None.

NULLABLE

Character LOB

SQLNTEXT

None.

NULLABLE

Binary string

SQLBINARY and SQLVARYBIN

None.

NULLABLE, LENGTH

Character string

SQLCHAR, SQLVARYCHAR, SQLNCHAR, and SQLNVARCHAR

None.

NULLABLE, LENGTH

Important

To bulk export or import SQLXML data, use one of the following data types in your format file: SQLCHAR or SQLVARYCHAR (the data is sent in the client code page or in the code page implied by the collation), SQLNCHAR or SQLNVARCHAR (the data is sent as Unicode), or SQLBINARY or SQLVARYBIN (the data is sent without any conversion).

For more information about SQL Server data types, see Data Types (Transact-SQL).

How Bulk Import Uses the &lt;ROW&gt; Element

The <ROW> element is ignored in some contexts. Whether the <ROW> element affects a bulk-import operation depends on how the operation is performed:

  • the bcp command

    When data is loaded into a target table, bcp ignores the <ROW> component. Instead, bcp loads the data based on the column types of the target table.

  • Transact-SQL statements (BULK INSERT and OPENROWSET's Bulk rowset provider)

    When bulk importing data into a table, Transact-SQL statements use the <ROW> component to generate the input rowset. Also, Transact-SQL statements perform appropriate type conversions based on the column types specified under <ROW> and the corresponding column in the target table. If a mismatch exists between column types as specified in the format file and in the target table, an extra type conversion occurs. This extra type conversion may lead to some discrepancy (that is, a loss of precision) in behavior in BULK INSERT or OPENROWSET's Bulk rowset provider as compared to bcp.

    The information in the <ROW> element allows a row to be constructed without requiring any additional information. For this reason, you can generate a rowset using a SELECT statement (SELECT * FROM OPENROWSET(BULK datafile FORMATFILE=xmlformatfile).

    Note

    The OPENROWSET BULK clause requires a format file (note that converting from the data type of the field to the data type of a column is available only with an XML format file).

How Bulk Import Uses the &lt;COLUMN&gt; Element

For bulk importing data into a table, the <COLUMN> elements in a format file map a data-file field to table columns by specifying:

  • The position of each field within a row in the data file.

  • The column type, which is used to convert the field data type to the desired column data type.

If no column is mapped to a field, the field is not copied into the generated row(s). This behavior allows a data file to generate rows with different columns (in different tables).

Similarly, for bulk exporting data from a table, each <COLUMN> in the format file maps the column from the input table row to its corresponding field in the output data file.

Putting the xsi:type Value into a Data Set

When an XML document is validated through the XML Schema Definition (XSD) language, the xsi:type value is not put into the data set. However, you can put the xsi:type information into the data set by loading the XML format file into an XML document (for example, myDoc), as illustrated in the following code snippet:

...;
myDoc.LoadXml(xmlFormat);
XmlNodeList ColumnList = myDoc.GetElementsByTagName("COLUMN");
for(int i=0;i<ColumnList.Count;i++)
{
   Console.Write("COLUMN: xsi:type=" +ColumnList[i].Attributes["type",
      "http://www.w3.org/2001/XMLSchema-instance"].Value+"\n");
}