Office Open XML Formats: Setting Custom Word 2007 Document Properties

Office Visual How To

Applies to:  2007 Microsoft Office System, Microsoft Office Word 2007, Microsoft Visual Studio 2005

Ken Getz, MCW Technologies, LLC

March 2007

Overview

2007 Microsoft Office system documents allow you to set and retrieve custom document properties, so that you can store your own document metadata. In Microsoft Windows, you can use this metadata as a tool for searching and categorizing your documents. Imagine that you need to set custom document properties on documents stored on a server. One option is to load each document individually in Microsoft Office Word 2007, set the property, save the document, and move to the next document. Because of the new Office Open XML File Formats, you can also achieve the same goal programmatically, without loading each document into Word. This technique requires a fair amount of code, but because it never starts Word, it is efficient and gives you the best performance. Working with the Office Open XML File Formats requires knowledge of how Word stores the content, the System.IO.Packaging API, and XML programming.

See It

Watch the Video

Length: 00:11:50 | Size: 8.7 MB | Type: WMV file

Code It

To get started, download the 2007 Office System Sample: Open XML File Format Code Snippets for Visual Studio 2005, a set of forty code snippets for Microsoft Visual Studio 2005, each of which demonstrates a technique for working with the Office Open XML File Formats. After you install the code snippets and have a sample Word document to test with, you are ready to go. (For details about creating the sample document, see Read It.) Create a Windows Application project in Microsoft Visual Studio 2005, open the code editor, right-click, select Insert Snippet, and select the Word: Set Custom Property snippet from the list of available 2007 Office snippets. If you use Microsoft Visual Basic, inserting the snippet adds a reference to WindowsBase.dll and adds the following Imports statements.

Imports System.IO
Imports System.IO.Packaging
Imports System.Xml

If you use Microsoft Visual C#, you need to add the reference to the WindowsBase.dll assembly and the corresponding using statements (System.IO, System.IO.Packaging, and System.Xml) yourself, so that you can compile the code. (Code snippets in C# cannot set references or insert using statements for you.) If the WindowsBase.dll reference does not appear on the .NET tab of the Add Reference dialog box, click the Browse tab, locate the C:\Program Files\Reference Assemblies\Microsoft\Framework\v3.0 folder, and then click WindowsBase.dll.

The WDSetCustomProperty procedure, as it is designed in the snippet, takes several different actions:

  • If the custom.xml document part does not already exist, the code creates it. The remainder of the actions assume that custom.xml exists in the document.

  • If the requested property does not exist, the code creates it, using the specified type.

  • If the requested property does already exist, but its type does not match the type you specify in the call to WDSetCustomProperty, the code changes the type and sets the value.

  • If the type you specify matches the current type, the code simply modifies the value of the property.

The WDSetCustomProperty snippet delves programmatically into the various document parts and relationships between the parts to set a custom document property. To test it, call the snippet’s procedure like this (this sample creates a new custom property, modifies the type of the new custom property, and then finally modifies the value without changing the type):

' Create a custom property.
WDSetCustomProperty("C:\demo.docx", "Completed", False, _
 PropertyTypes.YesNo)

' Change an existing custom property's type.
WDSetCustomProperty("C:\demo.docx", "Completed", #1/1/2008#, _
 PropertyTypes.DateTime)

' Modify an existing custom property.
WDSetCustomProperty("C:\demo.docx", "Completed", #1/1/2009#, _
 PropertyTypes.DateTime)

// Create a custom property.
WDSetCustomProperty("C:\\demo.docx", "Completed", 
  false, PropertyTypes.YesNo);

// Change an existing custom property's type.
WDSetCustomProperty("C:\\demo.docx", "Completed", 
  new DateTime(2008, 1, 1), PropertyTypes.DateTime);

// Modify an existing custom property.
WDSetCustomProperty("C:\\demo.docx", "Completed", 
  new DateTime(2009, 1, 1), PropertyTypes.DateTime);

The snippet code starts by defining an enumeration representing the available property types:

Public Enum PropertyTypes
  YesNo
  Text
  DateTime
  NumberInteger
  NumberDouble
End Enum

public enum PropertyTypes
{
  YesNo,
  Text,
  DateTime,
  NumberInteger,
  NumberDouble,
}

If you insert the snippet more than once, the enumeration is inserted each time as well. Delete the duplicate declarations; otherwise, the code does not compile.

The code starts with the following block:

Public Function WDSetCustomProperty( _
 ByVal docName As String, _
 ByVal propertyName As String, _
 ByVal propertyValue As Object, _
 ByVal propertyType As PropertyTypes) _
 As Boolean

  Const documentRelationshipType As String = _
   "http://schemas.openxmlformats.org/" & _
   "officeDocument/2006/" & _
   "relationships/officeDocument"
  Const customPropertiesRelationshipType As String = _
   "http://schemas.openxmlformats.org/" & _
   "officeDocument/2006/" & _
   "relationships/custom-properties"
  Const customPropertiesSchema As String = _
   "http://schemas.openxmlformats.org/" & _
   "officeDocument/2006/custom-properties"
  Const customVTypesSchema As String = _
   "http://schemas.openxmlformats.org/" & _
   "officeDocument/2006/docPropsVTypes"

  Dim retVal As Boolean = False
  Dim documentPart As PackagePart = Nothing
  Dim propertyTypeName As String = "vt:lpwstr"
  Dim propertyValueString As String = Nothing

  ' Calculate the correct type.
  Select Case propertyType
    Case PropertyTypes.DateTime
      propertyTypeName = "vt:filetime"
      ' Make sure you were passed a real date, 
      ' and if so, format in the correct way. 
      ' The date/time value passed in should 
      ' represent a UTC date/time.
      If propertyValue.GetType() Is GetType(System.DateTime) Then
        propertyValueString = _
         String.Format("{0:s}Z", Convert.ToDateTime(propertyValue))
      End If
    Case PropertyTypes.NumberInteger
      propertyTypeName = "vt:i4"
      If propertyValue.GetType() Is GetType(System.Int32) Then
        propertyValueString = _
        Convert.ToInt32(propertyValue).ToString()
      End If

    Case PropertyTypes.NumberDouble
      propertyTypeName = "vt:r8"
      If propertyValue.GetType() Is _
       GetType(System.Double) Then
        propertyValueString = _
         Convert.ToDouble(propertyValue).ToString()
      End If

    Case PropertyTypes.Text
      propertyTypeName = "vt:lpwstr"
      propertyValueString = Convert.ToString(propertyValue)

    Case PropertyTypes.YesNo
      propertyTypeName = "vt:bool"
      If propertyValue.GetType() Is _
       GetType(System.Boolean) Then
        ' Must be lower case!
        propertyValueString = _
         Convert.ToBoolean(propertyValue).ToString().ToLower()
      End If
  End Select

  If propertyValueString Is Nothing Then
    ' If the code wasn't able to convert the 
    ' property to a valid value, 
    ' throw an exception:
    Throw New InvalidDataException("Invalid parameter value.")
  End If

  ' Next code block goes here.

  Return retVal
End Function

public bool WDSetCustomProperty(string docName, 
  string propertyName, object propertyValue, 
  PropertyTypes propertyType)
{
  const string documentRelationshipType =
    "http://schemas.openxmlformats.org/officeDocument/" +
    "2006/relationships/officeDocument";
  const string customPropertiesRelationshipType =
    "http://schemas.openxmlformats.org/officeDocument/" +
    "2006/relationships/custom-properties";
  const string customPropertiesSchema =
    "http://schemas.openxmlformats.org/officeDocument/" + 
    "2006/custom-properties";
  const string customVTypesSchema =
    "http://schemas.openxmlformats.org/officeDocument/" + 
    "2006/docPropsVTypes";

  bool retVal = false;
  PackagePart documentPart = null;
  string propertyTypeName = "vt:lpwstr";
  string propertyValueString = null;

  //  Calculate the correct type.
  switch (propertyType)
  {
    case PropertyTypes.DateTime:
      propertyTypeName = "vt:filetime";
      //  Make sure you were passed a real date, 
      //  and if so, format in the correct way. The date/time 
      //  value passed in should represent a UTC date/time.
      if (propertyValue.GetType() == typeof(System.DateTime))
      {
        propertyValueString = string.Format("{0:s}Z",
          Convert.ToDateTime(propertyValue));
      }
      break;

    case PropertyTypes.NumberInteger:
      propertyTypeName = "vt:i4";
      if (propertyValue.GetType() == typeof(System.Int32))
      {
        propertyValueString =
          Convert.ToInt32(propertyValue).ToString();
      }
      break;

    case PropertyTypes.NumberDouble:
      propertyTypeName = "vt:r8";
      if (propertyValue.GetType() == typeof(System.Double))
      {
        propertyValueString =
          Convert.ToDouble(propertyValue).ToString();
      }
      break;

    case PropertyTypes.Text:
      propertyTypeName = "vt:lpwstr";
      propertyValueString = Convert.ToString(propertyValue);
      break;

    case PropertyTypes.YesNo:
      propertyTypeName = "vt:bool";
      if (propertyValue.GetType() == typeof(System.Boolean))
      {
        //  Must be lower case!
        propertyValueString =
          Convert.ToBoolean(propertyValue).ToString().ToLower();
      }
      break;
  }

  if (propertyValueString == null)
  {
    //  If the code cannot convert the 
    //  property to a valid value, throw an exception.
    throw new InvalidDataException("Invalid parameter value.");
  }

  // Next code block goes here.

  return retVal;
}

After declaring constants that the code needs to navigate the relationships between the document parts in a Word document, as well as constants defining the namespaces the code needs when searching for nodes in the XML content, the code declares a few variables it uses throughout the procedure, including the propertyTypeName variable. Assuming that most properties you set are strings, this variable is initialized to the value vt:lpwstr.

The code uses the property type you specified to set the propertyTypeName variable to one of the values Word can accept (vt:lpwstr for strings, vt:filetime for date/time values, vt:i4 for integer values, vt:r8 for double-precision values, and vt:bool for Yes/No values) and converts the property value into a string for insertion into the XML. If the value you passed does not match the type you specified, the propertyValueString variable remains a null reference, and the code throws an InvalidDataException exception.
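The type mapping described above can be condensed into a small helper. The following is a minimal sketch, not the snippet's code: the CustomPropertyFormat class and its Format method are hypothetical names, and the enumeration is repeated only so that the block stands alone. It also makes one assumption the snippet does not: it passes CultureInfo.InvariantCulture when converting numbers, so that the decimal separator written to the XML does not depend on the current culture.

```csharp
using System;
using System.Globalization;
using System.IO;

// Repeated from the snippet so that this sketch stands alone.
public enum PropertyTypes { YesNo, Text, DateTime, NumberInteger, NumberDouble }

// Hypothetical helper; not part of the published snippet.
public static class CustomPropertyFormat
{
    // Returns the string to store in the XML and reports the element
    // name through typeName. Throws InvalidDataException when the value
    // does not match the requested property type.
    public static string Format(object value, PropertyTypes type,
        out string typeName)
    {
        string text = null;
        switch (type)
        {
            case PropertyTypes.DateTime:
                typeName = "vt:filetime";
                // The value is assumed to represent a UTC date/time.
                if (value is DateTime)
                    text = ((DateTime)value).ToString(
                        "s", CultureInfo.InvariantCulture) + "Z";
                break;
            case PropertyTypes.NumberInteger:
                typeName = "vt:i4";
                if (value is int)
                    text = ((int)value).ToString(
                        CultureInfo.InvariantCulture);
                break;
            case PropertyTypes.NumberDouble:
                typeName = "vt:r8";
                if (value is double)
                    text = ((double)value).ToString(
                        CultureInfo.InvariantCulture);
                break;
            case PropertyTypes.YesNo:
                typeName = "vt:bool";
                // The XML requires lowercase true/false.
                if (value is bool)
                    text = (bool)value ? "true" : "false";
                break;
            default:
                // PropertyTypes.Text: any value is stored as a string.
                typeName = "vt:lpwstr";
                text = Convert.ToString(
                    value, CultureInfo.InvariantCulture);
                break;
        }
        if (text == null)
            throw new InvalidDataException("Invalid parameter value.");
        return text;
    }
}
```

Because the type checks use the is operator rather than GetType, passing a null reference for a non-text property leads to the InvalidDataException exception instead of a NullReferenceException exception.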

Next, the code includes this block:

Using wdPackage As Package = Package.Open(docName, _
  FileMode.Open, FileAccess.ReadWrite)

  ' Get the main document part (document.xml).
  For Each relationship As PackageRelationship In _
   wdPackage.GetRelationshipsByType(documentRelationshipType)

    Dim documentUri As Uri = _
     PackUriHelper.ResolvePartUri(New Uri("/", UriKind.Relative), _
     relationship.TargetUri)

    documentPart = wdPackage.GetPart(documentUri)
    ' There is only one document.
    Exit For
  Next

  ' Work with the custom properties part.
  Dim customPropsPart As PackagePart = Nothing

  ' Get the custom part (custom.xml). 
  ' It may not exist.
  For Each relationship As _
   PackageRelationship In wdPackage.GetRelationshipsByType( _
   customPropertiesRelationshipType)
    Dim documentUri As Uri = PackUriHelper.ResolvePartUri( _
     New Uri("/", UriKind.Relative), relationship.TargetUri)

    customPropsPart = _
     wdPackage.GetPart(documentUri)
    ' There is only one custom properties part, 
    ' if it exists at all.
    Exit For
  Next

  ' Manage namespaces to perform Xml 
  ' XPath queries.
  Dim nt As New NameTable()
  Dim nsManager As New XmlNamespaceManager(nt)
  nsManager.AddNamespace("d", customPropertiesSchema)
  nsManager.AddNamespace("vt", customVTypesSchema)

  Dim customPropsUri As New Uri("/docProps/custom.xml", _
   UriKind.Relative)
  Dim customPropsDoc As XmlDocument = Nothing
  Dim rootNode As XmlNode = Nothing

  ' Next code block goes here.

End Using

using (Package wdPackage = Package.Open(
  docName, FileMode.Open, FileAccess.ReadWrite))
{
  //  Get the main document part (document.xml).
  foreach (PackageRelationship relationship in
    wdPackage.GetRelationshipsByType(documentRelationshipType))
  {
    Uri documentUri = PackUriHelper.ResolvePartUri(
      new Uri("/", UriKind.Relative), relationship.TargetUri);
    documentPart = wdPackage.GetPart(documentUri);
    //  There is only one document.
    break;
  }

  //  Work with the custom properties part.
  PackagePart customPropsPart = null;

  //  Get the custom part (custom.xml). It may not exist.
  foreach (PackageRelationship relationship in
    wdPackage.GetRelationshipsByType(
    customPropertiesRelationshipType))
  {
    Uri documentUri = PackUriHelper.ResolvePartUri(
      new Uri("/", UriKind.Relative), relationship.TargetUri);
    customPropsPart = wdPackage.GetPart(documentUri);
    //  There is only one custom properties part, 
    // if it exists at all.
    break;
  }

  //  Manage namespaces to perform Xml XPath queries.
  NameTable nt = new NameTable();
  XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
  nsManager.AddNamespace("d", customPropertiesSchema);
  nsManager.AddNamespace("vt", customVTypesSchema);

  Uri customPropsUri = 
    new Uri("/docProps/custom.xml", UriKind.Relative);
  XmlDocument customPropsDoc = null;
  XmlNode rootNode = null;

  // Next code block goes here.

}

The code block finds the document part by calling the Package.GetRelationshipsByType method, passing in the constant that contains the document relationship type (see Figure 3). The code then loops through the returned relationships and resolves the document URI relative to the root of the package. You must loop through the PackageRelationship objects to retrieve the one you want; because a package contains only one main document part, the loop exits after its first iteration. Although it is useful to see how to retrieve a reference to the main document part, that code is vestigial in this particular snippet: the snippet does not use the reference to do its work.

Next, the code uses the same technique to attempt to retrieve a reference to the custom properties part, which may not exist. If it does not, the customPropsPart reference is still null after the loop. Finally, the block sets up an XmlNamespaceManager instance, adding the namespace information for the XML content it queries later, and sets up a URI for the custom properties part.
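Both lookups follow the same pattern, so they could be factored into one method. The following is a minimal sketch under that assumption; the PackageLookup class and FindPartByRelationshipType method are hypothetical names that do not appear in the snippet. Unlike the snippet, the sketch skips external relationships and checks that the target part exists before retrieving it.

```csharp
using System;
using System.IO.Packaging;

// Hypothetical helper; not part of the published snippet.
public static class PackageLookup
{
    // Returns the first part targeted by a package-level relationship
    // of the given type, or null when no such part exists.
    public static PackagePart FindPartByRelationshipType(
        Package package, string relationshipType)
    {
        foreach (PackageRelationship relationship in
            package.GetRelationshipsByType(relationshipType))
        {
            // External targets are not parts inside the package.
            if (relationship.TargetMode != TargetMode.Internal)
                continue;

            // Resolve the target relative to the package root.
            Uri partUri = PackUriHelper.ResolvePartUri(
                new Uri("/", UriKind.Relative), relationship.TargetUri);

            if (package.PartExists(partUri))
                return package.GetPart(partUri);
        }
        return null;
    }
}
```

With a helper like this, retrieving the custom properties part becomes a single call that passes wdPackage and customPropertiesRelationshipType, and a null return value signals that the part must be created.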

The next block deals with creating the custom properties part, if it does not already exist, or loading its contents into an XmlDocument instance, if it does:

If customPropsPart Is Nothing Then
  customPropsDoc = New XmlDocument(nt)

  ' Part doesn't exist. Create it now.
  customPropsPart = wdPackage.CreatePart( _
    customPropsUri, "application/vnd.openxmlformats-officedocument.custom-properties+xml")

  ' Set up the rudimentary custom part.
  rootNode = customPropsDoc.CreateElement( _
   "Properties", customPropertiesSchema)
  rootNode.Attributes.Append(customPropsDoc. _
   CreateAttribute("xmlns:vt"))
  rootNode.Attributes("xmlns:vt").Value = customVTypesSchema

  customPropsDoc.AppendChild(rootNode)

  ' Create the document's relationship to
  ' the new custom properties part:
  wdPackage.CreateRelationship(customPropsUri, _
   TargetMode.Internal, customPropertiesRelationshipType)
Else
  ' Load the contents of the custom 
  ' properties part into an XML document.
  customPropsDoc = New XmlDocument(nt)
  customPropsDoc.Load(customPropsPart.GetStream())

  rootNode = customPropsDoc.DocumentElement
End If

' Next block goes here.

' Save the properties XML back to its part.
customPropsDoc.Save(customPropsPart. _
 GetStream(FileMode.Create, FileAccess.Write))

if (customPropsPart == null)
{
  customPropsDoc = new XmlDocument(nt);

  //  The part does not exist. Create it now.
  customPropsPart = wdPackage.CreatePart(
    customPropsUri, "application/vnd.openxmlformats-officedocument.custom-properties+xml");

  //  Set up the rudimentary custom part.
  rootNode = customPropsDoc.
    CreateElement("Properties", customPropertiesSchema);
  rootNode.Attributes.Append(
    customPropsDoc.CreateAttribute("xmlns:vt"));
  rootNode.Attributes["xmlns:vt"].Value = customVTypesSchema;

  customPropsDoc.AppendChild(rootNode);

  //  Create the document's relationship to the 
  //  new custom properties part.
  wdPackage.CreateRelationship(customPropsUri, 
    TargetMode.Internal, customPropertiesRelationshipType);
}
else
{
  //  Load the contents of the custom properties part 
  //  into an XML document.
  customPropsDoc = new XmlDocument(nt);
  customPropsDoc.Load(customPropsPart.GetStream());
  rootNode = customPropsDoc.DocumentElement;
}

// Next code block goes here.

//  Save the properties XML back to its part.
customPropsDoc.Save(customPropsPart.
  GetStream(FileMode.Create, FileAccess.Write));

If the code determines that the custom properties part does not exist, it starts by creating an XmlDocument instance, using the name table the code created earlier. It calls the Package.CreatePart method to create the part, using the appropriate content type (application/vnd.openxmlformats-officedocument.custom-properties+xml). The code sets up the minimum XML content for the part (see Figure 4 for the details) and, finally, creates the package-level relationship to the new part (see Figure 3 to see the relationship it creates). If the part already exists, the code simply loads its content into the XmlDocument instance and sets the rootNode variable to refer to the document element of that content. The block ends by writing the custom properties XML back to the part's stream, which saves the changes that the later code blocks make.
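To see the minimum XML content in isolation, without a package, consider the following sketch. It is an illustration only: the CustomPropertiesXml class and CreateEmpty method are hypothetical names, and the sketch uses XmlElement.SetAttribute where the snippet appends an attribute created with CreateAttribute.

```csharp
using System.Xml;

// Hypothetical helper; not part of the published snippet.
public static class CustomPropertiesXml
{
    public const string PropertiesNs =
        "http://schemas.openxmlformats.org/" +
        "officeDocument/2006/custom-properties";
    public const string VTypesNs =
        "http://schemas.openxmlformats.org/" +
        "officeDocument/2006/docPropsVTypes";

    // Builds a document that contains only the root Properties element
    // and the declaration of the vt namespace prefix.
    public static XmlDocument CreateEmpty()
    {
        XmlDocument doc = new XmlDocument();

        // The root element lives in the custom properties namespace.
        XmlElement root = doc.CreateElement("Properties", PropertiesNs);

        // Declare the vt prefix that the value elements use later.
        root.SetAttribute("xmlns:vt", VTypesNs);

        doc.AppendChild(root);
        return doc;
    }
}
```

Calling CustomPropertiesXml.CreateEmpty().OuterXml is a quick way to inspect the result before any property elements are added.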

The next block starts working with the content of the custom properties part:

Dim searchString As String = String.Format( _
 "d:Properties/d:property[@name='{0}']", propertyName)
Dim node As XmlNode = customPropsDoc.SelectSingleNode( _
 searchString, nsManager)

Dim valueNode As XmlNode = Nothing

If node IsNot Nothing Then
  ' You found the node. Now check its type:
  If node.HasChildNodes Then
    valueNode = node.ChildNodes(0)
    If valueNode IsNot Nothing Then
      Dim typeName As String = valueNode.Name
      If propertyTypeName = typeName Then
        ' The types are the same. Simply 
        ' replace the value of the node:
        valueNode.InnerText = propertyValueString
        ' If the property existed, and 
        ' its type hasn't changed, you're done:
        retVal = True
      Else
        ' Types are different. Delete the node, and clear 
        ' the node variable:
        node.ParentNode.RemoveChild(node)
        node = Nothing
      End If
    End If
  End If
End If

' Next block goes here.

string searchString =
  string.Format("d:Properties/d:property[@name='{0}']",
  propertyName);
XmlNode node = customPropsDoc.SelectSingleNode(
  searchString, nsManager);

XmlNode valueNode = null;

if (node != null)
{
  //  You found the node. Now check its type.
  if (node.HasChildNodes)
  {
    valueNode = node.ChildNodes[0];
    if (valueNode != null)
    {
      string typeName = valueNode.Name;
      if (propertyTypeName == typeName)
      {
        //  The types are the same. 
        //  Replace the value of the node.
        valueNode.InnerText = propertyValueString;
        //  If the property existed, and its type
        //  has not changed, you are finished.
        retVal = true;
      }
      else
      {
        //  Types are different. Delete the node
        //  and clear the node variable.
        node.ParentNode.RemoveChild(node);
        node = null;
      }
    }
  }
}

// Next code block goes here.

This block starts by searching the custom property part’s XML content for the property you requested it to work with, and if it finds a match, checks the type of the property. If the types are the same, the code replaces the value. If the types are different, the code deletes the node, and sets the variable referring to the node to a null reference, so the next code block can create the node:

If node Is Nothing Then
  Dim pidValue As String = "2"

  Dim propertiesNode As XmlNode = customPropsDoc.DocumentElement
  If propertiesNode.HasChildNodes Then
    Dim lastNode As XmlNode = propertiesNode.LastChild
    If lastNode IsNot Nothing Then
      Dim pidAttr As XmlAttribute = lastNode.Attributes("pid")
      If Not pidAttr Is Nothing Then
        pidValue = pidAttr.Value
        Dim value As Integer
        If Integer.TryParse(pidValue, value) Then
          pidValue = Convert.ToString(value + 1)
        End If
      End If
    End If
  End If

  ' Next block goes here.

End If

if (node == null)
{
  string pidValue = "2";

  XmlNode propertiesNode = customPropsDoc.DocumentElement;
  if (propertiesNode.HasChildNodes)
  {
    XmlNode lastNode = propertiesNode.LastChild;
    if (lastNode != null)
    {
      XmlAttribute pidAttr = lastNode.Attributes["pid"];
      if (pidAttr != null)
      {
        pidValue = pidAttr.Value;
        //  Increment pidValue, so that the new property
        //  gets a pid value one higher. This value should be 
        //  numeric, but it never hurts to confirm.
        int value = 0;
        if (int.TryParse(pidValue, out value))
        {
          pidValue = Convert.ToString(value + 1);
        }
      }
    }
  }

  // Next code block goes here.

}

This block of code executes only if the property node either does not exist or was deleted because its type was incorrect. Each property in the custom properties part has an assigned pid value, and the lowest-numbered value is 2. The block reads the pid value of the last property node, which it assumes to be the highest, and increments it to create the pid value for the new node. If the part contains no properties yet, the new node gets the value 2.
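The lookup and the pid calculation from the last two blocks can be tried against any XmlDocument instance. The following is a minimal sketch with two deliberate variations, both of them assumptions rather than the snippet's behavior: it compares the name attribute in code instead of formatting the name into the XPath expression, so a property name that contains an apostrophe does not break the query, and it scans every property for the highest pid value instead of relying on the last node. The CustomPropertyQuery class and its methods are hypothetical names.

```csharp
using System.Globalization;
using System.Xml;

// Hypothetical helper; not part of the published snippet.
public static class CustomPropertyQuery
{
    const string PropertiesNs =
        "http://schemas.openxmlformats.org/" +
        "officeDocument/2006/custom-properties";

    // Selects every property element under the root Properties element.
    static XmlNodeList SelectProperties(XmlDocument doc)
    {
        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("d", PropertiesNs);
        return doc.SelectNodes("/d:Properties/d:property", ns);
    }

    // Returns the property element with the given name, or null.
    public static XmlNode FindProperty(XmlDocument doc, string name)
    {
        foreach (XmlNode node in SelectProperties(doc))
        {
            XmlAttribute attr = node.Attributes["name"];
            if (attr != null && attr.Value == name)
                return node;
        }
        return null;
    }

    // Returns one more than the highest pid in use, and never less than 2.
    public static int NextPid(XmlDocument doc)
    {
        int highest = 1;
        foreach (XmlNode node in SelectProperties(doc))
        {
            XmlAttribute attr = node.Attributes["pid"];
            int pid;
            if (attr != null &&
                int.TryParse(attr.Value, NumberStyles.Integer,
                    CultureInfo.InvariantCulture, out pid) &&
                pid > highest)
            {
                highest = pid;
            }
        }
        return highest + 1;
    }
}
```

The name of the first child of the node that FindProperty returns (for example, vt:bool) is what the snippet compares against propertyTypeName to decide whether to update the value or replace the node.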

The final block of code, which goes inside the preceding conditional block, creates the property node:

node = customPropsDoc. _
 CreateElement("property", customPropertiesSchema)
node.Attributes.Append(customPropsDoc.CreateAttribute("name"))
node.Attributes("name").Value = propertyName

node.Attributes.Append( _
 customPropsDoc.CreateAttribute("fmtid"))
node.Attributes("fmtid").Value = _
 "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}"

node.Attributes.Append( _
 customPropsDoc.CreateAttribute("pid"))
node.Attributes("pid").Value = pidValue

valueNode = customPropsDoc.CreateElement( _
 propertyTypeName, customVTypesSchema)
valueNode.InnerText = propertyValueString
node.AppendChild(valueNode)
rootNode.AppendChild(node)
retVal = True

node = customPropsDoc.
  CreateElement("property", customPropertiesSchema);
node.Attributes.Append(customPropsDoc.CreateAttribute("name"));
node.Attributes["name"].Value = propertyName;

node.Attributes.Append(customPropsDoc.CreateAttribute("fmtid"));
node.Attributes["fmtid"].Value = 
  "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}";

node.Attributes.Append(customPropsDoc.CreateAttribute("pid"));
node.Attributes["pid"].Value = pidValue;

valueNode = customPropsDoc.
  CreateElement(propertyTypeName, customVTypesSchema);
valueNode.InnerText = propertyValueString;
node.AppendChild(valueNode);
rootNode.AppendChild(node);
retVal = true;

This code creates the various bits and pieces required to create a property node in the custom properties part (Figure 4 shows how the finished XML content should look). After it completes, this block sets the return value to True.
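The same construction can be exercised on its own, outside the package code. The following is a sketch rather than the snippet itself: the CustomPropertyBuilder class and AppendProperty method are hypothetical names, the sketch assumes that the document already contains the root Properties element, and it uses XmlElement.SetAttribute in place of the snippet's CreateAttribute calls.

```csharp
using System.Globalization;
using System.Xml;

// Hypothetical helper; not part of the published snippet.
public static class CustomPropertyBuilder
{
    const string PropertiesNs =
        "http://schemas.openxmlformats.org/" +
        "officeDocument/2006/custom-properties";
    const string VTypesNs =
        "http://schemas.openxmlformats.org/" +
        "officeDocument/2006/docPropsVTypes";

    // Appends a property element to the root of doc and returns it.
    // typeName is a prefixed element name such as "vt:bool".
    public static XmlElement AppendProperty(XmlDocument doc,
        string name, int pid, string typeName, string value)
    {
        XmlElement property =
            doc.CreateElement("property", PropertiesNs);
        property.SetAttribute("fmtid",
            "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}");
        property.SetAttribute("pid",
            pid.ToString(CultureInfo.InvariantCulture));
        property.SetAttribute("name", name);

        // The value element carries both the type and the value.
        XmlElement valueElement =
            doc.CreateElement(typeName, VTypesNs);
        valueElement.InnerText = value;

        property.AppendChild(valueElement);
        doc.DocumentElement.AppendChild(property);
        return property;
    }
}
```

For a Yes/No property named Completed, a call such as AppendProperty(doc, "Completed", 2, "vt:bool", "false") adds a property element whose attributes are fmtid, pid, and name and whose only child is a vt:bool element that contains the text false.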

Read It

To work with the custom properties directly, you must understand the file structure of a simple Word 2007 document. To do that, create a Word 2007 document:

  1. With your new document loaded, select the Developer tab on the Ribbon. (If you do not see the Developer tab, click the Office button, and then click Word Options. Select the Show Developer Tab in the Ribbon option to display the tab.)

  2. Select Document Panel, and in the Document Information Panel dialog box, click OK. This action displays a set of standard document properties at the top of your document.

  3. Click the Document Properties drop-down arrow, and select Advanced Properties, as shown in Figure 1.

    Figure 1. Select Advanced Properties, to set a custom document property.

  4. In the Document Properties dialog box, click the Custom tab, select one of the suggested custom properties (or add your own), select a data type, enter a value, and click Add. Figure 2 shows the dialog box before you click Add. Note that the only data types available to you are Text, Date, Number, and Yes or no.

  5. Add several properties of different types, if you like. Click OK when you finish to dismiss the dialog box.

    Figure 2. Add a custom property.

  6. Save the document in a convenient location, and quit Microsoft Word. (For the purposes of this discussion, I named my document C:\Demo.docx.)

To investigate the contents of the document, you can follow these steps:

  1. In Windows Explorer, rename the document Demo.docx.zip.

  2. Open the ZIP file, using either Windows Explorer or a ZIP-management application.

  3. View the _rels\.rels file, shown in Figure 3. This document contains information about the relationships between the parts in the document. Note the value for the custom.xml part, as highlighted in the figure—this information allows you to find the specific part you need.

    Figure 3. The .rels file contains references to each of the top-level document parts.

  4. Open docProps\custom.xml, as shown in Figure 4. The highlighted element in the figure contains the name, type, and value for the custom property. The snippet you investigate creates this element if it does not exist, modifies its value if the property already exists with a matching type, or changes both the value and the type if the property exists but its type does not match the new settings.

    Figure 4. The custom.xml part contains the custom properties.

  5. Close the tool you used to investigate the document, and rename the file with a .docx extension.

Explore It