Data Spaces Structure for Rights-Managed Content

08/15/2017

Rights-managed content can be stored in a structured storage compound file along with a description of the data transform (Rights Management) process. The process is described in a data spaces storage, a standard format used by various software components to interact with the transformed content. The Rights Management Add-on for Internet Explorer is one example of client software that uses a data spaces storage to display rights-managed content. This topic describes the data spaces storage format that contains the definition of the process used to transform content.

This topic contains the following sections.

Conventions
Data Spaces
Version Stream
Data Space Map Stream
Data Space Definition Storage
Transform Definition Storage
- Primary Stream
- Transform-Specific Data

Conventions

The following conventions are used in describing the data spaces storage.

All length-prefixed strings described in the following sections are padded to the next 4-byte boundary. For example, a wide-character string with three characters is six bytes long and must be followed by two bytes of padding. Where padding is required, the sample code calls a function named PaddingLength. The following example shows the implementation of PaddingLength used in the sample that accompanies this software development kit (SDK).

// This helper function is used for correctly aligning
// bytes to the correct boundaries.
long PaddingLength(long Length, int padsize)
{
        long padBytes = (Length % padsize);
        if (0 != padBytes)
        {
            padBytes = padsize - padBytes;
        }
        return padBytes;
}
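
The padding rule can also be sketched without the Windows headers. The following is a minimal illustration, not part of the SDK sample: the names paddingLength, appendUint32, and appendPaddedString are hypothetical, and the little-endian layout of the length prefix is an assumption based on the sample writing native integers on Windows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Number of zero bytes needed to reach the next multiple of padSize.
std::size_t paddingLength(std::size_t length, std::size_t padSize) {
    assert(padSize > 0);
    std::size_t remainder = length % padSize;
    return remainder == 0 ? 0 : padSize - remainder;
}

// Appends a 32-bit value in little-endian byte order (assumed layout).
void appendUint32(std::vector<std::uint8_t>& out, std::uint32_t value) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
}

// Appends a length-prefixed UTF-16LE string followed by zero padding.
// The length prefix counts the string bytes only, not the padding.
void appendPaddedString(std::vector<std::uint8_t>& out, const std::u16string& text) {
    std::size_t byteLength = text.size() * 2;
    appendUint32(out, static_cast<std::uint32_t>(byteLength));
    for (char16_t unit : text) {
        out.push_back(static_cast<std::uint8_t>(unit & 0xFF));
        out.push_back(static_cast<std::uint8_t>(unit >> 8));
    }
    out.insert(out.end(), paddingLength(byteLength, 4), std::uint8_t{0});
}
```

Calling appendPaddedString with a three-character string produces a 4-byte prefix, six bytes of text, and two bytes of padding, which matches the example in the paragraph above.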

Several storage and stream names include the string "\0x06" or "\0x09". These strings do not appear literally in the name. Instead, each represents the single ASCII character with the hexadecimal value 0x06 or 0x09.
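
As an illustration of this convention, the sketch below builds such a name as a UTF-16 string whose first character is the control character itself. The helper controlPrefixedName is hypothetical and is not taken from the SDK sample.

```cpp
#include <cassert>
#include <string>

// Builds a name whose first character is the given control character
// (for example 0x06), followed by the readable part of the name.
std::u16string controlPrefixedName(char16_t control, const std::u16string& rest) {
    assert(control < 0x20);  // only ASCII control characters are expected here
    return std::u16string(1, control) + rest;
}
```

For example, controlPrefixedName(0x06, u"DataSpaces") yields an 11-character name. Building the name this way sidesteps a pitfall of C++ string literals: in a literal such as u"\x06DataSpaces", the hexadecimal escape also consumes the following "D" and "a" because they are valid hexadecimal digits.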

Data Spaces

The transform process is defined in terms of data spaces. Each data space consists of one or more transforms that act on content in a content stream. The data spaces storage can describe any number of data spaces and associate them with content streams. The following diagram shows how data spaces are applied to content streams to transform them one or more times.

Because transforms and data space definitions are stored inside the compound file along with the transformed content, client software has all of the information necessary to read, write, or manipulate the content. A standard structure of streams and storages allows various software components to interact with the data in a consistent manner. The following diagram shows a representation of this standard structure, called the "data spaces storage."

The data spaces storage must be titled \0x06DataSpaces. It contains exactly two streams: one that describes the version of the data space format (the "version stream") and another that maps content streams to data space definitions (the "data space map stream"). The data spaces storage also contains exactly two substorages: the "data space definition storage," which holds the definitions of data spaces, and the "transform definition storage," which holds the definitions of transforms. The following table describes these four components.

Component	Name	Description
Version Stream	`Version`	Indicates the version of the data spaces storage format so that various software components can interact with the data it contains.
Data Space Map Stream	`DataSpaceMap`	Associates transformed content with data spaces.
Data Space Definition Storage	`DataSpaceInfo`	Contains definitions for data spaces that describe how content is transformed.
Transform Definition Storage	`TransformInfo`	Contains definitions for the transforms that make up the data spaces.

Version Stream

The data spaces storage contains data in a format that is accessible to any number of software components that can read, write, or update that data. A version stream indicates the version of this format to enable these software components to successfully interact with data in the data spaces storage. The version stream is always named Version. The version stream maintains the format version for three types of software components: readers, updaters, and writers.

Readers

A reader is any software component that extracts data from the data spaces storage, either to perform some operation on this data, or to display it to the user. A reader does not alter the data or the structure of the data spaces storage in any way. The Rights Management Add-on is an example of a reader. The reader version specifies the version of the data spaces format that a software component must be aware of to safely read the data.

A reader that interacts with any feature containing a version structure should comply with the following rules.

Must read feature data when the reader version is equal to the highest feature version understood by the software component.
Should read feature data when the reader version is less than the highest feature version understood by the software component. Any feature data that is specific to feature versions greater than the reader version should be ignored even if the reader understands the feature version for which that data applies.
Must not read feature data when the reader version is greater than the highest version understood by the software component.
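
The three reader rules amount to a comparison between the reader version stored in the file and the highest feature version the component understands. A minimal sketch follows; the types FormatVersion and ReadDecision and the function decideRead are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// A major/minor format version, as stored in a version structure.
struct FormatVersion {
    std::uint16_t majorPart;
    std::uint16_t minorPart;
};

enum class ReadDecision { MustRead, ShouldRead, MustNotRead };

// Applies the reader rules. readerVersion comes from the file;
// highestUnderstood is the newest feature version this component implements.
ReadDecision decideRead(const FormatVersion& readerVersion,
                        const FormatVersion& highestUnderstood) {
    auto fromFile = std::tie(readerVersion.majorPart, readerVersion.minorPart);
    auto understood = std::tie(highestUnderstood.majorPart, highestUnderstood.minorPart);
    if (fromFile == understood) return ReadDecision::MustRead;
    if (fromFile < understood) return ReadDecision::ShouldRead;
    return ReadDecision::MustNotRead;
}
```

The sketch compares the major part first and the minor part second. In the ShouldRead case, the caller is still responsible for ignoring data that belongs to feature versions above the reader version.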

Updaters

An updater is any software component that updates data in the data spaces storage. An updater does not alter the format of the data, only the data itself. The updater version specifies the version of the data spaces format that a software component must be aware of to safely update the data.

An updater that interacts with any feature containing a version structure should comply with the following rules.

Must preserve the format of the feature data when the updater version is less than or equal to the highest version understood by the software component.
Must not change the data when the updater version is greater than the highest version understood by the software component.

Writers

A writer is any software component that creates a data spaces storage and populates it with data. Microsoft Office 2003 and the sample included in this SDK are both examples of writers. The writer version is the version of the format generated by the last writer to create or modify the data spaces storage.

A writer that interacts with any feature containing a version structure should comply with the following rules.

Must set the writer version to no greater than the current feature version. This should be the highest version understood by the software component, and the feature data should not include any components from a higher version of the feature.
Must set the updater version to the lowest feature version an updater can use while preserving the format of the data.
Must set the reader version to the lowest feature version a reader can use to safely read the feature data. This cannot be greater than the updater version.
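
A writer can check the two ordering constraints in these rules before it emits a version structure. The sketch below assumes versions compare by major part, then minor part; FormatVersion, notGreater, and writerVersionsAreValid are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

struct FormatVersion {
    std::uint16_t majorPart;
    std::uint16_t minorPart;
};

// True when a is less than or equal to b, comparing major then minor.
bool notGreater(const FormatVersion& a, const FormatVersion& b) {
    return std::tie(a.majorPart, a.minorPart) <= std::tie(b.majorPart, b.minorPart);
}

// Checks the constraints stated for writers:
//  - the writer version is no greater than the current feature version;
//  - the reader version is no greater than the updater version.
bool writerVersionsAreValid(const FormatVersion& reader,
                            const FormatVersion& updater,
                            const FormatVersion& writer,
                            const FormatVersion& currentFeatureVersion) {
    return notGreater(writer, currentFeatureVersion) && notGreater(reader, updater);
}
```

Choosing the lowest workable reader and updater versions remains a design decision for the writer; the check only rejects combinations that the rules rule out.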

Version Stream Contents

The version stream describes the version of the data spaces format. However, similar format version information is applied to specific components of the data spaces storage. A feature identifier specifies which component the version information applies to. For this version stream, the feature identifier is Microsoft.Container.DataSpaces.

The following diagram shows the contents of the version stream. The name of each field is listed along with the size, in bytes, of that field. Where no size is specified, the field size depends on the value of another field.

Currently, the only available version of the data spaces format is 1.0. To generate a version stream, set the major version of the reader, updater, and writer to 1 and each minor version to 0, and use Microsoft.Container.DataSpaces as the feature identifier. The following sample code shows how to write the version stream.

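// The identifier is a string literal, so it is declared as a const pointer.
LPCWSTR  FeatureIdentifier = L"Microsoft.Container.DataSpaces";
// Length in bytes, not characters; the padding is written separately.
int      FeatureIdentifierLength = (int)wcslen(FeatureIdentifier)*sizeof(WCHAR);
int      FeatureIdentifierPad = 
                    PaddingLength(FeatureIdentifierLength,sizeof(DWORD));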
short    ReaderVersionMajor  = 1;
short    ReaderVersionMinor  = 0;
short    UpdaterVersionMajor = 1;
short    UpdaterVersionMinor = 0;
short    WriterVersionMajor  = 1;
short    WriterVersionMinor  = 0;

hResult = pDataSpaceStorage->CreateStream( StreamName_Version,  
                                           STGM_READWRITE | 
                                           STGM_SHARE_EXCLUSIVE,
                                           0,                     
                                           0,                     
                                           &pStream);               
CHECK_STG_ERROR("IStorage::CreateStream",hResult);
hResult = pStream->Write(&FeatureIdentifierLength, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(FeatureIdentifier, FeatureIdentifierLength, NULL);
if ((SUCCEEDED(hResult))&&(0<FeatureIdentifierPad)) 
    hResult = pStream->Write(PaddingBuffer, FeatureIdentifierPad, NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&ReaderVersionMajor,  sizeof(short), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&ReaderVersionMinor,  sizeof(short), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&UpdaterVersionMajor, sizeof(short), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&UpdaterVersionMinor, sizeof(short), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&WriterVersionMajor,  sizeof(short), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&WriterVersionMinor,  sizeof(short), NULL);
CHECK_STG_ERROR("IStream::Write",hResult);
pStream->Release();
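
A reader performs the inverse of this sample. The following portable sketch parses the bytes of a version stream that has already been read into memory. It assumes little-endian fields and a UTF-16 feature identifier, as produced by the sample on Windows; VersionInfo, readLittleEndian, and parseVersionStream are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

struct VersionInfo {
    std::u16string featureIdentifier;
    std::uint16_t readerMajor, readerMinor;
    std::uint16_t updaterMajor, updaterMinor;
    std::uint16_t writerMajor, writerMinor;
};

// Reads a little-endian integer of the given width and advances pos.
static std::uint32_t readLittleEndian(const std::vector<std::uint8_t>& data,
                                      std::size_t& pos, std::size_t width) {
    if (pos > data.size() || width > data.size() - pos) {
        throw std::runtime_error("version stream is truncated");
    }
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < width; ++i) {
        value |= static_cast<std::uint32_t>(data[pos + i]) << (8 * i);
    }
    pos += width;
    return value;
}

VersionInfo parseVersionStream(const std::vector<std::uint8_t>& data) {
    std::size_t pos = 0;
    VersionInfo info;
    std::uint32_t idBytes = readLittleEndian(data, pos, 4);
    if (idBytes % 2 != 0) {
        throw std::runtime_error("feature identifier length is not even");
    }
    for (std::uint32_t i = 0; i < idBytes / 2; ++i) {
        info.featureIdentifier.push_back(
            static_cast<char16_t>(readLittleEndian(data, pos, 2)));
    }
    pos += (4 - idBytes % 4) % 4;  // skip padding to the 4-byte boundary
    info.readerMajor  = static_cast<std::uint16_t>(readLittleEndian(data, pos, 2));
    info.readerMinor  = static_cast<std::uint16_t>(readLittleEndian(data, pos, 2));
    info.updaterMajor = static_cast<std::uint16_t>(readLittleEndian(data, pos, 2));
    info.updaterMinor = static_cast<std::uint16_t>(readLittleEndian(data, pos, 2));
    info.writerMajor  = static_cast<std::uint16_t>(readLittleEndian(data, pos, 2));
    info.writerMinor  = static_cast<std::uint16_t>(readLittleEndian(data, pos, 2));
    return info;
}
```

The parser throws on truncated input rather than reading past the end of the buffer, so a caller can treat any exception as a malformed stream.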

Data Space Map Stream

The data space map associates transformed content with the data space that describes the series of transforms applied to that data. By using a map to associate data spaces with content, a single data space can be used to transform more than one content stream. It is not valid, however, to have more than one data space apply to the same content stream. The data space map stream is always named DataSpaceMap. Two parts make up the data space map: the data space map header and a list of map entries.

The first part is a header that contains two values: the length of the header in bytes, followed by the number of entries in the data space map. The following image shows the contents of the data space map header.

Each entry in the data space map contains five fields: the length of the entry, the number of components that make up the content stream reference, the content stream reference, the length of the data space name, and finally the name of the data space used to transform that content. The location of a content stream is described in terms of a path, that is, a route from the root storage to the substorage containing the content stream. The format used to specify this path is described below. The data space name is the name of any of the streams in the data space definition storage. The data space definition storage is described in a later section. The following image shows the contents of a data space map entry.

Each storage in the path of the content stream is a reference component. The name of the stream itself is a reference component. Reference components are always listed from the most general (storages) to the most specific (streams). For example, a stream titled "Chapter 1" in a substorage called "Book" off the root storage of a compound file would have two reference components: "Book" and "Chapter 1" in that order. The simplest content stream reference is one with a single component indicating the name of a stream in the root storage of the compound file.

Each reference component begins with a reference component type of either 0 to indicate a stream or 1 to indicate a storage. Data spaces cannot be applied to a substream reference. The reference component type is followed by the reference component length and the name of the component in a wide-character string. The following image shows the contents of a reference component for streams and storages.
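
The ordering and typing of reference components can be sketched as follows. The function takes the path as a list of names from the root storage down to the stream and marks every element except the last as a storage. ReferenceComponent and makeContentStreamReference are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct ReferenceComponent {
    std::uint32_t type;   // 0 = stream, 1 = storage
    std::u16string name;
};

// Converts a path (storages first, stream last) into reference components.
std::vector<ReferenceComponent> makeContentStreamReference(
        const std::vector<std::u16string>& path) {
    assert(!path.empty());
    std::vector<ReferenceComponent> components;
    for (std::size_t i = 0; i < path.size(); ++i) {
        bool isStream = (i + 1 == path.size());
        components.push_back({isStream ? 0u : 1u, path[i]});
    }
    return components;
}
```

For the "Book" and "Chapter 1" example above, the result is a storage component named "Book" followed by a stream component named "Chapter 1".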

The following sample code shows how to write a data space map stream with a single entry.

HeaderLen =          8;
EntryCount =         1;
RefComponentCount =  1;
RefComponentType =   0;
RefComponentLen =    (int)wcslen(StreamName_DRMViewerContent)*sizeof(WCHAR);
RefComponentPad =    PaddingLength(RefComponentLen,sizeof(DWORD));
RefComponent =       StreamName_DRMViewerContent;
DataSpaceNameLen =   (int)wcslen(StreamName_DRMDataSpace)*sizeof(WCHAR);
DataSpaceNamePad =   PaddingLength(DataSpaceNameLen,sizeof(DWORD));
DataSpaceName =      StreamName_DRMDataSpace;
EntryLength =        sizeof(DWORD)*5+RefComponentLen+RefComponentPad+
                     DataSpaceNameLen+DataSpaceNamePad;
TotalLength =        sizeof(DWORD)*2+EntryLength;

hResult = pDataSpaceStorage->CreateStream( StreamName_DataSpaceMap,    
                               STGM_READWRITE | STGM_SHARE_EXCLUSIVE, 
                               0,
                               0,
                               &pStream);
CHECK_STG_ERROR("IStorage::CreateStream",hResult);
hResult = pStream->Write(&HeaderLen, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&EntryCount, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&EntryLength, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&RefComponentCount, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&RefComponentType, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&RefComponentLen, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(RefComponent, RefComponentLen, NULL);
if ((SUCCEEDED(hResult))&&(0<RefComponentPad)) 
    hResult = pStream->Write(PaddingBuffer, RefComponentPad, NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&DataSpaceNameLen, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(DataSpaceName, DataSpaceNameLen, NULL);
if ((SUCCEEDED(hResult))&&(0<DataSpaceNamePad)) 
    hResult = pStream->Write(PaddingBuffer, DataSpaceNamePad, NULL);
CHECK_STG_ERROR("IStream::Write",hResult);
pStream->Release();
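
The sample computes EntryLength for exactly one reference component. The sketch below generalizes the same arithmetic to any number of components, assuming each additional component contributes its own type field, length field, and padded name. The names paddedByteSize and mapEntryLength are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Size in bytes of a UTF-16 string once padded to a 4-byte boundary.
std::size_t paddedByteSize(const std::u16string& text) {
    std::size_t bytes = text.size() * 2;
    return bytes + (4 - bytes % 4) % 4;
}

// Length in bytes of one data space map entry, including the length field itself.
std::size_t mapEntryLength(const std::vector<std::u16string>& componentNames,
                           const std::u16string& dataSpaceName) {
    assert(!componentNames.empty());
    std::size_t length = 4 + 4;  // entry length field + component count field
    for (const std::u16string& name : componentNames) {
        length += 4 + 4 + paddedByteSize(name);  // type + name length + padded name
    }
    length += 4 + paddedByteSize(dataSpaceName);  // name length + padded name
    return length;
}
```

With a single component, the fixed part is five 4-byte fields, which agrees with the sizeof(DWORD)*5 term in the sample.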

Data Space Definition Storage

Each data space describes a series of transforms that are applied to data in a content stream. You can have more than one data space if, for example, one content stream is both compressed and encrypted while a second stream is merely encrypted. Every data space is defined in its own stream in the data space definition storage. This storage is called DataSpaceInfo. Each stream can have any valid name as long as it corresponds to an entry in the data space map stream.

A data space definition stream contains a header with two fields: the header length and the number of transforms in the data space. Following the header are zero or more transform references. Each transform reference contains a length-prefixed transform name. Transform names correspond to substorage names in the transform definition storage. The following image shows the data that defines a data space.

If you are creating a rights-managed HTML file that contains only a single content stream, you only need to define a single data space. If your application also stores other transformed content in the same compound file, you might need to define more than one data space. The following sample shows how to create a data space definition stream for a data space with a single encryption transform.

HeaderLen            = sizeof(int)*2;
int TransformCount   = 1;
int TransformNameLen = (int)wcslen(StorageName_DRMTransform)*sizeof(WCHAR);
int TransformNamePad = PaddingLength(TransformNameLen, sizeof(DWORD));
LPWSTR TransformName = StorageName_DRMTransform;

hResult = pDataSpaceInfoStorage->CreateStream( StreamName_DRMDataSpace,    
                               STGM_READWRITE | STGM_SHARE_EXCLUSIVE, 
                               0,                     
                               0,                     
                               &pStream);             
CHECK_STG_ERROR("IStorage::CreateStream",hResult);
hResult = pStream->Write(&HeaderLen, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&TransformCount,   sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&TransformNameLen, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(TransformName, TransformNameLen, NULL);
if ((SUCCEEDED(hResult))&&(0<TransformNamePad)) 
    hResult = pStream->Write(PaddingBuffer, TransformNamePad, NULL);
CHECK_STG_ERROR("IStream::Write",hResult);
pStream->Release();
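
A data space with more than one transform repeats the length-prefixed name for each transform reference. The following portable sketch builds the stream contents in memory for any number of transform names. It assumes little-endian fields and UTF-16 names; appendUint32 and buildDataSpaceDefinition are hypothetical names, and the order in which transforms should be listed is not addressed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

static void appendUint32(std::vector<std::uint8_t>& out, std::uint32_t value) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
}

// Builds the bytes of a data space definition stream.
std::vector<std::uint8_t> buildDataSpaceDefinition(
        const std::vector<std::u16string>& transformNames) {
    std::vector<std::uint8_t> out;
    appendUint32(out, 8);  // header length: two 4-byte fields
    appendUint32(out, static_cast<std::uint32_t>(transformNames.size()));
    for (const std::u16string& name : transformNames) {
        std::size_t byteLength = name.size() * 2;
        appendUint32(out, static_cast<std::uint32_t>(byteLength));
        for (char16_t unit : name) {
            out.push_back(static_cast<std::uint8_t>(unit & 0xFF));
            out.push_back(static_cast<std::uint8_t>(unit >> 8));
        }
        // Pad each name to the next 4-byte boundary.
        out.insert(out.end(), (4 - byteLength % 4) % 4, std::uint8_t{0});
    }
    return out;
}
```

Each name passed in must match the name of a transform storage in the transform definition storage.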

Transform Definition Storage

The transform definition storage, named TransformInfo, is the final component of the data spaces storage. It contains zero or more definitions for the possible transforms that can be applied to the data in content streams. Every transform referenced from a data space must be defined in a substorage of the transform definition storage. These substorages are called transform storages. Transform storages can have any valid storage name. A transform definition identifies an algorithm used to transform data and any parameters needed by that algorithm. Transform storages do not contain actual implementations of transform algorithms, merely definitions. It is presumed that all software components that interact with the transformed content have access to an existing implementation of the transform.

The following image shows a representation of the transform definition storage.

While the transform definition storage supports arbitrarily defined transforms, only two transforms are part of the rights-managed HTML and restricted permission message formats. These transforms are shown in the following table along with the feature identifier name and GUID.

Feature identifier	GUID	Description
`Microsoft.Metadata.CompressionTransform`	`{86DE7F2B-DDCE-486d-B016-405BBE82B8BC}`	Compresses data using GZIP compression.
`Microsoft.Metadata.DRMTransform`	`{C73DFACD-061F-43b0-8B64-0C620D2A8B50}`	Encrypts and protects data using Windows Rights Management (RM) services.

Note: Content compressed with this compression transform must use a compression level of 9 and a window size of 32768 bytes.

The transform class and corresponding data are contained in a stream called \0x06Primary. Definition storages can also contain transform-specific streams or even substorages that hold additional data. The following sections describe the generic primary stream format and the specific format of an additional stream used by the encryption transform.

Primary Stream

The primary stream of every transform definition begins with a common header structure. The purpose of this header is to describe to the client application which transform is being defined. The remaining data in this stream is specific to the transform. Supported structures for both compression and encryption are discussed in detail in this section.

The common header consists of the header length, the transform type, the length of the transform class name, and the transform class name itself. The only supported value for transform type is 1. If other transform types are supported in the future, the structure of this header might change. The transform class name is a GUID that identifies the algorithm used to implement the transform. This does not imply that the class name corresponds to a registered Component Object Model (COM) object. It is merely a way of uniquely identifying each transform. The following image shows the structure of the transform header.
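
A client can use the common header to decide whether it recognizes a transform before reading the rest of the stream. The sketch below parses the header from bytes already in memory and rejects any transform type other than 1. It assumes little-endian fields and a UTF-16 class name; TransformHeader, readUint32, and parseTransformHeader are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

struct TransformHeader {
    std::uint32_t headerLength;
    std::uint32_t transformType;
    std::u16string className;  // GUID string that identifies the transform
};

static std::uint32_t readUint32(const std::vector<std::uint8_t>& data, std::size_t pos) {
    if (pos > data.size() || data.size() - pos < 4) {
        throw std::runtime_error("primary stream is truncated");
    }
    return static_cast<std::uint32_t>(data[pos]) |
           (static_cast<std::uint32_t>(data[pos + 1]) << 8) |
           (static_cast<std::uint32_t>(data[pos + 2]) << 16) |
           (static_cast<std::uint32_t>(data[pos + 3]) << 24);
}

TransformHeader parseTransformHeader(const std::vector<std::uint8_t>& data) {
    TransformHeader header;
    header.headerLength = readUint32(data, 0);
    header.transformType = readUint32(data, 4);
    if (header.transformType != 1) {
        throw std::runtime_error("unsupported transform type");
    }
    std::uint32_t nameBytes = readUint32(data, 8);
    if (nameBytes % 2 != 0 || nameBytes > data.size() - 12) {
        throw std::runtime_error("invalid class name length");
    }
    for (std::size_t i = 0; i < nameBytes; i += 2) {
        header.className.push_back(
            static_cast<char16_t>(data[12 + i] | (data[13 + i] << 8)));
    }
    return header;
}
```

The returned class name can then be compared against the GUID strings in the table above to select the compression or encryption handling.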

Although it is not a required component in all transform definitions, the compression and encryption transforms supported by this rights-managed content format both require a format versioning structure immediately after the transform header. The version structure is similar to the data space format versioning structure. It contains a feature identifier followed by the format version requirements for readers, updaters, and writers. The following image shows the structure of the transform format version information.

For the supported compression transform, the final field is a 4-byte transform instance header containing only its own length.

For encryption, the version structure is followed by a transform instance header and issuance license. The transform instance header is intended for future use. The current format contains a single field which is the length of the header itself: 4 bytes. Following the instance header is a length-prefixed issuance license in an array of single-byte characters. The issuance license is the signed issuance license obtained during content publishing using the RM SDK. The following image shows the structure of the transform instance header and the issuance license.

The following sample code shows how to create the primary stream for an encryption transform.

// Transform Header
LPCWSTR TransformClassName    = L"{C73DFACD-061F-43B0-8B64-0C620D2A8B50}";
int TransformClassNameLength  = (int)wcslen(TransformClassName)*sizeof(WCHAR);
int TransformType             = 1;
// Three 4-byte fields plus the class name. The GUID string is 76 bytes long,
// which is already a multiple of 4, so no padding is written after it.
int HeaderLen                 = sizeof(int)*3 + TransformClassNameLength;

// Transform Version
LPCWSTR TransformIdentifier   = L"Microsoft.Metadata.DRMTransform";
int TransformIdentifierLength = (int)wcslen(TransformIdentifier)*sizeof(WCHAR);
int TransformIdentifierPad    = 
       PaddingLength(TransformIdentifierLength, sizeof(DWORD));

// Instance Header
int InstanceHeaderLength = sizeof(int);

// Issuance License
LPWSTR wszSignedIL = GetSignedIL();
int IssuanceLicenseLength = (int)wcslen(wszSignedIL);
int IssuanceLicensePad = PaddingLength(IssuanceLicenseLength,sizeof(DWORD));
LPSTR IssuanceLicense = (CHAR *)HeapAlloc(GetProcessHeap(), 0, IssuanceLicenseLength+1);
if ( NULL == IssuanceLicense )
{
    printf("Error (%s): E_OUTOFMEMORY\n", "HeapAlloc");
    goto e_Exit;
}
sprintf_s(IssuanceLicense, IssuanceLicenseLength+1, "%S",wszSignedIL);

hResult = pDRMTransformStorage->CreateStream( StreamName_Primary,      
                               STGM_READWRITE | STGM_SHARE_EXCLUSIVE, 
                               0,                     
                               0,                     
                               &pStream);             
CHECK_STG_ERROR("IStorage::CreateStream",hResult);
   
hResult = pStream->Write(&HeaderLen, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&TransformType, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&TransformClassNameLength, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(TransformClassName, 
                             TransformClassNameLength, 
                             NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&TransformIdentifierLength, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(TransformIdentifier, 
                             TransformIdentifierLength, 
                             NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(PaddingBuffer, TransformIdentifierPad, NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&ReaderVersionMajor,  sizeof(short), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&ReaderVersionMinor,  sizeof(short), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&UpdaterVersionMajor, sizeof(short), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&UpdaterVersionMinor, sizeof(short), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&WriterVersionMajor,  sizeof(short), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&WriterVersionMinor,  sizeof(short), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&InstanceHeaderLength, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&IssuanceLicenseLength, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(IssuanceLicense, IssuanceLicenseLength, NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(PaddingBuffer, IssuanceLicensePad, NULL);
CHECK_STG_ERROR("IStream::Write",hResult);
pStream->Release();

Transform-Specific Data

The encryption transform supports caching end-user licenses in the document file. These licenses are stored in the transform-specific data streams in the definition storage. The end-user license is obtained as part of the licensing process when using the RM client SDK.

The title of each end-user license stream must begin with the prefix "EUL-". The rest of the title can be any valid name, provided that it is unique. A common practice is to use a base-32 encoded GUID to ensure that the stream name is unique.

The user name segment of the end-user license stream consists of the end-user provider name (currently either "Windows" or "Passport"), a colon delimiter (:), and the name of the user to whom the license is granted. This string must be encoded into 8-bit Unicode Transformation Format (UTF-8) and then the UTF-8 string must be encoded into base-64.
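
The two encoding steps can be sketched as follows. The example assumes the provider name and user name are already held as UTF-8 in std::string, and it uses the standard base-64 alphabet with "=" padding; base64Encode and makeUserNameSegment are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Encodes bytes with the standard base-64 alphabet and '=' padding.
std::string base64Encode(const std::string& input) {
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string output;
    for (std::size_t i = 0; i < input.size(); i += 3) {
        std::size_t available = input.size() - i;  // 1, 2, or at least 3
        std::uint32_t chunk = static_cast<std::uint8_t>(input[i]) << 16;
        if (available > 1) chunk |= static_cast<std::uint8_t>(input[i + 1]) << 8;
        if (available > 2) chunk |= static_cast<std::uint8_t>(input[i + 2]);
        output += alphabet[(chunk >> 18) & 0x3F];
        output += alphabet[(chunk >> 12) & 0x3F];
        output += available > 1 ? alphabet[(chunk >> 6) & 0x3F] : '=';
        output += available > 2 ? alphabet[chunk & 0x3F] : '=';
    }
    return output;
}

// Joins provider and user with a colon, then base-64 encodes the UTF-8 bytes.
std::string makeUserNameSegment(const std::string& providerUtf8,
                                const std::string& userUtf8) {
    return base64Encode(providerUtf8 + ":" + userUtf8);
}
```

A caller that starts from a wide-character user name would convert it to UTF-8 first, and then pass the result to makeUserNameSegment.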

The end-user license stream begins with a header that includes the header length, the length of the base-64 encoded user name string, and the user name string encoded in base-64. Following the header is the length of the end-user license, and the license obtained from the RM client SDK. The following image shows the structure of the end-user license stream.

The following sample code shows how to create an end-user license stream for the definition of an encryption transform.

HeaderLen = sizeof(int)*2 + UserNameLen + UserNamePad;
hResult = pDRMTransformStorage->CreateStream( L"EUL-ZZZZZZZZZZZZZZZZZZZZZZZZZZ", 
                               STGM_READWRITE | STGM_SHARE_EXCLUSIVE, 
                               0,                     
                               0,                     
                               &pStream);             
CHECK_STG_ERROR("IStorage::CreateStream",hResult);
hResult = pStream->Write(&HeaderLen, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&UserNameLen, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(UserName,  UserNameLen, NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(PaddingBuffer,  UserNamePad, NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(&LicenseLength, sizeof(int), NULL);
if (SUCCEEDED(hResult)) 
    hResult = pStream->Write(License, LicenseLength, NULL);
CHECK_STG_ERROR("IStream::Write",hResult);
pStream->Release();
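
The sample above relies on variables that are set elsewhere. The following portable sketch shows one way the same layout could be assembled in memory. It assumes the base-64 user name and the license are single-byte strings, that fields are little-endian, and, as in the sample, that no padding follows the license; appendUint32 and buildEndUserLicenseStream are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

static void appendUint32(std::vector<std::uint8_t>& out, std::uint32_t value) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
}

// Builds the bytes of an end-user license stream.
std::vector<std::uint8_t> buildEndUserLicenseStream(const std::string& base64UserName,
                                                    const std::string& license) {
    std::size_t namePad = (4 - base64UserName.size() % 4) % 4;
    std::vector<std::uint8_t> out;
    // Header length: two 4-byte fields plus the padded user name.
    appendUint32(out, static_cast<std::uint32_t>(8 + base64UserName.size() + namePad));
    appendUint32(out, static_cast<std::uint32_t>(base64UserName.size()));
    out.insert(out.end(), base64UserName.begin(), base64UserName.end());
    out.insert(out.end(), namePad, std::uint8_t{0});
    // The license follows the header as a length-prefixed byte array.
    appendUint32(out, static_cast<std::uint32_t>(license.size()));
    out.insert(out.end(), license.begin(), license.end());
    return out;
}
```

The header length computed here corresponds to the HeaderLen expression in the sample: two integers plus the user name and its padding.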