Chapter 4: Storage

 

Brent Rector
Wise Owl Consulting

January 2004

UPDATE: In spite of what may be stated in this content, "WinFS" is not a feature that will come with the Longhorn Operating System. However, "WinFS" will be available on the Windows platform at some future date, which is why this article continues to be provided for your information.

Contents

What Is WinFS?
WinFS Programming Model
Using the WinFS API and SQL
Summary

In some ways, personal computer is an inadequate name. Most people don't use a personal computer to compute. They use a computer to communicate (through e-mail or instant messaging) and to store and organize their personal data (such as e-mail, documents, pictures, and digital music). Unfortunately, while your computer presently stores this data quite well, it does a relatively poor job of allowing you to organize the information so that you can find it later.

Disk capacity has been growing at roughly 70 percent annually over the last decade. It's presently possible to buy drives with more than 250 gigabytes (GB) of storage. It's likely that 500-GB drives will become available in the next few years and that many systems will have more than one disk drive. I just did a quick check on the computer on which I'm writing this chapter, and I have 283,667 files in 114,129 folders in only 200 GB of disk space. When I forget exactly where I put a file, it can take quite a while to find it again. In the worst case, I have to search the entire contents of each disk. In a few years, people will be able to store millions of files, most of which, if nothing improves, they'll never see again.

One reason people have difficulty finding information on their computer is the limited ability for the user to organize data. The present file system support for folders and files worked well originally because it was a familiar paradigm to most people and the number of files was relatively small. However, it doesn't easily allow you to store an image of your coworker Bob playing softball at the 2003 company picnic at a local park and later find the image when searching for documents that:

  • Mention Bob
  • Involve sports
  • Relate to company events
  • Pertain to the park or its surrounding area
  • Were created in 2003

The hierarchical folder structure doesn't work well when you want to categorize data in numerous ways. Therefore, we have a problem today in that we have lots of stuff to store and no good way to categorize it. In addition to categorizing information, which many people associate with attaching a fixed set of keywords to data, people need to relate data. For example, I might want to relate a picture to the company picnic, or I might want to relate a picture to Bob, who is also a member of an organization to which I donate time and effort, as a contact.

Another problem is that we store the same stuff in multiple places in multiple formats. Developers spend much time and effort creating their own, unique storage abstractions for everyday information such as People, Places, Times, and Events. For example, Microsoft Outlook has a definition of a Contact. The Microsoft Windows Address Book also has its own definition of a contact. Each instant messaging application has yet another. Each application stores its definition of a contact in a unique, isolated silo of information.

There are a number of problems with current approaches to data storage, including the following:

  • Developers reinvent the basic data abstractions repeatedly.
  • Multiple applications cannot easily share common data.
  • The same information lives in multiple locations.
  • The user repeatedly enters the same information.
  • Separate copies of data become unsynchronized.
  • There are no notifications of data change.

What Is WinFS?

WinFS is the new storage system in Longhorn. It improves the Microsoft Windows platform in three ways. First, it allows you to categorize your information in multiple ways and relate one item of information to another. Second, it provides a common storage format for information collected on an everyday basis, such as information dealing with people, places, images, and more. Third, it promotes data sharing of common information across multiple applications from multiple vendors.

WinFS Is a Storage Platform

WinFS is an active storage platform for organizing, searching for, and sharing all kinds of information. This platform defines a rich data model that allows you to use and define rich data types that the storage platform can use. WinFS contains numerous schemas that describe real entities such as Images, Documents, People, Places, Events, Tasks, and Messages. These entities can be quite complex. For example, a person can have multiple names, multiple physical and e-mail addresses, a current location, and much more.

Independent software vendors (ISVs) can also define their own new data types and provide their schema to WinFS. By allowing WinFS to manage complex storage problems, an ISV can concentrate on developing its unique application logic and leverage the richer storage facilities of WinFS for its everyday and custom data.

WinFS contains a relational engine that allows you to locate instances of storage types by using powerful, relational queries. WinFS allows you to combine these storage entities in meaningful ways using relationships. One contact can be a member of the Employee group of an Organization while concurrently a member of the Household group for a specific address. ISVs automatically gain the ability to search, replicate, secure, and establish relationships among their unique data types as well as among the predefined Windows data types.

This structure allows the user to pose questions to the system and ask it to locate information rather than asking the system to individually search folders. For example, you can ask WinFS to find all e-mail messages from people on your instant messenger buddy list for which you don't have a phone number. Using relational queries, you can find all members of a Household for a particular employee with a birthday in the current month.

WinFS also supports multiple flexible programming models that allow you to choose the appropriate application programming interface (API) for the task. You can access the store by using traditional relational queries using structured query language (SQL). Alternatively, you can use .NET classes and objects to access the data store. You can also use XML-based APIs on the data store. WinFS also supports data access through the traditional Microsoft Win32 file system API. You can even mix and match—that is, use multiple APIs for a single task. However, for most purposes, developers will use the managed class APIs to change data in the WinFS store. It will often be far more complex to make an update using raw SQL statements as compared to using the object APIs.

In addition, WinFS provides a set of data services for monitoring, managing, and manipulating your data. You can register to receive events when particular data items change. You can schedule WinFS to replicate your data to other systems.

WinFS Is a File System

For traditional file-based data, such as text documents, audio tracks, and video clips, WinFS is the new Windows file system. Typically, you will store the main data of a file, the file stream, as a file on an NTFS volume. However, whenever you call an API that changes or adds items with NTFS file stream parts, WinFS extracts the metadata from the stream and adds the metadata to the WinFS store. This metadata describes information about the stream, such as its path, plus any information that WinFS can extract from the stream. Depending on file contents, this metadata can be the author (of a document), the genre (of an audio file), keywords (from a PDF file), and more. WinFS synchronizes the NTFS-resident file stream and the WinFS-resident metadata. New Longhorn applications can also choose to store their file streams directly in WinFS. File streams can be accessed using the existing Win32 file system API or the new WinFS API.

WinFS Isn't Just a File System

A file system manages files and folders. While WinFS does manage files and folders, it also manages all types of nonfile-based data, such as personal contacts, event calendars, tasks, and e-mail messages. WinFS data can be structured, semistructured, or unstructured. Structured data includes a schema that additionally defines what the data is for and how you should use it. Because WinFS is, in part, a relational system, it enforces data integrity with respect to semantics, transactions, and constraints.

WinFS isn't just a relational system, either. It supports both hierarchical storage and relational storage. It supports returning data as structured types and as objects—types plus behavior. You might consider WinFS a hierarchical, relational, object-oriented data storage system—although it actually contains certain aspects of each of those traditional storage systems. WinFS extends beyond the traditional file system and relational database system. It is the store for all types of data on the newest Windows platform.

WinFS and NTFS

You can store a file either in the traditional NTFS file system or in the new WinFS data store just like you can store things in FAT32 or on CD-ROMs or in NTFS today. Normally, a file stored in NTFS is not visible in WinFS. Longhorn applications using the new WinFS APIs can access data stored either in WinFS or in NTFS. In addition, Longhorn applications can continue to use the Win32 API to access data stored in the NTFS file system.

File Promotion

Files are either in WinFS or not. Any item that has a file stream part can participate in promotion/demotion, which we more generally call metadata handling. When WinFS promotes a file, it extracts the metadata from the known NTFS file content and adds the metadata to the WinFS data store. The actual data stream of the file remains in the NTFS file system. You can then query WinFS regarding the metadata as if the file natively resides within WinFS. WinFS also detects changes in the NTFS file and updates the metadata within the WinFS data store as necessary.

File Import and Export

You can also import a file to WinFS from NTFS and export a file from WinFS to NTFS. Importing and exporting a file moves both the file content and the metadata. After importing or exporting, the new file is completely independent of the original file.

WinFS Programming Model

The WinFS programming model includes data access, data manipulation, WinFS data class extensibility, data synchronization, data change notifications, and event prioritization. Data access and data manipulation allow you to create, retrieve, update, and delete data stored within WinFS and to exercise domain-specific behaviors. Data class extensibility enables you to extend WinFS schemas with custom fields and custom types. Data synchronization allows you to synchronize data between WinFS stores and between a WinFS and a non-WinFS store.

The top of the WinFS data model hierarchy is a WinFS service, which is simply an instance of WinFS. One level down in the hierarchy from the service is a volume. A volume is the largest autonomous container of items. Each WinFS instance contains one or more volumes. Within a volume are items.

WinFS introduces the item as the new unit of consistency and operation, rather than the file. The storage system stores items. You have rich query ability over items. An item is effectively a base type of the storage system. An item therefore has a set of data attributes and provides a basic query capability.

People typically organize data in the real world according to some system that makes sense in a given domain. All such systems partition data into named groups. WinFS models this notion with the concept of a folder. A folder is a special type of item. There are two types of folders: containment folders and virtual folders.

A containment folder is an item that contains holding links to other items and models the common concept of a file system folder. An item exists as long as at least one holding link references it. Note that a containment folder doesn't directly contain the items logically present in the folder but instead contains links to those items. This allows multiple containment folders to contain the same item.

A virtual folder is a dynamic collection of items. It is a named set of items. You can either enumerate the set explicitly or specify a query that returns the members of the set. A virtual folder specified by a query is quite interesting. When you add a new item to the store that meets the criteria of the query for a virtual folder, the new item is automatically a member of the virtual folder. A virtual folder is itself an item. Conceptually, it represents a set of nonholding links to items, as you can see in Figure 4-1.

Figure 4-1. The WinFS data model hierarchy
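The behavior of a query-defined virtual folder can be modeled in a few lines of ordinary C#. The following is only a sketch of the idea, not WinFS code: the Item, Store, and VirtualFolder classes are hypothetical stand-ins invented for illustration, and the "query" is a plain predicate rather than a WinFS query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for illustration; these are not WinFS types.
class Item {
    public string Name;
    public string Kind;
    public Item (string name, string kind) { Name = name; Kind = kind; }
}

class Store {
    public List<Item> Items = new List<Item> ();
}

// A query-defined virtual folder stores no members, only a predicate.
class VirtualFolder {
    readonly Store store;
    readonly Func<Item, bool> query;

    public VirtualFolder (Store store, Func<Item, bool> query) {
        this.store = store;
        this.query = query;
    }

    // Membership is recomputed from the store on every enumeration.
    public IEnumerable<Item> Members {
        get { return store.Items.Where (query); }
    }
}

class Program {
    static void Main () {
        Store store = new Store ();
        store.Items.Add (new Item ("picnic.jpg", "Photo"));
        store.Items.Add (new Item ("budget.doc", "Document"));

        VirtualFolder photos = new VirtualFolder (store, i => i.Kind == "Photo");
        Console.WriteLine (photos.Members.Count ());   // 1

        // A later addition that satisfies the query shows up automatically.
        store.Items.Add (new Item ("softball.jpg", "Photo"));
        Console.WriteLine (photos.Members.Count ());   // 2
    }
}
```

Because the folder holds only the predicate, it never needs to be told about new items. That mirrors the nonholding nature of a virtual folder's links.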

Sometimes, you need to model a highly constrained notion of containment—for example, a Microsoft Word document embedded in an e-mail message is, in a sense, bound more tightly to its container than, for example, a file contained within a folder. WinFS expresses this notion by using embedded items. An embedded item is a special kind of link within an item (named Embedded Link) that references another item. The referenced item can be bound to or otherwise manipulated only within the context of the containing item.

Finally, WinFS provides the notion of categories as a way to classify items. You can associate one or more categories with every item in WinFS. WinFS, in effect, tags the category name onto the item. You can then specify the category name in searches. The WinFS data model allows the definition of a hierarchy of categories, thus enabling a tree-like classification of data.

Organizing Information

All these features together allow five ways to organize your information in WinFS:

  • Hierarchical folder-based organization. With this approach, you still have the traditional hierarchical folder and item organization structure. All items in a WinFS data store must reside in a container, and one of these container types is a folder.
  • Type-based organization. An item is always of a particular type. For example, you have Person items, Photo items, Organization items, and many other available types. You can even create new types and store them in the WinFS data store.
  • Item property–based organization. You can view items that have one or more properties set to specified values. This is, in effect, a virtual folder view with a query that returns the items with the specified value for the specified properties.
  • Relationship-based organization. You can retrieve items based on their relationship to other items—for example, a Person can be a member of an Organization, and either one can be organized or searched for in terms of this relationship.
  • Category-based organization. You can create and associate any number of user-defined keywords with an item. Subsequently you can retrieve the items that have a specific value for an associated keyword. You won't, however, be able to create categorization taxonomies, so this organization technique is not as powerful as the preceding approaches.

WinFS APIs

WinFS provides three data access APIs: the managed WinFS API, the ADO.NET API, and the Win32 API. The WinFS API is a strongly typed "high level" API. ADO.NET provides a lower level API for working with data as XML or as tables or rows. Using ADO.NET, you can access data stored in WinFS by using Transact-SQL (T-SQL) and, when you want, retrieve data in XML using T-SQL's FOR XML capability. The Win32 API allows access to the files and folders stored in WinFS.

You might prefer to use multiple access patterns to solve a problem. For example, you can issue a T-SQL query that returns a set of contacts as managed objects of the WinFS Contact type. Regardless of the API you use, each API ultimately manipulates data in the WinFS store using T-SQL.

In many cases, you will prefer to use the managed WinFS API. These .NET Framework classes automatically perform the object-relational mapping needed to translate between object-oriented programming constructs and the underlying store, and they issue the necessary T-SQL to achieve the WinFS data access.

Using the Managed WinFS Classes

The WinFS managed classes reside in the System.Storage namespace and its nested namespaces. Many applications will also use WinFS type definitions from the System.Storage.Core namespace. You can additionally use types from more specialized namespaces. For example, the managed classes that manipulate the system definition of a Contact reside in the System.Storage.Contact namespace. For simplicity, all the code examples in this chapter will use the following set of using declarations:

using System.Storage;
using System.Storage.Core;
using System.Storage.Contact;

ItemContext

The WinFS store consists of items organized into folders and categorized. The first step in working with WinFS is to identify the set of items with which you want to work. We call this process binding, and the set of items can be any of the following:

  • An entire volume (also known as the root folder)
  • An identifiable subset of items in a given volume—for example, a particular containment folder or virtual folder
  • An individual item
  • A WinFS share (which identifies a volume, a folder, a virtual folder, or an individual item)

To bind to a set of items, you create a System.Storage.ItemContext object and connect it to a WinFS data store. Use the static System.Storage.ItemContext.Open helper method to create an ItemContext object.

The following code creates an ItemContext that connects to the default local WinFS volume. The default is the \\local-computer-name\DefaultStore share:

System.Storage.ItemContext ctx = System.Storage.ItemContext.Open ();
// ...
ctx.Close();

Alternatively, you can pass a string to the Open method to connect the item context to a specific WinFS store. The following code creates an item context connected to the WinFS share \\machine\Legal Documents:

ItemContext ctx = null;
try {
  ctx = ItemContext.Open (@"\\machine\Legal Documents");
  // ...
}
finally {
  if (ctx != null) ctx.Dispose();
}

Be sure to close or dispose of the context object as soon as you finish using it regardless of exceptions. An ItemContext uses significant unmanaged resources—such as a connection to the store—that you should free up in a timely manner. To make closing contexts as convenient as possible, the ItemContext class implements the IDisposable interface. Therefore, you can use the C# using statement as shown in the following example to release these resources:

using (ItemContext ctx = ItemContext.Open (@"D:\MyStore")) {
  // ...
}
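To see why the using statement is the safer habit, here is a small self-contained sketch. It does not touch WinFS at all: FakeContext and ContextDemo are hypothetical names, and the counter merely stands in for the unmanaged resources a real context would hold. The point it illustrates is that Dispose runs even when the body of the using block throws.

```csharp
using System;

// Hypothetical stand-in for a resource-holding context; not the WinFS ItemContext.
class FakeContext : IDisposable {
    public static int OpenCount = 0;

    public FakeContext () { OpenCount++; }

    // Called automatically when control leaves the using block.
    public void Dispose () { OpenCount--; }
}

static class ContextDemo {
    public static void UseContext (bool fail) {
        using (FakeContext ctx = new FakeContext ()) {
            if (fail) throw new InvalidOperationException ("simulated failure");
        }
    }
}

class Program {
    static void Main () {
        ContextDemo.UseContext (false);
        try { ContextDemo.UseContext (true); }
        catch (InvalidOperationException) { /* expected */ }

        // Both calls released their context, even the one that threw.
        Console.WriteLine (FakeContext.OpenCount);   // 0
    }
}
```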

Storing a New Item in a WinFS Data Store

Every item in a WinFS data store must be a member of a folder of the store. You obtain the root of the folder hierarchy by calling the extremely well-named static method System.Storage.Folder.GetRootFolder. However, there are also several system-defined containers for storing application-specific data. You often use one of the static methods on the UserDataFolder class to retrieve a folder in which you then place new items.

Getting a Folder

In the following example, I'll find the current user's Personal Contacts folder if it exists and create it when it doesn't exist. Note that this is a somewhat contrived example—the system automatically creates a user's Personal Contacts folder if it doesn't exist when the user first logs into a system—but it gives me a chance to show how to create an expected folder when it doesn't exist.

ItemContext ctx = ItemContext.Open ();
WellKnownFolder contactsFolder =
          UserDataFolder.FindUsersWellKnownFolderWithType (ctx,
                         GeneralCategories.PersonalContactsFolder);

if (contactsFolder == null) {
    //create the Personal Contacts folder
    Folder userDataFolder = UserDataFolder.FindMyUserDataFolder (ctx);
    WellKnownFolder subFolder = new WellKnownFolder (ctx);
    CategoryRef category = new CategoryRef (ctx,
                            GeneralCategories.PersonalContactsFolder);

    // Associate the PersonalContactsFolder category to the folder
    subFolder.FolderType = category;
    userDataFolder.AddMember (subFolder);
    ctx.Update();
}

The preceding code does a number of interesting things. First, I try to locate an existing folder contained in the user's personal data folder hierarchy. I'm not looking for the folder by a well-known name. Instead, I'm locating the folder within the user's personal data tree that has previously been associated with the well-known category PersonalContactsFolder. The shell displays this folder when you select My Contacts.

This folder normally already exists, but when it doesn't, I retrieve the root folder for the user's data hierarchy. I create a new item, of type WellKnownFolder, and then create a reference to a well-known category—the PersonalContactsFolder category. I then set the type of the new folder to the PersonalContactsFolder category type, and finally, I add the new folder to its containing folder—the user's personal data root folder. WinFS doesn't save any changes to the data store until you call Update on the item context (which I regularly forget to do).

Of course, this is the verbose way to find the Personal Contacts folder. I wanted to show you how things work. Normally, I'd use the following code instead. The FindMyPersonalContactsFolder method finds the existing folder.

WellKnownFolder contactsFolder =
         UserDataFolder.FindMyPersonalContactsFolder (ctx);

Creating a New Item

As I now have the Personal Contacts folder, it seems appropriate to create a new contact in the folder. In the following example, I'll create a number of Person contacts and add them to the folder:

Person[] CreateFriends (ItemContext ctx) {
  string[] GivenNames = { "Monica", "Rachel", "Chandler",
                          "Joey",   "Phoebe", "Ross"};
  string[] SurNames = { "Uchra",    "Emerald",  "Ranier",
                         "Fibonacci", "Smorgasbord", "Uchra"};
  Person[] Friends = new Person [GivenNames.Length];
  
  for (int index = 0; index < GivenNames.Length; index++) {
    string linkName = GivenNames[index] + " " + SurNames[index];
    Person p = Person.CreatePersonalContact (ctx, linkName);
    Friends[index] = p;

    p.DisplayName = linkName;
    FullName fn = p.GetPrimaryName ();
    fn.GivenName = GivenNames[index];
    fn.Surname = SurNames[index];
  }
  ctx.Update ();
  return Friends;
}

The prior code uses the static Person.CreatePersonalContact method. This method

  • Creates a new Person item in the specified item context
  • Creates a new FolderMember relationship with the specified name that references the Person
  • Adds the FolderMember relationship to the PersonalContactsFolder's Relationship collection

I subsequently update the DisplayName, GivenName, and Surname properties of the Person item. As always, I call Update on the item context to save the changes to the data store.

Let's look more closely at the CreatePersonalContact method. It is equivalent to the following:

// Find the PersonalContacts folder
WellKnownFolder contactsFolder =
           UserDataFolder.FindUsersWellKnownFolderWithType (ctx,
                             GeneralCategories.PersonalContactsFolder);
// Create a new Person item
Person p = new Person (ctx);

// Need a folder relationship that references the new Person
FolderMember fm = new FolderMember (p, linkName);
contactsFolder.Relationships.Add (fm);
ctx.Update ();

Relationship Items

WinFS defines a relationship data model that allows you to relate items to one another. When you define the schema for a data type, you can define zero or more relationships as part of the schema. For example, the Folder schema defines the FolderMember relationship. The Organization schema defines the Employee relationship. For each such defined relationship, there is a class that represents the relationship itself. This class is derived from the Relationship class and contains members specific to the relationship type. There is also a strongly typed "virtual" collection class. This class is derived from VirtualRelationshipCollection and allows relationship instances to be created and deleted.

A relationship relates a source item to a target item. In the previous example, the Personal Contacts folder was the source item and the Person item was the target item. The FolderMember relationship basically indicates that the Person item relates to the Personal Contacts folder as a member of the folder.

When you define a relationship, you define whether the relationship keeps the target item in existence (a holding relationship) or doesn't keep the target item in existence (a reference relationship). When you create a holding relationship to a target item, WinFS increments a reference count on the target item. When WinFS deletes a holding relationship, it decrements the reference count on the target item. An item no longer exists in the store when its reference count reaches zero. WinFS never alters the reference count of the target when you create or destroy a reference relationship to the target. Therefore, the target item can disappear from the store when its reference count reaches zero, leaving a reference relationship that refers to an item that no longer exists.

WinFS defines the FolderMember relationship as a holding relationship. Most other relationship classes are reference relationships.
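The difference between holding and reference relationships is essentially reference counting, which the following sketch models in plain C#. Nothing here is WinFS API: ModelItem and ModelStore are hypothetical classes, and an ordinary variable plays the part of a reference relationship. It shows an item surviving until its last holding relationship is removed, after which the surviving reference dangles.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model for illustration only; none of these are WinFS types.
class ModelItem {
    public string Name;
    public int HoldCount;
    public ModelItem (string name) { Name = name; }
}

class ModelStore {
    readonly HashSet<ModelItem> items = new HashSet<ModelItem> ();

    public bool Contains (ModelItem item) { return items.Contains (item); }

    // A holding relationship keeps its target in the store.
    public void AddHolding (ModelItem target) {
        items.Add (target);
        target.HoldCount++;
    }

    // Removing the last holding relationship removes the target.
    public void RemoveHolding (ModelItem target) {
        if (target.HoldCount > 0 && --target.HoldCount == 0)
            items.Remove (target);
    }
}

class Program {
    static void Main () {
        ModelStore store = new ModelStore ();
        ModelItem bob = new ModelItem ("Bob");

        store.AddHolding (bob);          // for example, membership in one folder
        store.AddHolding (bob);          // membership in a second folder
        ModelItem reference = bob;       // a reference relationship: no count change

        store.RemoveHolding (bob);
        Console.WriteLine (store.Contains (bob));        // True
        store.RemoveHolding (bob);
        Console.WriteLine (store.Contains (reference));  // False: the reference now dangles
    }
}
```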

Folder Items

Now that you know about Link items, I can refine my description of Folder items. A Folder is a WinFS item that has a collection of Link items. The target of each Link item in the collection is a member of the folder. The Folder.Members property represents this collection of links.

Note that this gives a WinFS folder much greater flexibility than traditional file system folders. The members of a folder can be file and nonfile items. Multiple links to a particular item can reside in many folders concurrently. In other words, multiple folders can contain the same item.

Other Item Types

Generally, you create other item types in the WinFS store as you did in the previous examples. Each type occasionally has its own special usage pattern. For example, we can have organizations as members of our Personal Contacts folder, so let's create one:

Organization cp = FindOrCreateOrganization (ctx, "Main Benefit");
// ...
Organization FindOrCreateOrganization (ItemContext ctx, string orgName) {
  Organization o =
    Organization.FindOne (ctx, "DisplayName='" + orgName + "'");
  if (o == null) {
    o = new Organization (ctx);
    o.DisplayName = orgName;

    Folder pcf = UserDataFolder.FindMyPersonalContactsFolder (ctx);

    pcf.AddMember (o, o.DisplayName.ToString ());
    ctx.Update ();
  }
  return o;
}

Now let's add an employee to that organization:

enum Names { Monica, Rachel, Chandler, Joey, Phoebe, Ross }
// ...
Person[] Friends = CreateFriends (ctx);
Organization cp = FindOrCreateOrganization (ctx, "Main Benefit");
AddEmployeeToOrganization (ctx, Friends [(int)Names.Rachel], cp);
// ...
void AddEmployeeToOrganization (ItemContext ctx, Person p, Organization o) {
  EmployeeData ed = new EmployeeData (ctx);

  ed.Name = p.DisplayName;
  ed.Target_Key = p.ItemID_Key;
  o.Employees.Add (ed);
  ctx.Update ();
}

Similarly, we can create households in our Personal Contacts folders. Note that a household doesn't imply a family. A household might be a group of roommates. WinFS has additional schema for families, but I'll leave that as an exercise for the reader.

CreateHousehold (ctx, Friends [(int) Names.Chandler],
                      Friends [(int) Names.Joey]);
CreateHousehold (ctx, Friends [(int) Names.Monica],
                      Friends [(int) Names.Rachel]);
// ...
void CreateHousehold (ItemContext ctx, Person p1, Person p2) {
  Household h = new Household (ctx);
  h.DisplayName = p1.GetPrimaryName().GivenName + " and " +
                  p2.GetPrimaryName().GivenName + " household";

  Folder pcf = UserDataFolder.FindMyPersonalContactsFolder (ctx);
  pcf.AddMember (h, h.DisplayName.ToString ());

  // Add first person to the household
  HouseholdMemberData hhmd = new HouseholdMemberData (ctx);
  hhmd.Name = p1.DisplayName;
  hhmd.Target_Key = p1.ItemID_Key;
  h.HouseholdMembers.Add (hhmd);

  // Add second person to the household
  hhmd = new HouseholdMemberData (ctx);
  hhmd.Name = p2.DisplayName;
  hhmd.Target_Key = p2.ItemID_Key;
  h.HouseholdMembers.Add (hhmd);

  // Save the household and its members to the store
  ctx.Update ();
}

The prior example uses one concept I've not yet discussed. Note the use of the ItemID_Key property in this line of code:

  hhmd.Target_Key = p1.ItemID_Key;

Basically, the ItemID_Key value is another way to reference an item in the WinFS store, so let's look at the ways to find items in the store.

How to Find Items

Of course, it doesn't do much good to place items in a data store if you cannot subsequently find them easily. The ItemContext class contains instance methods you can use to retrieve items in a WinFS data store. You specify what type of item to find and any special constraints that the returned items must meet. In addition, each item class—for example, Person, File, Folder, and so forth—also contains static methods that allow you to find items of that particular type.

The FindAll method returns zero or more items that match the specified criteria. The ItemContext.FindAll instance method requires you to specify the type of the items to locate. In addition, you can optionally specify search criteria to narrow the scope of search. For example, the following code finds all the Person items that have a DisplayName property whose value begins with "Brent".

FindResult res = ctx.FindAll (typeof(Person), "DisplayName like 'Brent%'");
foreach (Person p in res) {
    // Use the Person item somehow
}

Alternatively, I could use the static FindAll method of the Person class like this:

FindResult res = Person.FindAll (ctx, "DisplayName like 'Brent%'");
foreach (Person p in res) {
    // Use the Person item somehow
}

In both of these examples, the FindAll method always returns a collection of the items matching the type and specified criteria. This collection might contain no items, but you don't receive a null reference for the FindResult. Therefore, always iterate over the collection to obtain the items found.

When you know that only a single item will match the type requested and specified filter criteria, you can use the FindOne method. Be careful, however—the FindOne method throws an exception when it finds more than one item that matches your request.

Person p = Person.FindOne (ctx, "DisplayName='Brent Rector'");

The second string parameter is a filter expression that allows you to specify additional constraints the returned items must satisfy. The basic format of the filter expression is a string in the form "<propertyName> <operator> <propertyValue>".

WinFS calls the expression an OPath expression. The syntax is similar, although not identical, to the XPath expression syntax used for identifying items in an XML document. This code fragment returns all File items for files with either a "doc" or a "txt" file extension:

FindResult Files = File.FindAll (ctx, "Extension='doc' || Extension='txt'");

These expressions can be quite complex. For example, the following statement returns all Person items that represent employees of an employer with the DisplayName of "Main Benefit":

string pattern = "Source(EmployeeOf).DisplayName='Main Benefit'";
FindResult result = Person.FindAll (ctx, pattern);

Here's another one. I want the Person items where the Surname is not "Ranier" and the e-mail addresses don't end with ".edu".

string filter = "PersonalNames[Surname!='Ranier'] && " +
                "!(PersonalEmailAddresses[Address like '%.edu'])";
FindResult result = Person.FindAll (ctx, filter);

Identifying a Specific Item

You frequently need to create references to items in the WinFS store. Eventually, you use these references to locate the appropriate item. Earlier in this chapter, I showed you how to use a link to reference an item. Links use a friendly string-based identity for the reference, and this string name must be unique within the link's containing folder. In other words, you need both the folder and one of its contained links to identify the referenced item.

However, you can create multiple links with the same friendly string name as long as you add the links to different folders so that all names within a single folder remain unique. Note that these multiple links with the same friendly text name don't actually have to reference the same target item. They could, but they don't have to.

In such cases, searching for all links with a specific friendly text name (using FindAll, for example) will return multiple results. You will then need to examine the source of each link to determine the containing folder, and then determine which link references the desired item.

We need a way to reference any arbitrary item in the store—for example, suppose I want the 3,287th item in the store. Fortunately, you can do exactly this.

Finding an Item by ItemID_Key Value

WinFS assigns each newly created item a GUID-based identification number, known as its ItemID_Key property. In practice, an ItemID_Key value is highly likely to be unique across all WinFS volumes; however, WinFS still treats this identifier as if it's unique only within a volume. You can use this volume-unique value to identify any item in a WinFS volume.

Item GetItem (ItemContext ctx, SqlBinary itemID_Key) {
   // Convert itemID_Key to a string for use in the OPath filter 
   string hexItemID_Key = BitConverter.ToString (itemID_Key.Value);
   hexItemID_Key = "'0x" + hexItemID_Key.Replace ("-", String.Empty) + "'";

   // Build an opath filter expression.
   string query = "ItemID_Key=" + hexItemID_Key;

   return Item.FindOne (ctx, query);
}
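
The string handling in GetItem is easy to get wrong, so here is a minimal, self-contained sketch of just that step. It assumes the key is available as a plain byte array (standing in for SqlBinary.Value), and the ItemKeyFilter class name is hypothetical rather than part of the WinFS API.

```csharp
using System;

// Hypothetical helper, not part of the WinFS API.
static class ItemKeyFilter
{
    // Formats raw key bytes as the quoted hexadecimal literal that the
    // GetItem method above places after "ItemID_Key=".
    public static string Build(byte[] keyBytes)
    {
        if (keyBytes == null) throw new ArgumentNullException("keyBytes");

        // BitConverter.ToString produces upper-case hex pairs separated
        // by dashes, such as "01-AB-FF"; the dashes are removed here.
        string hex = BitConverter.ToString(keyBytes).Replace("-", String.Empty);

        return "ItemID_Key='0x" + hex + "'";
    }
}
```

For example, passing the three bytes 0x01, 0xAB, and 0xFF produces the filter ItemID_Key='0x01ABFF'. A real key is a GUID, so the hexadecimal portion would be 32 digits long.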

Common Features

The WinFS API provides several features across the entire spectrum of data classes. These features are:

  • Asynchrony
  • Transactions
  • Notifications
  • Blob/stream support
  • Cursoring and paging

Asynchrony

The WinFS API allows you to run queries asynchronously. The WinFS API uses the .NET standard asynchronous programming model patterns.

Transactions

The WinFS store is a transactional store. WinFS, therefore, allows you to make transactional updates to the store using the BeginTransaction, CommitTransaction, and AbortTransaction methods on the ItemContext object, as shown in the following example:

using (ItemContext ctx = ItemContext.Open()) {
  using (Transaction t = ctx.BeginTransaction()) {
    Person p = Person.FindOne (ctx,
        "PersonalNames[GivenName='Chandler' And Surname='Bing']" );
    Household h = Household.FindOne (ctx,
        "DisplayName = 'Chandler and Joey Household'");
    p.PersonalEAddresses.Add (new TelephoneNumber ("202", "555-1234"));
    p.Save ();
    h.Members.Add (p);
    h.Save ();
    t.Commit ();
  }
}

Notifications

The WinFS Notification Service uses the concepts of short-term and long-term subscriptions. A short-term subscription lasts until an application cancels the subscription or the application exits. A long-term subscription survives application restarts. WinFS API watchers are a set of classes that allow applications to be selectively notified of changes in the WinFS store and provide state information that can be persisted by the application to support suspend/resume scenarios.

The Watcher class can notify your application of changes to different aspects of WinFS objects, including the following:

  • Item changes
  • Embedded item changes
  • Item extension changes
  • Relationship changes

When a watcher raises an event, it sends watcher state data with the event notification. Your application can store this state data for later retrieval. Subsequently, you can use this watcher state data to indicate to WinFS that you want to receive events for all changes that occurred after the state was generated.

The watcher programming model also allows any combination of added, modified, and removed events to be disabled. It can also be configured to raise an initial event that simulates the addition of all existing items, item extensions, relationships, and so on.

The WinFS watcher design is broken down into the classes described in the following table.

Class                        Purpose/Description
WatcherOptions               Class for specifying initial scope and granularity options to StoreWatcher
StoreWatcher                 The quintessential class for watching WinFS items, embedded items, item extensions, and relationships
WatcherState                 Opaque object that can be used to initialize a StoreWatcher
ChangedEventHandler          Class that defines the event handler to be called by StoreWatcher
ChangedEventArgs             Class passed as argument to ChangedEventHandler
ItemChangeDetail             Base class that provides granular change details for item events
ItemExtensionChangeDetail    Class derived from ItemChangeDetail that provides additional change details specific to item extension events
RelationshipChangeDetail     Class derived from ItemChangeDetail that provides additional change details specific to relationship events

You use the StoreWatcher class to create a watcher for some item in the WinFS store. The StoreWatcher instance will raise events when the specified item changes. You can specify the type of item and hierarchy to watch. By default, a watcher

  • Does not raise an initial event to establish the current state
  • Watches the item and the hierarchy (including immediate children) for any changes
  • Raises add, remove, and modify events on this item or any child in the entire hierarchy
  • Raises add, remove, and modify events for item extensions on this item or any child in the entire hierarchy
  • Raises add, remove, and modify events for relationships in which this item or any child in the entire hierarchy is the source of the relationship

Because by default a watcher watches for changes in the specified item and its descendants, you might want to specify WatchItemOnly as the watcher option. The following example watches for changes only to the located Person item:

Person p = Person.FindOne (ctx,
            "PersonalNames[GivenName='Rachel' and Surname='Emerald']");
StoreWatcher w = new StoreWatcher ( p, WatcherOptions.WatchItemOnly );

A Folder is just another WinFS item. You watch for changes in a Folder the same way you do for a Person:

Folder f = · · ·
StoreWatcher w = new StoreWatcher (f, <WatcherOptions>);

You can watch for changes in a specified relationship of an item, too:

Person p = · · ·
StoreWatcher w = new StoreWatcher (p, typeof(HouseholdMember),
                                   <WatcherOptions> );
w.ItemChanged += new ChangedEventHandler (ItemChangedHandler);
w.Enabled = true;

// Change notifications now arrive until we unsubscribe from the event
· · ·
// Now we unsubscribe from the event
w.ItemChanged -= new ChangedEventHandler (ItemChangedHandler);
w.Dispose ();
· · ·

// The change notification handler
void ItemChangedHandler (object source, ChangedEventArgs args) {
  foreach (ItemChangeDetail detail in args.Details) {
    // C# cannot switch on an object's type, so test with "is".
    // Check the derived classes first; both derive from ItemChangeDetail.
    if (detail is ItemExtensionChangeDetail) {
      // handle added + modified + removed events for Item Extension
    }
    else if (detail is RelationshipChangeDetail) {
      // handle added + modified + removed events for Relationship
    }
    else {
      // handle added + modified + removed events for Item or Embedded Item
      HandleItemChangeDetail (detail);
    }
  }
}

void HandleItemChangeDetail (ItemChangeDetail detail) {
  switch (detail.ChangeType) {
    case Added:          // handle added event
      break;

    case Modified:       // handle modified event
      break;

    case Removed:        // handle removed event
      break;
  }
}

Blob and Stream Support

Blob and stream support APIs are still in flux at the time of this writing. Check the documentation for the latest information about how to access blobs and streams in the WinFS store.

Cursoring and Paging

The various Find methods in the WinFS classes can return a (potentially) large collection of objects. This collection is the equivalent of a rowset in the database world. Traditional database applications use a paged cursor to navigate efficiently within a large rowset. This cursor references a single row (a thin cursor) or a set of rows (a page cursor). The idea is that applications retrieve one page's worth of rows at a time; they can also pinpoint one row within the page for positioned update and delete. The WinFS API provides similar abstractions to the developer for dealing with large collections.

By default, a find operation provides a read-only, scrollable, dynamic cursor over the returned collection. An application can have a fire hose cursor for maximum performance. A fire hose cursor is a forward-only cursor. The application can retrieve a page of rows at a time, but the next retrieval operation will begin with the subsequent set of rows—it cannot go back and re-retrieve rows. In a sense, rows flow from the store to the application like water from a fire hose—hence the name.

The CursorType property in the FindParameters class will allow an application to choose between a fire hose and scrollable cursor. For both fire hose and scrollable cursors, the application can set a page size using the PageSize property of the FindParameters class. By default, the page size is set to 1.
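
The following sketch is not WinFS code. It only illustrates the forward-only, page-at-a-time behavior of a fire hose cursor described above, using an ordinary in-memory sequence. The FireHoseSketch class and its Pages method are hypothetical names chosen for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of forward-only paging; not part of the WinFS API.
static class FireHoseSketch
{
    // Returns successive pages of at most pageSize items. The source is
    // enumerated once, front to back, so earlier pages cannot be revisited.
    public static IEnumerable<List<T>> Pages<T>(IEnumerable<T> source, int pageSize)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (pageSize < 1) throw new ArgumentOutOfRangeException("pageSize");
        return PagesIterator(source, pageSize);
    }

    private static IEnumerable<List<T>> PagesIterator<T>(IEnumerable<T> source, int pageSize)
    {
        List<T> page = new List<T>(pageSize);
        foreach (T item in source)
        {
            page.Add(item);
            if (page.Count == pageSize)
            {
                yield return page;              // hand a full page to the caller
                page = new List<T>(pageSize);   // start the next page
            }
        }
        if (page.Count > 0)
        {
            yield return page;                  // final, partially filled page
        }
    }
}
```

Calling FireHoseSketch.Pages on five items with a page size of 2 yields pages of two, two, and one item, in that order. A scrollable cursor, by contrast, would also let the caller move backward to an earlier page.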

Data Binding

You can use the WinFS data classes as data sources in a data-binding environment. The WinFS classes implement IDataEntity (for single objects) and IDataCollection (for collections) interfaces. The IDataEntity interface provides notifications to the data-binding target of changes to properties in the data source object. The IDataCollection interface allows the determination of the base type of an object in a polymorphic collection. It also allows you to retrieve a System.Windows.Data.CollectionManager, which navigates through the data entities of the collection and provides a view (for example, sort order or filter) of the collection. I discuss data binding in detail in Chapter 5.

Security

The WinFS security model fundamentally grants a set of Rights to a Principal on an Item in the following ways:

  • Security is set at the level of Items.
  • A set of rights can be granted to a security principal on an Item. This set includes: READ, WRITE, DELETE, EXECUTE (for all items), CREATE_CHILD, ADMINISTER, and AUDIT. (Additional rights are grantable on Folder items.)
  • Users and applications are the security principals. Application rights supersede user rights. When an application doesn't have permission to delete a contact, a user cannot delete it via the application regardless of the user's permissions.
  • Security is set using rules; each rule is a Grant and applies to a triplet: (<ItemSet, PrincipalSet, RightSet>).
  • The rules are themselves stored as Items.

Getting Rights on an Item

Each WinFS item class has a method named GetRightsForCurrentUser, which returns the set of rights—READ, WRITE, DELETE, and so forth—that the current user has on the specified item. In addition, the method returns the set of methods that WinFS allows the user to execute.

Setting Rights on an Item

WinFS uses a special Item type, SecurityRule, to store permissions information on Items. Thus, setting and changing rights is no different from manipulating any other Item in WinFS. Here's a code example showing how to set rights on a folder item:

using (ItemContext ctx = ItemContext.Open(@"\\localhost\WinFS_C$")) {
  SecurityRule sr = new SecurityRule (ctx);
  sr.Grant = true;
  // set permission on items under folder1 including folder1
  sr.AppliesTo = <folder1's Identity Key>; 
  sr.Condition = acl1;   // a DACL
  sr.Save();
}

Extending the WinFS API

Every built-in WinFS class contains standard methods such as Find* and has properties for getting and setting field values. These classes and associated methods form the foundation of WinFS APIs and allow you to learn how to use one class and know, in general, how to use many other WinFS classes. However, while standard behavior is useful, each specific data type needs additional, type-specific behaviors.

Domain Behaviors

In addition to these standard methods, every WinFS type will typically have a set of domain-specific methods unique to that type. (Actually, WinFS documentation often refers to type definitions as schema, reflecting the database heritage of WinFS.) WinFS refers to these type-specific methods as domain behaviors. For example, here are some domain behaviors in the contacts schema:

  • Determining whether an e-mail address is valid
  • Given a folder, getting the collection of all members of the folder
  • Given an item ID, getting an object representing this item
  • Given a person, getting his or her online status
  • Creating a new contact or a temporary contact with helper functions

Value-Added Behaviors

Data classes with domain behaviors form a foundation that application developers build on. However, it is neither possible nor desirable for data classes to expose every conceivable behavior related to that data.

You can provide new classes that extend the base functionality offered by the WinFS data classes. You do this by writing a class whose methods take one or more of the WinFS data classes as parameters. In the following example, OutlookEMailServices and WindowsMessengerServices are hypothetical classes that use the standard WinFS MailMessage and Person classes:

MailMessage m = MailMessage.FindOne (…);
OutlookEMailServices.SendMessage(m);

Person p = Person.FindOne (…);
WindowsMessengerServices wms = new WindowsMessengerServices(p);
wms.MessageReceived += new MessageReceivedHandler (OnMessageReceived);
wms.SendMessage("Hello");

You can then register these custom classes with WinFS. The registration data will be associated with the schema metadata WinFS maintains for every installed WinFS type. WinFS stores your schema metadata as WinFS items; therefore, you can update, query, and retrieve it as you would all other WinFS items.

Specifying Constraints

The WinFS data model allows value constraints on types. WinFS evaluates and enforces these constraints when you add items to the store. However, you sometimes want to verify that input data satisfies its constraints without incurring the overhead of a round trip to the server. WinFS allows the schema/type author to decide whether the type supports client-side constraint checking. When a type supports client-side validation, the type will have a Validate method you can call to verify that an object satisfies the specified constraints. Note that regardless of whether the developer calls the Validate method, WinFS still checks the constraints at the store.

Using the WinFS API and SQL

The WinFS API enables a developer to access the WinFS store by using familiar common language runtime (CLR) concepts. Throughout this chapter, I used the following coding pattern for WinFS access:

  1. Bind to an ItemContext.
  2. Find the desired items.
  3. Update the items.
  4. Save all changes back to the store.

Step 2 is essentially a query to the store. The WinFS API uses a filter expression syntax based on OPath for specifying these queries. Filter expressions should be sufficient for most tasks. However, there will be cases where the developer will want to use the full power and flexibility of SQL.

The following capabilities are present in SQL, but they are not available when using a filter expression:

  • Aggregation (Group By, Having, Rollup)
  • Projection (including calculated select expressions, distinct, IdentityCol, RowGuidCol)
  • For XML
  • Union
  • Option
  • Right/full/cross join
  • Nested selects
  • Join to non-WinFS table

It is thus essential that a WinFS developer be able to seamlessly transition between the SQLClient API and the WinFS API, using one or the other in various places in the code.

Aggregate and Group with SQL, and Then Use WinFS API

A small-business owner, Joe, wants to determine who his top 10 customers are and send gift baskets to them. Assume that Customer is a schematized item type. This means that an ISV has provided a schema for the Customer type to WinFS, and therefore, it also means that a WinFS store can now contain Customer items. A Customer item has a holding link to a schematized Order Item type. Order Item has an embedded collection of Line Orders, as follows:

1. using (ItemContext ctx = ItemContext.Open()) {
2. 
3.  SqlCommand cmd = ctx.CreateSqlCommand();
4.  cmd.CommandText = 
5.   "select object(c) from Customers c inner join (" +
6.     "select top 10 C.ItemId, sum(p.price) " +
7.     "from Customers C " +
8.     "inner join Links L on L.SourceId = C.ItemId " +
9.     "inner join Orders O on L.TargetId = O.ItemId " + 
10.    "cross join unnest(O.LineOrders) p " + 
11.    "group by C.ItemId " +
12.    "order by sum(p.price) desc) t ON c.ItemId = t.ItemId"; 
13.
14.  SqlDataReader rdr = cmd.ExecuteReader();
15. 
16.  GiftBasketOrder gbOrder = new GiftBasketOrder(…);
17. 
18.  while (rdr.Read()) {
19.   Customer c = new Customer((CustomerData) rdr.GetValue(0));
20.   // add the customer to gbOrder's recipient collection
21.   gbOrder.Recipients.Add(c);
22.  }
23.
24.  // send the order. The ISV's GiftBasketOrder can easily pull out 
25.  // customer info such as shipping address from the Customer object
26.  gbOrder.Send();
27. }                                             

In line 1 of this example, I open a context for the root of the system volume. In line 3, I create a SQL command object that I subsequently use to execute a SQL query against the WinFS store. This command object reuses the connection used by the item context. Lines 4 through 12 construct the query, and line 14 executes the query. The query returns the top 10 customers in the following manner: the SELECT statement in lines 6 through 12 generates a grouped table containing the total value of each customer's orders; the ORDER BY clause on line 12, combined with the TOP 10 modifier in line 6, selects only the top 10 customers in this grouped table.

The GiftBasketOrder class is a custom class that makes use of the WinFS API Customer object. I create an instance of GiftBasketOrder on line 16.

Line 19 uses the SqlDataReader to read the first column of the returned rowset and casts it to a CustomerData object.

When you define a new type in WinFS (known as creating a new schema), you are actually defining two types: your managed class and the WinFS store's persistent format of the class. WinFS always adds the Data suffix to the name of your class to create the name of the store's type. Therefore, for example, when you define a new Customer type that resides in the WinFS store, WinFS creates the parallel CustomerData WinFS User Defined Type (UDT).

The first column of the rowset contains the store's CustomerData object. I pass this object to the constructor of the Customer class, and the constructor initializes the new object from the CustomerData object. This example is typical of using store UDTs to construct WinFS API objects.

Line 21 adds the customer to the Recipients collection of the GiftBasketOrder.

Finally, I use the Send method on gbOrder to "send" this order.
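
To make the shape of that query concrete, here is a minimal sketch of the same group, total, sort, and take-ten logic written against in-memory data with LINQ, which postdates this chapter. The LineOrderSketch and TopCustomersSketch names are hypothetical stand-ins and are not part of WinFS or of the ISV's schema. The sketch sorts the totals in descending order, which is what selecting the top customers requires.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for one line of a customer's order.
class LineOrderSketch
{
    public string CustomerId;
    public decimal Price;
}

// Hypothetical helper; it does not touch a WinFS store.
static class TopCustomersSketch
{
    // Groups line orders by customer, totals the prices, and returns the
    // ids of the n customers with the largest totals, largest first.
    public static List<string> TopByTotal(IEnumerable<LineOrderSketch> lines, int n)
    {
        return lines
            .GroupBy(l => l.CustomerId)                                   // group by C.ItemId
            .Select(g => new { Id = g.Key, Total = g.Sum(l => l.Price) }) // sum(p.price)
            .OrderByDescending(t => t.Total)                              // largest totals first
            .Take(n)                                                      // top n
            .Select(t => t.Id)
            .ToList();
    }
}
```

In the chapter's example, the store performs this grouping itself and returns only the ten matching Customer rows, so the application never has to load every order into memory as this sketch does.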

Assume that I want to find the average salary (over a 10-year period) for the CEO of each company in my portfolio. I make the following assumptions:

  • I have a folder named Companies In My Portfolio, which contains items of type Organization.
  • EmployeeData is a link-based relationship, and it has a YearlyEmploymentHistory that has the year and the salary for that year.

1. using (ItemContext ctx = ItemContext.Open(@"Companies In My Portfolio")) {
2.
3.  SqlCommand cmd = ctx.CreateCommand();
4.  cmd.CommandText = 
5.   "select avg( Salary ) from Links l cross apply " +
6.   "( select Salary from unnest( convert(" + 
7.   "EmployeeData,l.LinkCol)::YearlyEmploymentHistory ) " +
8.   "where Year >= '1993' ) where l.LinkID = @LinkID";
9.
10. SqlParameter param = new SqlParameter ("@LinkID", SqlDbType.BigInt);
11. cmd.Parameters.Add (param);
12. 
13. Folder f = Folder.FindByPath (ctx, ".");
14. 
15. FindResult orgs = f.GetMembersOfType (typeof(Organization));
16. foreach (Organization o in orgs) {
17.   EmployeeData ed = EmployeeData.FindEmployeeInRole (o,
18.               Organization.Categories.CeoRole);
19.   param.Value = ed.Link.LinkID;
20.   SqlDataReader rdr = cmd.ExecuteReader ();
21.   rdr.Read ();
22.   Console.WriteLine ("{0} ${1}",   
23.    ((Person)ed.Target).PersonalNames[0].FullName, rdr.GetFloat(0) );
24.   rdr.Close ();
25. }
26. }

Line 1 opens a context for the Companies In My Portfolio WinFS share. Lines 3 through 11 create a parameterized SQL query that I can use in the context of the folder. This query returns the average salary for a given employee (represented by the @LinkID parameter). Lines 10 and 11 specify that @LinkID is a parameter of type BigInt. I execute this query later in the example, on line 20.

Line 13 gets a Folder object that represents the folder indicated by the share that I specified when creating the context. Lines 15 and 16 set up the loop for going through the collection of Organization objects in this folder.

For each organization, line 17 gets the EmployeeData object for the CEO.

Line 19 prepares for the query and sets the value of the parameter to the appropriate LinkID, and then line 20 executes the parameterized SELECT.

Line 21 reads the next and only row from the query result, and lines 22 and 23 print the name of the CEO and the 10-year average salary.

Summary

The WinFS data store provides a far richer data storage model than traditional file systems. Because it supports data, behavior, and relations, it's difficult to categorize WinFS as a file system, a relational database, or an object database. It's a bit of all those technologies in one product. WinFS provides a common definition of ubiquitous information that is globally visible and available to all applications running on Longhorn. Applications can leverage the query, retrieval, transactional update, and filtering capabilities of WinFS; therefore, the developer spends less time developing data access and storage code and more time working on unique application functionality.
