An Introduction to "WinFS" OPath

 

Thomas Rizzo and Sean Grimaldi
Microsoft Corporation

October 18, 2004

Applies to:
   Longhorn Community Technical Preview, WinHEC 2004 Build (Build 4074)

Summary: Learn how you can retrieve objects that meet certain criteria from the "WinFS" store by using WinFS OPath filter expressions. (9 printed pages)

UPDATE: In spite of what may be stated in this content, "WinFS" is not a feature that will come with the Longhorn operating system. However, "WinFS" will be available on the Windows platform at some future date, which is why this article continues to be provided for your information.

Disclaimer   Since WinFS is still in development, the software and therefore this information may change. Please be sure to check the latest WinFS documentation for the most up-to-date information.

Download the SimpleRavCodeSample.msi that accompanies this article.

Contents

Information, Information Everywhere! Now, If Only We Could Find It!
WinFS OPath Filtering
Rich Application Views
Summary

Information, Information Everywhere! Now, If Only We Could Find It!

Today, we live in an information-rich world. Our information is stored in many different formats and created and consumed by many different applications. In the end, information is a useful commodity that we use to make better decisions, both in our professional and personal lives.

However, what defines the nebulous term information? Information, concisely defined, is data with semantics. The structure of the data and relationships among the data can provide more insight than the data itself. Knowing that a document is related to an author who is a contact in your system and that contact is part of an organization and in that organization that contact is your boss is very useful relationship information! While much of the structure of the data can be expressed in records and tables, relationships among the data are not expressed well in most database or file systems. Explicitly indicating and persisting relationships among the data and general structure of the data, as you can do with "WinFS," enables you to attach additional semantics to your data.

WinFS types, including relationship types, are extensible, so that you can extend the system to describe information in your organization.

However, all the data in the world, even if relationships are indicated, is not very useful if you can't find and retrieve the information. When you search information stored in WinFS, you can search not only the data but also its structure and relationships. WinFS introduces a query language, called WinFS OPath, for searching the information stored in WinFS. WinFS OPath combines the best of the SQL language with the best of XML-style languages and the best of CLR programming.

WinFS OPath is part of the WinFS API. Through the API, you can leverage the power of WinFS OPath through a set of sequential operations: bind, find, use, and close. Binding is the act of binding to a context, which scopes the data to search over. WinFS OPath searches only information contained in the WinFS store, so make sure that you save the information you want to search into WinFS or replicate it into WinFS. After scoping the search, the next step is to specify filters that narrow the results to only those you are interested in retrieving. Next, you can use your data by displaying it, modifying it, or deleting it. Don't forget to save any changes you made to objects stored in WinFS if you want the changes persisted. Finally, ensure that you close your connection to the WinFS store. The rest of this article steps you through using WinFS OPath to find information in WinFS through the use of filters, both simple and complex, and introduces some new WinFS technologies never discussed before.

WinFS OPath Filtering

You can retrieve objects that meet certain criteria from the WinFS store, which might store hundreds of thousands or more heterogeneous objects, by using WinFS OPath filter expressions. Conceptually, WinFS OPath filtering is straightforward. This article provides a conceptual description with some examples.

WinFS OPath expressions are evaluated relative to a context. The context is represented by the ItemContext object, and defines a set of objects to search against. For example, ItemContext.Open() defines the set of objects to search against as the entire WinFS store. The Open method can also be used to specify just a subset of objects to search against. The following lines of code specify the search context as a folder named GoldenGate with a UNC path of \\<MachineName>\DefaultStore\GoldenGate.

string computerName = Environment.MachineName;
ItemContext ctx = ItemContext.Open(
  string.Format(@"\\{0}\DefaultStore\GoldenGate", computerName));

A WinFS OPath filter is a Boolean expression that is evaluated for each candidate object in the context. After you specify where to search using an ItemContext object, you need to indicate what you want to search for. For example, to search for objects of the Document type you could use the following code:

ItemSearcher searcher = ctx.GetSearcher(typeof(Document));
Document doc = searcher.FindOne() as Document;

While this simple code might be adequate in some simple scenarios, in most scenarios you will probably want to filter using properties or methods of a type. For example, you can use the Title property of the Document type to find all the Document objects in the context with a specific title. This is where WinFS OPath filter expressions become important. For example, assume that the GoldenGate folder contains a document with the title "Legacy File Systems". A filter might be something like this: Title = 'Legacy File Systems'. For every document in the GoldenGate folder, WinFS checks to see if the title is "Legacy File Systems". Only the objects for which the filter expression evaluates to true will be returned as matches of the filter.
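The evaluation model described above can be sketched without WinFS at all. The following is a minimal in-memory analogy in plain C# with LINQ, not the WinFS API: the DocumentStub type and the FilterSketch helper are hypothetical, the folder contents are made-up sample data, and a lambda stands in for the filter text Title = 'Legacy File Systems'.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a WinFS Document; only Title is modeled.
public sealed class DocumentStub
{
    public string Title { get; }
    public DocumentStub(string title) { Title = title; }
}

public static class FilterSketch
{
    // Evaluates the Boolean predicate once per candidate and keeps
    // only the candidates for which it returns true.
    public static List<DocumentStub> Apply(
        IEnumerable<DocumentStub> candidates,
        Func<DocumentStub, bool> filter)
    {
        return candidates.Where(filter).ToList();
    }

    public static void Main()
    {
        // Hypothetical contents of the GoldenGate folder.
        var goldenGate = new List<DocumentStub>
        {
            new DocumentStub("Legacy File Systems"),
            new DocumentStub("Where did you save it?"),
            new DocumentStub("Legacy File Systems"),
        };

        // In-memory analogue of: Title = 'Legacy File Systems'
        List<DocumentStub> matches =
            Apply(goldenGate, d => d.Title == "Legacy File Systems");

        Console.WriteLine("Matched {0} of {1} documents.",
            matches.Count, goldenGate.Count);
    }
}
```

The difference in WinFS is that the store evaluates the predicate, so objects that do not match never have to be materialized in your application.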

WinFS OPath filter expressions enable you to limit the objects retrieved from the WinFS store based on their properties and methods, and to retrieve related objects stored in WinFS. The syntax for the WinFS OPath filter is an especially exciting topic to many architects at Microsoft, because it is an opportunity to rewrite Transact-SQL from the ground up as an object query language. Given this, the syntax is still developing and is actually getting easier and more powerful. The current WinFS OPath syntax includes operators, keywords, and functions, and offers functionality similar to that of Transact-SQL when viewed holistically with other parts of the WinFS API.

The current WinFS OPath operators include most of the C# operators and a few special operators well suited to filtering. The operators matching C# operators are the additive, multiplicative, conditional, equality, relational, and logical operators. These common operators are found in many other languages as well, including C++ and Visual Basic. They enable you to add, multiply, use "OR", determine if operands are equal, determine if an operand is greater than another, and perform bitwise operations. The following is a code example of a simple filter using some of these basic operators:

 "Title = 'Legacy File Systems' || Title != 'Where did you save it?'"

If you don't like the C# style, you could instead use this equivalent filter, which is more like English:

 "Title Equal 'Legacy File Systems' Or Title Not Equal 'Where did you save it?'"

Some of the WinFS OPath operators are uncommon in other popular programming languages, but useful in filtering. One of these is the Like operator, which also exists in Transact-SQL. The following example demonstrates the use of the Like operator with wildcard syntax:

string wildFilter1 = "Title Like 'Title 1%'";

If you supply the Document type as a parameter to the ItemSearcher, this code will search for Document objects that have a title beginning with "Title 1", such as "Title 10", "Title 1000", and so on. Another WinFS OPath operator that is uncommon in procedural languages such as C and Visual Basic is the Exists operator. The Exists operator determines whether a property that is a collection, also called a multiset property, contains something. For example, the OutDocumentAttachmentRelationships property is a collection of DocumentAttachment relationships that you could use to add an attachment to a Document instance. The following code assumes that the variables doc and attachment represent two separate Document objects:

// Add the attachment.           
doc.OutDocumentAttachmentRelationships.AddAttachment(attachment, 
"Attachment");

So, to find the DocumentAttachment with a Name property value of "Attachment", you could use the following filter:

string filter_Exists =
    "Exists(OutDocumentAttachmentRelationships[Name = 'Attachment'])";

The Exists operator is quite helpful, because it saves you from having to find all the DocumentAttachment objects that are members of the OutDocumentAttachmentRelationships collection of a document, iterate through this set, check for null, and filter out the ones that do not have a Name property equal to "Attachment". Doing that by hand would be less productive than writing the above WinFS OPath filter, and would likely perform much worse than letting WinFS do the work as part of the filter processing.
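For comparison, the hand-written approach that Exists replaces might look like the following sketch. It is plain C# over in-memory collections, not the WinFS API: AttachmentLink, DocWithLinks, and ExistsSketch are hypothetical stand-ins, and the Attachments list plays the role of the OutDocumentAttachmentRelationships collection.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one DocumentAttachment relationship.
public sealed class AttachmentLink
{
    public string Name { get; }
    public AttachmentLink(string name) { Name = name; }
}

// Hypothetical stand-in for a Document with a multiset property.
public sealed class DocWithLinks
{
    public string Title { get; }
    public List<AttachmentLink> Attachments { get; }

    public DocWithLinks(string title, List<AttachmentLink> attachments)
    {
        Title = title;
        Attachments = attachments;
    }
}

public static class ExistsSketch
{
    // Manual equivalent of Exists(Attachments[Name = name]):
    // iterate the collection, skip nulls, and stop at the first match.
    public static bool HasAttachmentNamed(DocWithLinks doc, string name)
    {
        if (doc.Attachments == null)
        {
            return false;
        }
        foreach (AttachmentLink link in doc.Attachments)
        {
            if (link != null && link.Name == name)
            {
                return true;
            }
        }
        return false;
    }

    public static void Main()
    {
        var report = new DocWithLinks("Report",
            new List<AttachmentLink> { new AttachmentLink("Attachment") });
        var memo = new DocWithLinks("Memo", new List<AttachmentLink>());

        Console.WriteLine(HasAttachmentNamed(report, "Attachment"));
        Console.WriteLine(HasAttachmentNamed(memo, "Attachment"));
    }
}
```

Every line of that loop is work the Exists filter expresses in a single bracketed condition.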

In addition to the common keywords and operators and some special ones, WinFS OPath also includes some handy functions that you can use as part of your filter. These include string functions, such as Substring; date functions, such as AddMonths; aggregate functions, such as avg; and last but not least, mathematical functions, such as System.Math.Acos. The complete OPath syntax is available in the Longhorn SDK.

All of this might sound pretty simple, but in practice, WinFS OPath filters can be quite a challenge for many programmers. We can imagine weekly WinFS OPath puzzles that would take us most of the day to solve. Because of this, Microsoft architects are exploring alternate syntax to simplify the more complex filters. Personally, we think that once you understand the concepts, the syntax is by and large irrelevant, but we recognize that an elegant syntax can make learning the concepts easier and using them more productive. To help you explore the concepts, we have included a couple of real-world (that is, fairly complex) WinFS OPath filter expressions:

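// Using a bitwise filter always seems to stump some people.
// Note: assuming & is the C#-style bitwise AND, (EditingTime & 4) can
// only evaluate to 0 or 4, never 1, so as written this comparison holds
// for every integer EditingTime value. To test the bit, compare with 0.
string filterBitwise = "(EditingTime&4)!=1";
int countBitwise = Count<Document>(filterBitwise);
Console.WriteLine("Found {0} document(s) matching a bitwise filter.",
    countBitwise);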

In this example, we wrapped the actual searching in a handy method called Count that returns the number of objects in the specified context that meet the filter criteria. Because programmers are an inherently curious bunch, here is our Count method:

static int Count<T>(string filter)
{
  using (ItemContext ctx = ItemContext.Open())
  {
    using (FindResult results = ctx.FindAll(typeof(T), filter))
    {
      int count = 0;
      foreach (T item in results)
      {
        Debug.Assert(item != null, "Item was null.");
        ++count;
      }
      return count;
    }
  }
}

Note that the WinFS API includes several other ways to determine the number of objects in the store. Perhaps the most obvious is the GetCountFromStore instance method of the ItemSearcher type.
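Returning to the bitwise filter shown earlier: part of why such filters stump people is that masking with 4 can only yield 0 or 4. The following sketch is plain C#, not the WinFS API. It assumes that the OPath & operator behaves like the C# bitwise AND, and it uses hypothetical sample values rather than real EditingTime data. It evaluates both the comparison used in the filter text and the more common "is the bit set" test.

```csharp
using System;

public static class BitwiseSketch
{
    // Mirrors the filter text "(EditingTime&4)!=1".
    public static bool FilterAsWritten(long editingTime)
    {
        return (editingTime & 4) != 1;
    }

    // Common way to test whether the bit with value 4 is set.
    public static bool BitIsSet(long editingTime)
    {
        return (editingTime & 4) != 0;
    }

    public static void Main()
    {
        // Hypothetical sample values, chosen to cover both cases.
        long[] samples = { 0, 1, 4, 5, 6, 8 };
        foreach (long value in samples)
        {
            Console.WriteLine(
                "{0}: masked={1}, (x&4)!=1 is {2}, (x&4)!=0 is {3}",
                value, value & 4, FilterAsWritten(value), BitIsSet(value));
        }
    }
}
```

Because the masked value is never 1, the first comparison holds for every sample, while the second separates the values that have the bit set (4, 5, and 6) from those that do not.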

The following is a somewhat more complex filter that uses the WinFS OPath alias syntax to find all folders whose direct parent folder has the same name:

// This filter finds folders that have the same name as the folder they
// are nested inside, such as:
//   \\seangrimaldi 02\DefaultStore\Snippet Folder\Snippet Folder
// It uses the OPath alias syntax.
filter =
   "Exists(##folder.InFolderMemberRelationships.Folder[DisplayName" + 
   " = #folder.DisplayName])";
count = Count<Folder>(filter);
Console.WriteLine("Found {0} folder(s) matching the filter using Alias.", count);

Rich Application Views

As much fun as the challenge of WinFS OPath filters might be right now for strong programmers, there are some cases in which even complex filters do not enable you to perform effective queries, require that you perform multiple queries, or simply do not perform as well as you might wish. In these cases, you can retrieve information from the WinFS store that meets certain criteria by using a related WinFS technology called Rich Application View (RAV). RAV is a new and as yet undocumented WinFS feature. RAV is one of the technologies that Windows Explorer will use to display information stored in WinFS.

For example, you might want to retrieve the first and last names of all persons, their households, and their incomes. Perhaps you want to bind this data to a particular UI, where the user may want to group people by household and get a combined income for each household.

If this seems a bit odd to think about abstractly, Figure 1 shows a screenshot of Outlook grouping e-mail by the From field, then by Subject, then by Received, while also sorting the e-mails by each of these properties.

Figure 1. Outlook grouping and sorting by From, Subject, and Received properties
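To make the household scenario concrete, here is a minimal sketch of the client-side grouping and aggregation that an application would otherwise have to write by hand. It uses plain C# with LINQ over in-memory rows, not the RAV or WinFS API; the PersonRow type, its properties, the GroupingSketch helper, and the sample data are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flat row holding only the properties the UI needs.
public sealed class PersonRow
{
    public string FirstName { get; }
    public string LastName { get; }
    public string Household { get; }
    public decimal Income { get; }

    public PersonRow(string firstName, string lastName,
                     string household, decimal income)
    {
        FirstName = firstName;
        LastName = lastName;
        Household = household;
        Income = income;
    }
}

public static class GroupingSketch
{
    // Groups people by household and sums the income in each group.
    public static Dictionary<string, decimal> IncomeByHousehold(
        IEnumerable<PersonRow> people)
    {
        return people
            .GroupBy(p => p.Household)
            .ToDictionary(g => g.Key, g => g.Sum(p => p.Income));
    }

    public static void Main()
    {
        // Made-up sample data for illustration only.
        var people = new List<PersonRow>
        {
            new PersonRow("Ana", "Lopez", "Lopez", 52000m),
            new PersonRow("Luis", "Lopez", "Lopez", 48000m),
            new PersonRow("Kim", "Park", "Park", 61000m),
        };

        foreach (KeyValuePair<string, decimal> entry
                 in IncomeByHousehold(people))
        {
            Console.WriteLine("{0}: {1}", entry.Key, entry.Value);
        }
    }
}
```

Note that this sketch still requires every row to be loaded into memory before it can be grouped, which is the overhead the scenario is trying to avoid.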

In scenarios like this, the data could be retrieved using the complex OPath filters demonstrated above, but this would require multiple queries, as well as custom data structures to perform the grouping, aggregation, and data binding. If the data set were large, the overhead of having to retrieve entire items when only a couple of properties were needed could cause a performance problem. In comparison, RAV provides a pleasant way to handle such scenarios.

Before attempting to run the example code that accompanies this column, make sure that you have Document objects stored in your local WinFS store that meet these search criteria, such as having a non-null value for the Title property. If you do not already have documents stored in WinFS that meet these criteria, you could add documents in several ways. Perhaps the easiest way is just to drag and drop Word documents that have a value for the title metadata into the WinFS DefaultStore share. This share is by default located at \\computername\DefaultStore; for example: \\sgrimaldi03\DefaultStore.

// Open, simple query, and close with RAV.
using (ItemContext ctx =
    ItemContext.Open("\\\\sgrimaldi02\\DefaultStore\\Snippet Folder"))
{
  try
  {
    ViewDefinition def = new ViewDefinition();
    // Retrieve only Title and ShellPath, rather than the
    // whole Document object.
    def.Fields.Add("Title");
    def.Fields.Add("ShellPath");

    ItemSearcher searcher = Document.GetSearcher(ctx);
    ApplicationView view = searcher.CreateView(def);
    VirtualViewRecordCollection coll = new VirtualViewRecordCollection(view);

    string title = null;
    string path = null;

    foreach (ViewRecord record in coll)
    {
      title = record["Title"] as string;
      path = record["ShellPath"] as string;

      Console.WriteLine("Title:{0}\t\tPath:{1}", title, path);
    }
  }
  catch (Exception ex)
  {
    // Do necessary steps here.
    Console.WriteLine(ex.Message);
  }
}

This example does not retrieve the whole Document object stored in WinFS, but does retrieve the Title and ShellPath property values of the Document objects. This is an example of using RAV to retrieve information from the WinFS store, but it doesn't use WinFS OPath.

Because RAV is so powerful, it would take hundreds of pages to explain everything about RAV. Maybe that will have to wait a few months. All you need to know for now is that you can retrieve information from WinFS using OPath filters and using other related technologies, like RAV.

Summary

This article covers filtering with WinFS OPath and introduces a related filtering technology named RAV. It describes how you can retrieve objects that meet certain criteria from the WinFS store by using WinFS OPath filter expressions. It also introduces RAV, which in addition to being very powerful, has never been written about before.

Although RAV is currently still an undocumented feature, you can read more about WinFS OPath here:

If you are interested in data with semantics, you might want to read the 15-year-old, impeccable book on this topic: Data With Semantics: Data Models and Data Management, ISBN 0442318383.

The next column in this series will cover using Win32 with WinFS. Although WinFS has a new programming model, Microsoft architects have not overlooked Win32-based applications. In fact, they have extended Win32 to support WinFS. Stay tuned as the WinFS adventure continues!

 

The WinFS Files

Sean Grimaldi is currently most interested in object relational mapping. Sean has worked on WinFS since September 2002. He is a prolific reader and writer, and is currently writing a book on programming Longhorn in his spare time. Outside of programming and writing, Sean likes bicycling and was there to cheer Lance on at L'Alpe d'Huez. You can reach Sean at sgrimald@microsoft.com.

Thomas Rizzo is a director in the Microsoft SQL Server group. In his spare time, Tom writes books on programming for Microsoft Press, helps customers on the Microsoft newsgroups, and occasionally updates his blog (which he should do more often!). You can reach Tom at thomriz@microsoft.com.