The Importance of Metadata: Reification, Categorization, and UDDI

 

Karsten Januszewski
Microsoft Corporation

September 2002

Applies to:
   Microsoft Windows Server 2003 RC1
   UDDI Business Registry
   XML Web services

Summary: Look behind the UDDI metadata structure to see how to best employ it within a UDDI registry, both in the UDDI Business Registry (UBR) and in UDDI Services of Microsoft Windows Server 2003; see how to create custom categorization schemes that allow users to solve particular problems in description and discovery. (16 printed pages)

Contents

Overview
Why Categorization Matters
A Close Reading of the UDDI Specification
How Categorization Works in the UDDI Business Registry (UBR)
How Categorization Works in UDDI Services of Windows Server 2003
Conclusion

Overview

Categorization is arguably the most important feature of Universal Description, Discovery and Integration (UDDI), yet it is the least understood. The ability to attribute metadata to services registered in UDDI and then run queries based on that metadata is absolutely central to the purpose of UDDI at both design time and run time. This article will explain the thinking behind the UDDI metadata structure and then demonstrate how to best employ that metadata structure within a UDDI registry, both in the UDDI Business Registry (UBR) and in UDDI Services of Windows Server 2003. It explains how to create custom categorization schemes that allow users to solve particular problems in description and discovery.

Why Categorization Matters

UDDI's raison d'être is description and discovery and, as such, the ability to perform searches and queries based on properties and attributes is critical. If data cannot be found or understood, that data is functionally non-existent or, worse, misleading. Data is worthless if lost within a mass of other data. Even if an entity is discovered, if the user cannot determine the context of the entity (how it is supported, who owns it, what it does, and so on), the user cannot effectively interact with it. In other words, being able to distinguish and differentiate data is as important as being able to find data. As a UDDI registry expands, this ability becomes increasingly important; without it, the data registered in UDDI risks being lost in a morass of information.

The Reification of Data

The problem of discovering and differentiating data is certainly not new in the software community. It is often discussed as the reification of data or, in other words, the "making real" of data. In order for data to be meaningful, the context of that data must be understood. As such, the reification of data is central to the UDDI mission to provide a service discovery infrastructure. Providing the ability to structure and model data and metadata to solve this problem is at the heart of the UDDI design, so as to prevent data from being undiscoverable or misunderstood. Perhaps most importantly, Web services clients can make decisions about the Web services they bind to at run time based on metadata attributed to those services. Using UDDI metadata brings a dynamic configuration methodology to Web services software architecture.

UDDI provides typed metadata through several means:

First, three of the four central entities in UDDI (providers, services and tModels) can be adorned with what might be thought of as property bags: collections of typed name/value pairs that describe that given entity. Each of the properties in the bag comes from a known classification system. For example, a service might be adorned with a bag which states that the service is (1) located in the United States, (2) available for use on a 24/7 basis and (3) a financial service. The property that states this service is in the United States comes from a geography classification system, whereas the property about the service-level agreement comes from a Quality of Service classification system with an entirely different set of values. Adorning UDDI entities with these property bags provides entities with the critical metadata and context that can be used to discover and consume them.

The corollary to adorning an entity with properties is the ability to search for that entity based on those properties. The UDDI API was designed to support a complex range of queries based on metadata ascribed to these bags. Queries are written that look for properties based on the classification scheme they are associated with. In other words, in writing a query "to find services in the United States", one must provide not only the appropriate value that represents the United States but also the classification scheme from which that value originates. In this way, queries can be written that have contextual intelligence about the properties being searched for.

Other features make the UDDI query engine able to handle a range of scenarios. For example, queries can do an exact match of all the properties in a bag or can match just one property in a bag. Or, a query can search across bags contained in both providers and services. The querying capacity in the UDDI API provides a great deal of flexibility in terms of writing focused, precise queries.

Through these two parallel facilities—adorning properties to entities and searching for entities based on well-known properties—UDDI entities are reified. Below, the article will delve into exactly how to accomplish this. But first, a discussion of the value of multiple categorization systems is in order.

The More the Merrier: The Use of Multiple Classification Systems

What makes UDDI particularly powerful is that there are no limits on the number of different classification schemes that can be created. One can create an unbounded number of classification schemes and an unbounded number of values for that scheme. The context for that scheme is completely determined by the person who creates it. One is not forced to pick from a set of predefined systems, but rather one has complete autonomy and agency in defining the semantic meaning of multiple metadata schemes.

For example, there could be three different Quality of Service classification schemes that have different values and meanings. Or, there could be multiple geographic schemes: one for where the service is physically located, one for the areas that service covers, etc.

The power of being able to create and employ multiple classification schemes is not to be underestimated. The ability to overlay and search for multiple properties from classification schemes across a single piece of data provides a huge benefit in accurately representing an entity in all its multifaceted complexity. One has the capability to select from a range of different classification schemes, combining their values in a very large number of ways. This confluence of classification schemes thus empowers one to fully model and represent entities. While each individual property is important, the unique combination of these properties is what makes the profile of an entity meaningful.

For example, a user might be interested in finding a service that is (1) available 24/7, (2) located in the US and (3) for performing a financial service. Matching on just one of these properties is not as powerful as being able to match on the combination of properties.

A Close Reading of the UDDI Specification

In order to explain exactly how categorization works, it is important to take a close look at the UDDI specification itself and the XML representations of UDDI entities. Without drilling into the details of the data structures and API structures, the ability to take advantage of the UDDI typed metadata system is compromised. As such, the next section of the paper will spend time looking at exactly how the UDDI Specification handles categorization.

It's All In The Bag

The first order of business is to take a close look at how these "properties" are added to the "bag" discussed earlier. In UDDI, these bags or collections are known as categoryBag elements and the properties are known as keyedReference elements. The best way to understand this is to look at an example UDDI entry. The sample below is the businessEntity of a company that has categorized itself as a software provider located in California:

(1)<?xml version="1.0" encoding="utf-8"?>
(2)<businessEntity businessKey="…" xmlns="urn:uddi-org:api_v2">
(3) <name>Company</name>
(4) <categoryBag>
(5)  <keyedReference 
(6)      tModelKey="uuid:c0b9fe13-179f-413d-8a5b-5004db8e5bb2"
(7)      keyName="NAICS: Software Publisher" 
(8)      keyValue="51121" />
(9)   <keyedReference 
(10)      tModelKey="uuid:4e49a8d6-d5a2-4fc2-93a0-0411d8d19e88"
(11)      keyName="California"
(12)      keyValue="US-CA" />
(13) </categoryBag>
(14)</businessEntity>

There are several things to take note of here. First, the categoryBag element in (4) can contain an unbounded number of keyedReference elements. Second, note how there are two keyedReference elements, (5) and (9), each with a different tModelKey. The tModelKey attribute, (6) and (10), in each keyedReference contains a GUID. This tModelKey is the identifier of a unique tModel that exists in the UDDI registry. TModels are constructs in UDDI used to represent concepts or other constructs. In database terms, one might think of the tModelKey as a foreign key that refers to another table. In this case, the tModelKey GUID identifies a unique classification scheme.

An immediate question one might have is, "How do you know what classification system is represented by a given tModelKey?" The answer is to use a UDDI API call to look up the tModel based on its tModelKey. To return to the database analogy, this is the equivalent of tracking down the table from which the foreign key derived. In UDDI, the appropriate API call to use would be get_tModelDetail in the body of a SOAP message as follows:

<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
   <Body>
      <get_tModelDetail generic="2.0" xmlns="urn:uddi-org:api_v2">
         <tModelKey>uuid:c0b9fe13-179f-413d-8a5b-5004db8e5bb2</tModelKey>
         <tModelKey>uuid:4e49a8d6-d5a2-4fc2-93a0-0411d8d19e88</tModelKey>
      </get_tModelDetail>
   </Body>
</Envelope>

Note how each GUID is passed as the value of a tModelKey element. This API call returns the tModelDetail for these two classification schemes. By issuing such a call, one would discover that in (6), the tModelKey refers to the North American Industry Classification System (NAICS), while in (10), the tModelKey refers to the ISO 3166 Geographic Taxonomy. Other important human-readable information can be discovered, such as how the classification scheme is used and where one can go to find out more about using it. The tModelKey in each keyedReference is absolutely essential to providing context for what that keyedReference represents.

Returning to our sample categoryBag, take a look at the keyValue attribute in (8). The value "51121" is the piece of metadata that is associated with the entity from the NAICS classification scheme. The value is a number that has no obvious meaning at first glance. The only way this value will have meaning is if it can be placed in context with other values and with human-understandable information. Also note in (7) there is a keyName attribute. This keyName provides some context for what the keyValue in fact means. The keyName attribute is used entirely for human contextual understanding and has no relevance in terms of searching or querying. The only important values in a search or query are the keyValue and the tModelKey.

(One special case where the keyName is relevant is the general_keywords taxonomy; see the UDDI.org Web site for more on this exception to the rule.)

Take a look at the other keyedReference in the sample. In (12), we find the keyValue "US-CA" and in (11) we find the keyName, "California". Again, the important attribute here is the keyValue, because it will be used in all searches. The keyName is only relevant for human understanding.

So, to sum up, metadata is attributed to UDDI entities in keyedReference elements. The keyedReference element contains three attributes: keyValue, keyName and tModelKey. The keyValue contains the searchable property, while the keyName is just for human use. The tModelKey is used to know which categorization scheme that property came from.
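As a rough illustration of this summary, the Python sketch below reads the keyedReference elements out of a businessEntity shaped like the sample above. It uses only the standard library; the function name category_properties is hypothetical and the businessKey is left as a placeholder, so treat it as a sketch rather than a client for any particular registry.

```python
import xml.etree.ElementTree as ET

# Namespace of the UDDI v2 API, as used in the article's samples.
UDDI_NS = {"uddi": "urn:uddi-org:api_v2"}

# Same shape as the sample businessEntity; the businessKey is a placeholder.
SAMPLE_ENTITY = """<businessEntity businessKey="..." xmlns="urn:uddi-org:api_v2">
  <name>Company</name>
  <categoryBag>
    <keyedReference
        tModelKey="uuid:c0b9fe13-179f-413d-8a5b-5004db8e5bb2"
        keyName="NAICS: Software Publisher"
        keyValue="51121" />
    <keyedReference
        tModelKey="uuid:4e49a8d6-d5a2-4fc2-93a0-0411d8d19e88"
        keyName="California"
        keyValue="US-CA" />
  </categoryBag>
</businessEntity>"""


def category_properties(entity_xml):
    """Return (tModelKey, keyValue, keyName) for each keyedReference in the categoryBag."""
    root = ET.fromstring(entity_xml)
    refs = root.findall("uddi:categoryBag/uddi:keyedReference", UDDI_NS)
    return [(r.get("tModelKey"), r.get("keyValue"), r.get("keyName")) for r in refs]


if __name__ == "__main__":
    for tmodel_key, key_value, key_name in category_properties(SAMPLE_ENTITY):
        # tModelKey and keyValue drive searches; keyName is only a human-readable label.
        print(tmodel_key, key_value, key_name)
```

The tuple order puts the two searchable attributes first, mirroring the point that keyName plays no part in matching.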

Match-Making: Using the UDDI API To Discover Data

Moving on, consider now the UDDI API queries that can be written to search for the sample entity presented above. Because the entity is a businessEntity—also known as a provider—we will issue find_business queries to discover this entity. Note that the same techniques discussed around the find_business API apply to queries for services (find_service) and tModels (find_tModel).

First, consider a query for providers in the software industry. One might issue the following query:

<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
   <Body>
      <find_business xmlns="urn:uddi-org:api_v2" generic="2.0">
         <categoryBag>
            <keyedReference 
               keyValue="51121" 
               tModelKey="uuid:c0b9fe13-179f-413d-8a5b-5004db8e5bb2" />
         </categoryBag>
      </find_business>
   </Body>
</Envelope>

The sample provider above would match this query and be returned in the result set because the sample provider has a keyedReference that exactly matches the keyedReference specified in the query.
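A query like this one can also be assembled programmatically. The following Python sketch builds the same find_business envelope with the standard library; the helper name build_find_business is hypothetical, and the envelope namespace is assumed to be the SOAP 1.1 one (http://schemas.xmlsoap.org/soap/envelope/). Sending the message to a registry is deliberately left out.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
UDDI_NS = "urn:uddi-org:api_v2"                        # UDDI v2 API namespace


def build_find_business(references):
    """Build a find_business SOAP envelope.

    references: iterable of (tModelKey, keyValue) pairs, one per keyedReference.
    Returns the envelope serialized as a string.
    """
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    find = ET.SubElement(body, f"{{{UDDI_NS}}}find_business", {"generic": "2.0"})
    bag = ET.SubElement(find, f"{{{UDDI_NS}}}categoryBag")
    for tmodel_key, key_value in references:
        # Only the two attributes that matter for matching are emitted.
        ET.SubElement(
            bag,
            f"{{{UDDI_NS}}}keyedReference",
            {"tModelKey": tmodel_key, "keyValue": key_value},
        )
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    print(build_find_business([("uuid:c0b9fe13-179f-413d-8a5b-5004db8e5bb2", "51121")]))
```

ElementTree generates its own namespace prefixes on output, so the serialized text will not look identical to the hand-written sample, although it is equivalent XML.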

It is important to call out what a user had to know in advance of issuing this query. First, it assumes that the user knows the appropriate keyValue for the software industry ("51121"). Second, it assumes the user knows the tModelKey for the NAICS system. Now, there are many ways a user might have acquired those pieces of information (Web user interface, a priori knowledge, etc.) but it is important to recognize that effective queries cannot be issued without these two critical pieces of information: the keyValue of interest and the tModelKey of the classification scheme.

Now consider a query that will attempt to discover providers that are registered as software publishers and are also located in the state of Oregon:

<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
   <Body>
      <find_business xmlns="urn:uddi-org:api_v2" generic="2.0">
         <categoryBag>
            <keyedReference 
               keyValue="51121" 
               tModelKey="uuid:c0b9fe13-179f-413d-8a5b-5004db8e5bb2" />
            <keyedReference 
               keyValue="US-OR" 
               tModelKey="uuid:4e49a8d6-d5a2-4fc2-93a0-0411d8d19e88" />
         </categoryBag>
      </find_business>
   </Body>
</Envelope>

This query will not return our sample provider because, although the provider matches the NAICS code, it does not match the ISO geographic code. By default, there is a logical AND between all keyedReference elements passed in a categoryBag when matching.

However, this default AND behavior can be overridden by using a findQualifier in the query as follows:

<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
   <Body>
      <find_business xmlns="urn:uddi-org:api_v2" generic="2.0">
         <findQualifiers>
            <findQualifier>orAllKeys</findQualifier>
         </findQualifiers>
         <categoryBag>
            <keyedReference 
               keyValue="51121" 
               tModelKey="uuid:c0b9fe13-179f-413d-8a5b-5004db8e5bb2" />
            <keyedReference 
               keyValue="US-OR" 
               tModelKey="uuid:4e49a8d6-d5a2-4fc2-93a0-0411d8d19e88" />
         </categoryBag>
      </find_business>
   </Body>
</Envelope>

The orAllKeys findQualifier will OR the keyedReference elements, thus returning our provider.
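The difference between the default AND behavior and orAllKeys can be modeled in a few lines of Python. This is only a simulation of the matching rule described here, not registry code; the function name matches and the or_all_keys flag are hypothetical.

```python
NAICS = "uuid:c0b9fe13-179f-413d-8a5b-5004db8e5bb2"
ISO_3166 = "uuid:4e49a8d6-d5a2-4fc2-93a0-0411d8d19e88"


def matches(entity_bag, query_bag, or_all_keys=False):
    """Return True if the entity satisfies the query.

    Both bags are iterables of (tModelKey, keyValue) pairs. A query reference
    counts as a hit only when the entity carries exactly that pair.
    """
    entity = set(entity_bag)
    hits = [ref in entity for ref in query_bag]
    # Default: every reference must hit (AND). With orAllKeys: one hit is enough (OR).
    return any(hits) if or_all_keys else all(hits)


# The sample provider: a software publisher located in California.
sample_provider = [(NAICS, "51121"), (ISO_3166, "US-CA")]
oregon_query = [(NAICS, "51121"), (ISO_3166, "US-OR")]

if __name__ == "__main__":
    print(matches(sample_provider, oregon_query))                    # AND semantics
    print(matches(sample_provider, oregon_query, or_all_keys=True))  # OR semantics
```

With the sample data, the first call fails on the Oregon reference, while the second succeeds because the NAICS reference alone is enough.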

One question that is often asked concerns the matching behavior of values within the same classification scheme. Will the UDDI registry match on values that are considered parents of a given value? Will the UDDI registry match on children of a given value? For example, if the user knows that the keyValue "US" is the parent value of "US-CA", will there be a match? Consider the following query:

<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
   <Body>
      <find_business xmlns="urn:uddi-org:api_v2" generic="2.0">
         <categoryBag>
            <keyedReference 
               keyValue="US" 
               tModelKey="uuid:4e49a8d6-d5a2-4fc2-93a0-0411d8d19e88" />
         </categoryBag>
      </find_business>
   </Body>
</Envelope>

The above query will not return our sample provider, because keyedReference matching is done only on the specific keyValue passed in the categoryBag of the API call. The UDDI registry provides no wildcarding or intelligence about the relationship between values in a classification scheme.

This has significant ramifications for how publishers model their data. If a publisher wants to be discovered by as many searches as possible, the publisher must classify data not only with the appropriate value from the classification scheme, but also with pertinent parent and child values. For example, a publisher might have two keyedReference elements, one for "US" and one for "US-CA", so that searches for providers in the "US" will match.

Inquirers also must be aware of this matching behavior, as not all publishers will optimize their entities for discovery. As such, if inquirers want to issue the broadest queries possible, they should write queries that use the orAllKeys findQualifier and submit a number of different keyedReference elements. For example, rather than just searching for services in "US-CA", a query might be issued for services in "US", "US-CA" and all the children of "US-CA" so that the broadest set of matches is returned.
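Because the registry does not know how values relate to one another, any broadening has to happen on the client side. The Python sketch below assumes the inquirer already holds a parent map for the scheme (here a tiny hypothetical excerpt) and collects a value together with its ancestors and descendants; each resulting value would then be submitted as its own keyedReference under orAllKeys. The function name broaden is hypothetical, and the map is assumed to be free of cycles.

```python
# Hypothetical excerpt of a geographic scheme: value -> parent value (None for a root).
PARENT_OF = {
    "US": None,
    "US-CA": "US",
    "US-OR": "US",
}


def broaden(value, parent_of):
    """Return value plus all of its ancestors and descendants in the scheme."""
    result = [value]

    # Walk up the hierarchy until a root is reached.
    current = value
    while parent_of.get(current):
        current = parent_of[current]
        result.append(current)

    # Walk down the hierarchy, collecting children level by level.
    frontier = [value]
    while frontier:
        node = frontier.pop()
        children = [key for key, parent in parent_of.items() if parent == node]
        result.extend(children)
        frontier.extend(children)

    return result


if __name__ == "__main__":
    print(broaden("US-CA", PARENT_OF))
    print(broaden("US", PARENT_OF))
```

A publisher could use the same helper in the other direction, to decide which parent values to add alongside the most specific one.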

Creating New Classification Schemes

Publishing and inquiring based on categorization is one-half of classification in UDDI; the other half is creating classification schemes themselves. Creating, advertising and promoting classification schemes in UDDI is an important facet of capitalizing on the capabilities of UDDI. At its simplest, creating a classification scheme is as easy as creating a new tModel in UDDI and then using the tModelKey of the new tModel in a keyedReference. Consider the following save_tModel API:

(1)<?xml version="1.0" encoding="UTF-8"?>
(2)<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
(3) <Body>
(4)   <save_tModel generic="2.0" xmlns="urn:uddi-org:api_v2">
(5)      <authInfo>…</authInfo>
(6)      <tModel tModelKey="">
(7)         <name>New Classification Scheme</name>
(8)         <overviewDoc>
(9)            <overviewURL>
(10)               http://moreinfo
(11)            </overviewURL>
(12)         </overviewDoc>
(13)         <categoryBag>
(14)            <keyedReference 
(15)               tModelKey="uuid:c1acf26d-9672-4404-9d70-39b756e62ab4"
(16)               keyValue="categorization"/>
(17)         </categoryBag>
(18)      </tModel>
(19)   </save_tModel>
(20) </Body>
(21)</Envelope>

Issuing this API will create a new tModel. Taking a closer look at the contents of this tModel will explain what exactly is happening. When saved, the UDDI registry will generate a new key, inserting it in the tModelKey attribute of (6). The name listed in (7) is simply for human usage. The overviewURL in (10) provides a resource for users of this classification scheme. Note how the tModel itself is categorized. Because tModels have many different uses in UDDI, this tModel has been classified itself as a categorization tModel so as to distinguish it from other kinds of tModels. The tModelKey in (15) is a canonical tModelKey for the uddi-org:types taxonomy, a classification scheme for tModels themselves that is specified in the UDDI specification, and the keyValue in (16) also comes from the UDDI specification, which outlines a predefined value set for this taxonomy. (See the UDDI.org Web site for more on the uddi-org:types taxonomy.)

As such, creating a new classification scheme is as simple as saving a tModel. Once it has been saved, it can be used to classify providers, services and tModels as discussed above. However, the story doesn't end with simply saving a tModel. In order to fully understand categorization, a discussion of checked vs. unchecked categorization is in order.

Checkmate: Checked vs. Unchecked Categorization

Categorization schemes in UDDI can be either checked or unchecked. A checked categorization scheme means that the UDDI registry will perform a validation routine on any keyValue associated with that categorization scheme used in any UDDI publish or inquiry API call. If that value is not part of the scheme, the API message will fail with an error message explaining to the user that the value is not valid.

In order to perform this check, the UDDI registry must have knowledge of all the valid values in that taxonomy. For example, consider the NAICS taxonomy discussed above. Assume that it has been saved into a UDDI Registry as a checked taxonomy and a user attempted to save a keyedReference as follows:

   <keyedReference 
      keyValue="xyz" 
      tModelKey="uuid:c0b9fe13-179f-413d-8a5b-5004db8e5bb2" />

This save would fail if the keyValue xyz is not valid according to the NAICS taxonomy. Before committing the data to the registry, the registry would check this value and throw a SOAP fault back to the user, informing the user that the value was illegal within the NAICS taxonomy.

An unchecked categorization scheme works differently: in this case, the UDDI registry does not do any contextual check on the keyValue. Any value can be used as a keyValue, as long as it conforms to the string datatype in XSD and is less than 255 characters, per the UDDI Specification.
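The contrast between the two modes can be sketched as a single validation routine in Python. This models the behavior described above and is not the registry's actual implementation: the names InvalidKeyValue and validate_key_value are hypothetical, and treating 255 as an inclusive maximum length is an assumption of the sketch.

```python
class InvalidKeyValue(ValueError):
    """Raised when a keyValue is rejected (stands in for the SOAP fault)."""


def validate_key_value(key_value, valid_values=None, max_length=255):
    """Validate a keyValue before it is saved.

    valid_values=None models an unchecked scheme: only the length is tested.
    A set of valid_values models a checked scheme: the value must be a member.
    """
    # Length limit applies in both modes (inclusive bound assumed here).
    if len(key_value) > max_length:
        raise InvalidKeyValue(f"keyValue longer than {max_length} characters")
    # Membership test applies only to checked schemes.
    if valid_values is not None and key_value not in valid_values:
        raise InvalidKeyValue(f"{key_value!r} is not a value in this scheme")
    return key_value


if __name__ == "__main__":
    naics_values = {"51121"}  # tiny stand-in for the full NAICS value set
    print(validate_key_value("51121", naics_values))  # accepted by the checked scheme
    print(validate_key_value("xyz"))                  # accepted by an unchecked scheme
    try:
        validate_key_value("xyz", naics_values)
    except InvalidKeyValue as error:
        print("rejected:", error)
```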

There are some interesting consequences of these two approaches to categorization. Checked vs. unchecked classification schemes have distinct pros and cons depending on the situation and use case.

The main benefit of using checked categorization is that the UDDI server knows all the valid values for that taxonomy system. With this knowledge, the UDDI server can prevent "bad" data from being saved in association with that scheme. But beyond simply validating that the value is allowed in that system, a UDDI registry with a checked taxonomy can do some more interesting things. In particular, it can display those values to a user. Most UDDI registries have accompanying Web user interfaces and those user interfaces can take advantage of that value set. It may even display those values to the user in a way that suggests relationships between the values, like parent-child hierarchical relationships.

However, checked taxonomies present two main problems. First, there is the possible difficulty of creating a list of values in advance. Taxonomies are often thorny to compose and are a source of much controversy as conflicts erupt around how a particular taxonomy should in fact be architected. Creating a taxonomy is not necessarily a trivial task. The second issue with checked taxonomies is one of versioning. Taxonomies have a tendency to evolve, mutate and morph, meaning that the list of valid values likely will change over time. Managing this process can raise some issues. How does one handle entities that have been categorized with a value no longer relevant for the taxonomy? How are new values introduced into the taxonomy?

The main benefit of using unchecked categorization is that there is no constraint on what is placed in the keyValue. Issues around versioning and validation are thus now outside of the domain of the UDDI Registry and can be handled elsewhere. Another benefit is that the metadata placed in the keyValue can now vary widely. For example, imagine a category system that simply provided a keyValue that was a timestamp or a GUID. The keyValue would not be limited to a value set determined in advance.

The downside of unchecked categorization is that the UDDI Server may not be able to display to the user values that are appropriate for that category system. Another downside is the possibility that "bad" data will enter the registry.

Depending on the situation, one must consider carefully whether to use checked or unchecked categorization. Another major factor to consider between these two options is whether the UDDI Registry in question is the UBR or UDDI Services in Windows Server 2003. How the policies and procedures work in these two different environments will be addressed next.

How Categorization Works in the UDDI Business Registry (UBR)

In the UBR, there are currently four checked taxonomies as identified in the UDDI Specification: the uddi-org types taxonomy, the NAICS taxonomy, the ISO 3166 taxonomy and the UNSPSC taxonomy. More on these checked taxonomies can be found on the UDDI.org Web site. At this time, no other checked taxonomies exist in the UBR. To create a checked taxonomy in the UBR, one must contact the UBR Operator's Council and initiate a discussion with the operators of the different nodes in the UBR. As such, one should think carefully before considering the introduction of a checked taxonomy into the UBR.

At the Microsoft node of the UBR, there are additional category systems that can be browsed, such as the Standard Industrial Classification (SIC), the Microsoft GeoWeb taxonomy and the Microsoft® Visual Studio® .NET Web Service Search Categorization. Because these taxonomies appear alongside the checked taxonomies discussed above, one may be led to believe that these taxonomies are also checked. However, this is not the case: these taxonomies are unchecked, but are available for browsing through the Microsoft user interface as a value-added service to users of the Microsoft UI.

If one wanted to create an unchecked categorization in the UBR, one can certainly do so by simply saving a tModel as discussed above and advertising its usage to a broader community. In fact, many people have already done so. Issuing the following query to UDDI will return all the various classification schemes registered in the UBR:

<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
   <Body>
      <find_tModel generic="2.0" xmlns="urn:uddi-org:api_v2">
         <categoryBag>
            <keyedReference 
               tModelKey="uuid:c1acf26d-9672-4404-9d70-39b756e62ab4"
               keyValue="categorization"/>
         </categoryBag>
      </find_tModel>
   </Body>
</Envelope>

Presumably, the overviewURLs of the various tModels returned by this query would lead a user to more information about how to use these taxonomies, what values are appropriate, etc. Again, it is important to note that it is the responsibility of the person who created these tModels to advertise and publicize how to use these categorization schemes.

How Categorization Works in UDDI Services of Windows Server 2003

The procedures and policies that are enforced in the UBR are quite different from the facilities and possibilities of UDDI Services running under Windows Server 2003. While the mechanism for creating unchecked categorization schemes is identical to the process of the UBR (simply saving a tModel and advertising its usage), the mechanism for supporting checked taxonomies is enhanced significantly. The opportunity to create checked classification schemes is facilitated by tools and features of UDDI Services in Windows Server 2003, because the ability to create, import, manage and browse classification schemes is a major feature of UDDI Services. Special user interface and API capabilities have been introduced into the product, making categorization a powerful feature.

Creating, Importing and Managing a Categorization Scheme

In order to perform validation on a checked classification scheme, UDDI Services must be aware of all the possible values within that scheme. As such, creating a classification scheme in UDDI Services involves more than just saving a tModel. The additional work of creating all the valid values for that taxonomy is incumbent on the creator, as well as modeling the relationships between those values. In the UDDI Services Resource Kit (available with the RTM of Windows Server 2003), a tool will be available to create and manage taxonomies. However, one can get started without the tool. Because the UDDI specification itself did not specify a way to represent values and their relationships, an extended schema is provided for the purposes of creating checked classification schemes. It can be found at \inetpub\uddi\extensions.xsd in a default installation of UDDI Services from Windows Server 2003 RC1.

Consider the creation of a geographical classification scheme for a university. A fragment of such a scheme might look as follows:

Figure 1. Sample geographical classification scheme

Below is a sample of an XML file based on this schema that represents this scheme. It contains three levels—Campuses, Buildings and Rooms—and looks as follows:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Sample categorization scheme -->
<resources xmlns="urn:uddi-microsoft-com:api_v2_extensions" xmlns:uddi="urn:uddi-org:api_v2">
   <categorizationSchemes>
      <categorizationScheme checked="false">
         <uddi:tModel tModelKey="">
            <uddi:name>Sample Geographic Categorization Scheme</uddi:name>
            <uddi:description xml:lang="en">This categorization scheme is 
                  intended for describing buildings and rooms on a campus.
</uddi:description>
            <uddi:categoryBag>
               <uddi:keyedReference 
tModelKey="uuid:c1acf26d-9672-4404-9d70-39b756e62ab4" keyName="types" keyValue="categorization"/>
            </uddi:categoryBag>
         </uddi:tModel>
         <categoryValue keyValue="0" keyName="Main Campus" isValid="false" parentKeyValue=""/>
         <categoryValue keyValue="1" keyName="Building 1" isValid="false" 
parentKeyValue="0"/>
         <categoryValue keyValue="2" keyName="Building 2" isValid="false" 
parentKeyValue="0"/>
         <categoryValue keyValue="3" keyName="Building 3" isValid="false" 
parentKeyValue="0"/>
         <categoryValue keyValue="1/10" keyName="Room 10" isValid="true" parentKeyValue="1"/>
         <categoryValue keyValue="1/11" keyName="Room 11" isValid="true" parentKeyValue="1"/>
         <categoryValue keyValue="1/12" keyName="Room 12" isValid="true" parentKeyValue="1"/>
      </categorizationScheme>
   </categorizationSchemes>
</resources>

Note how this sample contains semantics from the urn:uddi-org:api_v2 namespace, prefixed with uddi:, used for the tModel portion of the classification scheme. In addition, it contains the urn:uddi-microsoft-com:api_v2_extensions namespace, under which the actual values for the taxonomy are represented. Let's take a closer look at a portion of the categoryValue element, which is used to hold these values:

(1)   <categoryValue 
(2)       keyValue="0" 
(3)      keyName="Main Campus" 
(4)      isValid="false" 
(5)      parentKeyValue=""/>
(6)   <categoryValue 
(7)       keyValue="1" 
(8)      keyName="Building 1" 
(9)       isValid="false" 
(10)      parentKeyValue="0"/>
(11)   <categoryValue 
(12)      keyValue="1/10" 
(13)      keyName="Room 10" 
(14)      isValid="true" 
(15)      parentKeyValue="1"/>

The categoryValue element is similar to the keyedReference in that it shares the attributes keyValue and keyName. In fact, they are semantically equivalent and, for all intents and purposes, the schema could have used the attributes from the urn:uddi-org:api_v2 namespace. The two new attributes are the isValid attribute and the parentKeyValue attribute. The isValid attribute is used to determine if that value can be selected in the taxonomy or if the value is only there for the purposes of browsing.

For example, in (4), the isValid attribute is false because this taxonomy doesn't let the user select Main Campus as a valid value; the taxonomy forces the user to select only Rooms, and not Campuses or Buildings. As such, (14) is set to true, allowing Room 10 to be chosen as a value.

The other new attribute, parentKeyValue, allows a relationship to be established between different keyValues. For example, (15) states that its parentKeyValue is 1, which is the keyValue in (7). In other words, Building 1 is the parent of Room 10. Similarly, the parentKeyValue in (10) is 0, which refers to the keyValue in (2); that is, the parent of Building 1 is Main Campus. Finally, note that Main Campus has an empty parentKeyValue. This signifies that it is a top-level root node in the hierarchy. These relationship values are used by the user interface to display the values of a classification scheme in a hierarchical fashion.
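To make the roles of isValid and parentKeyValue concrete, the Python sketch below reads categoryValue elements from a trimmed copy of the sample scheme (the tModel portion is omitted for brevity), lists the selectable values, and reconstructs the path from a value up to its root. The function names are hypothetical, and this is not how UDDI Services itself processes the file.

```python
import xml.etree.ElementTree as ET

EXT_NS = "urn:uddi-microsoft-com:api_v2_extensions"

# Trimmed copy of the sample scheme: the uddi:tModel element is left out.
SCHEME = """<resources xmlns="urn:uddi-microsoft-com:api_v2_extensions">
  <categorizationSchemes>
    <categorizationScheme checked="false">
      <categoryValue keyValue="0" keyName="Main Campus" isValid="false" parentKeyValue=""/>
      <categoryValue keyValue="1" keyName="Building 1" isValid="false" parentKeyValue="0"/>
      <categoryValue keyValue="1/10" keyName="Room 10" isValid="true" parentKeyValue="1"/>
    </categorizationScheme>
  </categorizationSchemes>
</resources>"""


def load_values(xml_text):
    """Return one attribute dict per categoryValue element."""
    root = ET.fromstring(xml_text)
    return [dict(v.attrib) for v in root.iter(f"{{{EXT_NS}}}categoryValue")]


def selectable(values):
    """keyValues that may actually be chosen (isValid is "true")."""
    return [v["keyValue"] for v in values if v["isValid"] == "true"]


def path_to_root(key_value, values):
    """Follow parentKeyValue links upward and return the names, root first."""
    by_key = {v["keyValue"]: v for v in values}
    names = []
    current = key_value
    while current:  # an empty parentKeyValue marks a root node
        node = by_key[current]
        names.append(node["keyName"])
        current = node.get("parentKeyValue", "")
    return " > ".join(reversed(names))


if __name__ == "__main__":
    values = load_values(SCHEME)
    print(selectable(values))
    print(path_to_root("1/10", values))
```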

Once a scheme is successfully created, it needs to be imported into the instance of UDDI Services. To import a scheme, the user must be assigned to the administrator role of UDDI Services. The import can be done either through the UDDI Services data import feature of the Web user interface or through a tool called Bootstrap.exe, located in the inetpub\uddi\bin directory of a UDDI Services installation. More on importing and administering categorization schemes can be found in the online help of UDDI Services.

A scheme can be imported with a blank tModelKey, or a tModelKey can be supplied by generating a GUID and placing it in the tModelKey attribute. If a blank tModelKey is used, the system will generate a new key for that classification scheme. If a GUID is generated and used, care should be taken that it is unique; a tool such as Guidgen.exe should be used. If updating a categorization scheme that already exists, be sure to use the same tModelKey.
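For readers who prefer to script key generation instead of using Guidgen.exe, a key can be produced with Python's standard uuid module. The sketch below is a substitute technique, not something the product requires; the helper name new_tmodel_key is hypothetical, and the "uuid:" prefix simply follows the form of the keys shown in the samples.

```python
import uuid


def new_tmodel_key():
    """Return a fresh key in the 'uuid:<GUID>' form used by the samples."""
    # uuid4() yields a random GUID, rendered as lowercase hex with hyphens.
    return "uuid:" + str(uuid.uuid4())


if __name__ == "__main__":
    print(new_tmodel_key())
```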

Conclusion

Classification and typed metadata are key to the ability of UDDI to solve the problems of reification of data, both in the enterprise and in the public sphere. Well-architected Web service software applications will employ UDDI as infrastructure, taking advantage of the many possibilities of applying this categorization system to different entities for both design-time and run-time usage.