Chapter 3: Customizing and Extending Microsoft Office SharePoint 2007 Search (Part 2 of 2)

This article is an excerpt from Inside Microsoft Office SharePoint Server 2007 by Patrick Tisseghem, from Microsoft Press (ISBN 9780735623682, copyright Microsoft Press 2007, all rights reserved). No part of this chapter may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, electrostatic, mechanical, photocopying, recording, or otherwise) without the prior written permission of the publisher, except in the case of brief quotations embodied in critical articles or reviews. This excerpt is part 2 of 2.

Previous step: Chapter 3: Customizing and Extending the Microsoft Office SharePoint 2007 Search (Part 1 of 2)

Contents

  • The Search Query Object Model

  • Custom Security Trimming

  • Summary

  • About the Author

The Search Query Object Model

Microsoft Office SharePoint Server 2007 introduces a brand new object model to support developers who need to programmatically prepare and execute a search query. Before describing these new classes, I'll quickly discuss the two ways to express the query. At the end, there is a walkthrough showing how to execute a search via a custom Web Part that can be dropped on SharePoint sites.

Keyword Syntax versus Enterprise SQL Search Query Syntax

A search query can be expressed by building a simple string consisting of keywords or by making use of the powerful, but more complex, Enterprise SQL Search query language that extends the standard SQL-92 and SQL-99 database query syntax and is tuned for compiling full-text search queries.

Keyword Syntax

The keyword syntax is a new query syntax introduced with Microsoft Office SharePoint Server 2007 that greatly reduces the complexity of formulating a search query. Of course, reducing complexity in the syntax also has a catch: search queries using the keyword syntax do not fully leverage the power of the search engine. However, as this is often not required, keyword syntax is quickly becoming a popular query language that targets not only the end user but also the developer who builds custom search experiences for the end user. It is also a query language that is consistently used for executing Office, Windows, and Microsoft Live searches.

The core of this query syntax is, of course, the keyword. A keyword can take one of three forms:

  • A word: one or more characters without spaces or punctuation.

  • A phrase: multiple words separated by spaces and enclosed in quotation marks.

  • A prefix: the first part of a word, matched from the beginning of the word.

There is support for required and excluded terms. To include a required term, you simply use the "+" in front of the keyword. To exclude a term, you use the "-" in front of the keyword. Here is an example of a query trying to find everything that contains the keyword business but excluding the results where business is used in the phrase Business Data Catalog.

business -"Business Data Catalog"
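The rules above (words, quoted phrases, and the "+" and "-" prefixes) can be sketched as a small helper that assembles the query string. The class and method names are hypothetical and not part of the SharePoint object model. The sketch only concatenates text: it does not validate terms or handle quotation marks embedded in a term.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a SharePoint API) that assembles a keyword-syntax
// query string from plain, required, and excluded terms.
public static class KeywordQueryText
{
    // Wraps a term in quotation marks when it contains a space,
    // so that it is treated as a phrase.
    private static string Quote(string term)
    {
        string trimmed = term.Trim();
        return trimmed.IndexOf(' ') >= 0 ? "\"" + trimmed + "\"" : trimmed;
    }

    public static string Build(IEnumerable<string> terms,
                               IEnumerable<string> required,
                               IEnumerable<string> excluded)
    {
        List<string> parts = new List<string>();
        foreach (string term in terms) parts.Add(Quote(term));
        foreach (string term in required) parts.Add("+" + Quote(term));
        foreach (string term in excluded) parts.Add("-" + Quote(term));
        return string.Join(" ", parts.ToArray());
    }
}
```

Calling Build with the term business, no required terms, and the excluded phrase Business Data Catalog yields the query shown above.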

Enterprise SQL Search Query Syntax

This syntax allows for more complex search queries and is therefore also used in more advanced search solutions. The basic skeleton of a SQL search query follows.

SELECT <columns>
FROM <content source>
WHERE <conditions> 
ORDER BY <columns>

The Microsoft Office SharePoint Server 2007 Software Development Kit (SDK) contains a detailed description of the options available for each of the different parts of the query. I'll just be brief here and discuss the most important ones you need for the rest of the chapter.

Within the SELECT part, you specify the columns you want to include in the search results. The FROM part can only contain the SCOPE() statement in Microsoft Office SharePoint Server 2007. The name of the scope is then further defined in the WHERE part as follows.

SELECT title, author, rank FROM scope() WHERE "scope"='All Sites'

Using the scope as a condition in the WHERE part is not the only option. The WHERE part is also the place where you include the columns and the conditions they have to match. All types of predicates can be used, ranging from the traditional ones everybody knows ("=","!=","<",">", …) to very specific ones such as LIKE, DATEADD, NULL, and the full-text predicates FREETEXT and CONTAINS. The CONTAINS predicate performs comparisons on columns that contain text. Based on the proximity of the search terms, matching can be performed on single words or phrases. In comparison, the FREETEXT predicate is tuned to match the meaning of the search phrases against text columns.

An example of a query using all of these options follows.

SELECT URL, Title, Description FROM SCOPE() WHERE "scope"='All Sites' 
   AND FREETEXT('gallery hinges') AND SITE = 'http://supportdesk' 
   AND NOT CONTAINS('brass')

The last part of the SQL statement is the ORDER BY, which gives you the option to sort the query results based on the value of one or more columns specified.
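As a rough illustration of how the four parts fit together, here is a minimal sketch that composes a statement including an ORDER BY clause. The helper and its parameters are hypothetical. It also assumes that a single quote inside a literal is escaped by doubling it, as in standard SQL; check the SDK before relying on that for user input.

```csharp
using System;
using System.Text;

// Hypothetical helper (not a SharePoint API) that composes an Enterprise SQL
// search statement from its SELECT, FROM, WHERE, and ORDER BY parts.
public static class SqlQueryText
{
    // Assumption: single quotes inside a literal are escaped by doubling them.
    private static string Literal(string value)
    {
        return "'" + value.Replace("'", "''") + "'";
    }

    public static string Build(string[] columns, string scope,
                               string freeText, string orderBy)
    {
        StringBuilder sql = new StringBuilder();
        sql.Append("SELECT ").Append(string.Join(", ", columns));
        sql.Append(" FROM SCOPE()");
        sql.Append(" WHERE \"scope\" = ").Append(Literal(scope));
        sql.Append(" AND FREETEXT(").Append(Literal(freeText)).Append(")");
        if (!string.IsNullOrEmpty(orderBy))
        {
            // The ORDER BY part is optional.
            sql.Append(" ORDER BY ").Append(orderBy);
        }
        return sql.ToString();
    }
}
```

For example, Build(new string[] { "Title", "Path" }, "All Sites", "gallery hinges", "Rank DESC") produces a statement that sorts the matches by descending rank.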

The New Search Query Classes

The new managed API to support the execution of an Enterprise search query is encapsulated within the Microsoft.Office.Server.Search.dll and defined within the Microsoft.Office.Server.Search.Query namespace. Two classes are important here. Both inherit from the Query class, an abstract class that is not intended to be used directly in code. The two classes are:

  • KeywordQuery, dedicated to executing a query defined with the keyword syntax

  • FullTextSqlQuery, used when the search query is defined by using the Enterprise SQL search query syntax

Note that both classes use the same flow for the execution of the query. They both expose properties for configuring the query options and for assigning the query that has to be executed. The Execute method executes the query and, for both classes, returns the search results as a set of IDataReader objects. All the members required to support this flow are defined at the level of the Query class. I'll quickly discuss the most important members of this class.

Query Class

Most of the properties at the level of the Query class, summarized in Table 3-2, are there to help you prepare the execution of the search query.

Table 3-2. Query class properties

Property               Description
Culture                The culture to use for stemming, noise words, thesaurus, and more.
EnableStemming         Turns stemming on or off.
IgnoreAllNoiseQuery    Allows or disallows a query consisting only of noise words.
KeywordInclusion       Indicates whether the query results contain all or any of the specified search terms.
QueryText              The most important property, since it is the one used to assign the query string to be executed.
ResultTypes            Defines the result types to be included in the search results.
RowLimit               The maximum number of rows returned in the search results.
StartRow               Gets or sets the first row included in the search results.
TrimDuplicates         Gets or sets a Boolean value that specifies whether duplicate items should be removed from the search results.

The Execute method is used to execute the query assigned to the QueryText property. The results of the query are returned as an object of type ResultTableCollection, a collection of IDataReader objects to be processed one by one.

KeywordQuery and FullTextSqlQuery Classes

Both classes are instantiated in the same manner. Since the objects execute a query against the search service, and the search service is part of the shared services, it is necessary to provide information about the Shared Services Provider you'll want to connect to. The constructor accepts an instance of the ServerContext object, an instance of an SPSite object, or the string of the search application name. The following code illustrates how to start with the KeywordQuery class.

ServerContext context = ServerContext.GetContext("SharedServices1");
KeywordQuery kwq = new KeywordQuery(context);

After this, it is a matter of assigning the values for one or more properties that prepare the query and that the classes inherit from the Query class. One of them, the ResultTypes, is set to indicate what types of result you are interested in. There are a few options here, summarized in Table 3-3.

Table 3-3. ResultType values

Value                    Description
DefinitionResults        Definitions for keywords matching the search query.
HighConfidenceResults    High-confidence results matching the search query.
None                     No result type specified.
RelevantResults          The main search results from the content index matching the search query.
SpecialTermResults       Best bets matching the search query.

The QueryText property is of course a very important one to set. When you are working with a KeywordQuery class, the query is just a string containing the keywords. If you work with the FullTextSqlQuery class, it is a full-blown SQL statement. Here is an example of the first scenario.

kwq.ResultTypes = ResultType.RelevantResults;
kwq.EnableStemming = true;
kwq.TrimDuplicates = true;
kwq.QueryText = "business +isDocument:1";

Next, the Execute method is called, which returns an instance of the ResultTableCollection object.

ResultTableCollection results = kwq.Execute();

You can extract the type of results you'd like to process further. In the following sample, all of the relevant results are retrieved in the form of an individual instance of the ResultTable type.

ResultTable resultTable = results[ResultType.RelevantResults];

The rest of the process is of course dependent on what you want to do with the results. Here is an example that takes the ResultTable object, which implements the IDataReader interface, and loads it into a DataTable that is then bound to a DataGridView.

DataTable tbl = new DataTable();
tbl.Load(resultTable, LoadOption.OverwriteChanges);
dataGridView1.DataSource = tbl;

If you use all of this code in a small Windows-based application, you can let users formulate a search query, execute it, and have the results displayed in a DataGridView. Don't forget to add references to System.Web.dll, Microsoft.Office.Server.dll, Microsoft.Office.Server.Search.dll, and Microsoft.SharePoint.dll.
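The loading step relies only on the IDataReader contract, so it can be tried without a SharePoint server. The following sketch uses an in-memory DataTableReader as a stand-in for the ResultTable; the class name, method names, and sample rows are hypothetical.

```csharp
using System;
using System.Data;

// Sketch of the "IDataReader into DataTable" step, with an in-memory reader
// standing in for the ResultTable returned by the search service.
public static class ReaderLoading
{
    // Copies every row exposed by an IDataReader into a new DataTable.
    public static DataTable ToDataTable(IDataReader reader)
    {
        DataTable table = new DataTable();
        table.Load(reader, LoadOption.OverwriteChanges);
        return table;
    }

    // Builds a small reader with made-up rows (assumption: for demo only).
    public static IDataReader SampleReader()
    {
        DataTable source = new DataTable();
        source.Columns.Add("Title", typeof(string));
        source.Columns.Add("Path", typeof(string));
        source.Rows.Add("Home", "http://moss");
        source.Rows.Add("Docs", "http://moss/docs");
        return source.CreateDataReader();
    }
}
```

The DataTable returned by ToDataTable can be assigned to the DataSource property of a grid in the same way as tbl in the snippet above.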

Walkthrough: Building a Custom Search Web Part

There are many possible opportunities for using the KeywordQuery and FullTextSqlQuery classes in custom Web Parts. Since you've been exposed to the KeywordQuery class in the previous section, I'll focus now on the FullTextSqlQuery class. The walkthrough will describe the steps for building a Web Part that can be dropped on a site home page and show all the documents that have been added to SharePoint sites and indexed within the past week. Instead of building the Web Part the traditional way, you'll encapsulate all of the logic into an ASP.NET 2.0 user control and use SmartPart to make the user control available as a Web Part on a SharePoint page.

Note

SmartPart is a free community tool developed by Jan Tielens, who works for U2U. It is a generic Web Part that you can use to load ASP.NET 2.0 user controls and have them rendered as a Web Part. SmartPart is available for download at SmartPart for SharePoint. The installation is pretty simple and is documented on the site.

You can start by creating an ASP.NET 2.0 Web Site using Visual Studio 2005. In the project, you'll need one additional item on top of the default.aspx, which is already available. The additional item is a Web user control, an ASCX file named ThisWeeksDocuments.ascx. The only control needed is a Label named labelResults. We'll not bother much with the aesthetics.

There are three references that you'll have to add to the project: one for Microsoft.SharePoint.dll, another one for Microsoft.Office.Server.Search.dll, and a third for Microsoft.Office.Server.dll. Continue now by opening the code-behind class for the user control. The following using statements for the namespaces that you'll use in the code have to be added in addition to the ones already there.

using System.Text;
using Microsoft.Office.Server;
using Microsoft.Office.Server.Search.Query;

All of your code will be written in the Page_Load procedure that is already available. First, there is the declaration of a couple of variables that will be needed later. One of them contains the SQL query that will be executed.

string item = "<a href='{0}'>{1}</a> (authored by {2})<br />";
StringBuilder sb = new StringBuilder();

string query = "SELECT url, title, author " +
              "FROM Scope() " +
              "WHERE \"scope\" = 'All Sites' AND " +
              "isDocument=1 AND write > DATEADD(Day, -7, GetGMTDate())";

The process for executing the query has been previously discussed. To start, you need an object of the type FullTextSqlQuery. It is possible that the ASP.NET 2.0 user control is used outside of the SharePoint context, such as on a traditional ASP.NET Web site. Therefore, you cannot assume that you'll have immediate access to the context, and you need to first create an object of type ServerContext and pass this instance as parameter during the construction of the FullTextSqlQuery object. Here is the code for it.

ServerContext context = ServerContext.GetContext("SharedServices1");
FullTextSqlQuery qry = new FullTextSqlQuery(context);

Next, there is the preparation of the query. You have to assign values for a number of properties. The query string is set as the value for the QueryText property.

qry.ResultTypes = ResultType.RelevantResults;
qry.EnableStemming = true;
qry.TrimDuplicates = true;
qry.QueryText = query;

After the preparation step, it is time to execute the query. Call the Execute method and store the results in a variable of type ResultTableCollection as shown in the following code.

ResultTableCollection results = qry.Execute();

Again, you're only interested in the actual search results and therefore you'll isolate these and extract them as follows.

ResultTable resultTable = results[ResultType.RelevantResults];

The rest of the code is all about processing the results. Remember that the ResultTable implements the IDataReader interface. The processing can be done by looping through all of the search results one by one and building up the string to be displayed in the label. Add the following code as the final piece of the user control.

if (resultTable.RowCount == 0)
{
    // Append the message to the StringBuilder; assigning it to the label here
    // would be overwritten by the final assignment below.
    sb.Append("No documents added within the past week have been found");
}
else
{
    while (resultTable.Read())
    {
        sb.AppendFormat(item, resultTable.GetString(0),
              resultTable.GetString(1), resultTable.GetString(2));
    }
}

labelResults.Text = sb.ToString();
qry.Dispose();

It is possible to test out the code without running it in the context of Microsoft Office SharePoint Server 2007. Just drop the user control from the solution explorer on the default.aspx in design mode and browse this page. The result is not really fancy, but you should get the list of all of the documents added within the week to the SharePoint site crawled by the index engine.
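Because titles and author names come from indexed content, it can be worth HTML-encoding them before writing them into the label. The following is a minimal sketch of the rendering loop under that assumption. ResultRenderer is a hypothetical name, and the sketch works against any IDataReader whose first three columns are url, title, and author. It uses System.Net.WebUtility, which is available in .NET Framework 4.0 and later; on the ASP.NET 2.0 setup used in this walkthrough, HttpUtility.HtmlEncode from System.Web plays the same role.

```csharp
using System;
using System.Data;
using System.Net;
using System.Text;

// Hypothetical variant of the rendering loop that HTML-encodes each value
// and tolerates null columns.
public static class ResultRenderer
{
    private static string Field(IDataReader reader, int index)
    {
        string raw = reader.IsDBNull(index) ? string.Empty : reader.GetString(index);
        return WebUtility.HtmlEncode(raw);
    }

    // Expects the columns in the order url, title, author.
    public static string Render(IDataReader reader)
    {
        StringBuilder sb = new StringBuilder();
        while (reader.Read())
        {
            sb.AppendFormat("<a href=\"{0}\">{1}</a> (authored by {2})<br />",
                Field(reader, 0), Field(reader, 1), Field(reader, 2));
        }
        return sb.Length == 0
            ? "No documents added within the past week have been found"
            : sb.ToString();
    }
}
```

In the user control, the return value of Render(resultTable) would be assigned to labelResults.Text.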

Now take the following steps to make this ASP.NET 2.0 user control available as a Web Part on a SharePoint site. First, copy both the .ascx and the code-behind files into a folder called UserControls in the directory that is associated with the IIS Web application that hosts the SharePoint site and where the SmartPart is installed. If the UserControls folder is not there, you'll have to create it. Because you are not deploying the ASP.NET user control as a pre-compiled .NET assembly, you also have to add the following two elements as child elements of the assemblies element in the web.config file of the IIS Web application.

<add assembly="Microsoft.Office.Server, Version=12.0.0.0, 
    Culture=neutral, PublicKeyToken=71E9BCE111E9429C"/>
<add assembly="Microsoft.Office.Server.Search, Version=12.0.0.0, 
    Culture=neutral, PublicKeyToken=71E9BCE111E9429C"/>

Next on the SharePoint site, add the SmartPart on a page. If it is installed properly, you'll see it in the list of available Web Parts when you click the Add a Web Part button. Open the tool pane that shows the properties exposed by the SmartPart. At the top, there is a drop-down list that is populated with all of the .ascx files that the SmartPart has found in the UserControls folder. Select the one that you've copied to the folder and, after applying this setting, the user control will be displayed as a Web Part on the page. Figure 3-16 displays the end result.

Figure 3-16. The ASP.NET user control showing the search results, now in a SharePoint page hosted by SmartPart


The Query Web Service

Talking to the search service remotely is done by consuming the Query Web service that's exposed by Microsoft Office SharePoint Server 2007. When starting a new Visual Studio 2005 project, the first thing to do is to generate a proxy class. You accomplish this by adding a Web reference pointing to the following URL.

http://Server_Name/[sites/][Site_Name/]_vti_bin/search.asmx

The proxy class facilitates the communication with the Web service by encapsulating all of the low-level plumbing such as the required SOAP packaging of your requests.

Table 3-5 lists all of the methods that can be called for the Query Web service.

Table 3-5. Query Web service methods

Method                         Description
GetPortalSearchInfo            Returns the list of all of the search scopes.
GetSearchMetadata              Returns the list of all of the managed properties and the search scopes.
Query(System.String)           Returns a set of results in an XML string for the specified query.
QueryEx(System.String)         Returns a set of results in a Microsoft ADO.NET DataSet object for the specified query.
Registration(System.String)    Returns registration information for the search service used by Microsoft Office clients and Internet Explorer to register the Query web service as a Research service.
Status                         Returns availability of the search service.

The moment you start an application, you can connect to the Web service sending the credentials of the currently logged on user.

searchService = new SearchService.QueryService();
searchService.Credentials = System.Net.CredentialCache.DefaultCredentials;

You might want to verify that the search service is up and running before attempting to execute queries. There is a method called Registration that returns an XML response containing the status information. Here is the call to that method.

string result = searchService.Registration(registrationString);

If all is well, you'll receive a status SUCCESS along with other information regarding the provider that delivers the search results. This information is used by, for example, Microsoft Office smart clients when you register the Query Web service as a Research service.

<ProviderUpdate xmlns="urn:Microsoft.Search.Registration.Response">
  <Status>SUCCESS</Status>
  <Providers>
    <Provider>
     <Id>{86051521-13de-4dac-80d1-231406b51b57}</Id>
     <Name>Microsoft Office SharePoint Server 2007 Search</Name>
     <QueryPath>http://moss/_vti_bin/search.asmx</QueryPath>
     <Type>SOAP</Type>
     <Services>
        <Service>
          <Id>{86051521-13de-4dac-80d1-231406b51b57}</Id>
          <Name>Home</Name>
          <Category>INTRANET_GENERAL</Category>
          <Description>This service allows you to search the site : 
                       Home</Description>
          <Copyright>Microsoft® Office SharePoint® Server 2007 Search</Copyright>
          <Display>On</Display>
        </Service>
      </Services>
    </Provider>
  </Providers>
</ProviderUpdate>

The most important methods are the Query method and the QueryEx method. The input you have to give for both is very similar: it is an XML string starting with the QueryText element containing the query formulated either with the keyword syntax or with the Enterprise SQL Search Query syntax. The format of the XML is described by the Microsoft.Search.Query Schema for Enterprise Search. In the application, you can test a keyword syntax query with both methods. As you'll notice, the QueryText element is nicely wrapped into a QueryPacket element so that it can be interpreted correctly by the Web service. Here is the full XML that is submitted to the Web service when executing a keyword syntax query.

<QueryPacket xmlns="urn:Microsoft.Search.Query" Revision="1000">
  <Query domain="QDomain">
    <SupportedFormats>
      <Format>urn:Microsoft.Search.Response.Document.Document</Format>
    </SupportedFormats>
    <Context>
      <QueryText language="en-US" type="STRING">sharepoint</QueryText>
    </Context>
  </Query>
</QueryPacket>

The XML compiled when you execute a query using the Enterprise SQL Search Query syntax is only different at the level of the value for the type attribute of the QueryText element, and of course the value of the QueryText element itself.

<QueryPacket xmlns="urn:Microsoft.Search.Query" Revision="1000">
  <Query domain="QDomain">
    <SupportedFormats>
      <Format>urn:Microsoft.Search.Response.Document.Document</Format>
    </SupportedFormats>
    <Context>
      <QueryText language="en-US" type="MSSQLFT">
         SELECT Title, Path, Description, Write, Rank, Size FROM SCOPE() WHERE 
         FREETEXT('office')
      </QueryText>
    </Context>
  </Query>
</QueryPacket>
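Since the two packets differ only in the type attribute and the query text, they can be produced by one routine. The sketch below uses XmlWriter so that characters such as < and & in the query text are escaped automatically. QueryPacketBuilder is a hypothetical name; the element names, attributes, and namespace are taken from the packets shown above, and the language is fixed to en-US for brevity.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper that builds the QueryPacket XML shown above.
public static class QueryPacketBuilder
{
    private const string QueryNamespace = "urn:Microsoft.Search.Query";

    // queryType: "STRING" for keyword syntax, "MSSQLFT" for SQL syntax.
    public static string Build(string queryText, string queryType)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;

        StringWriter buffer = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(buffer, settings))
        {
            writer.WriteStartElement("QueryPacket", QueryNamespace);
            writer.WriteAttributeString("Revision", "1000");
            writer.WriteStartElement("Query", QueryNamespace);
            writer.WriteAttributeString("domain", "QDomain");
            writer.WriteStartElement("SupportedFormats", QueryNamespace);
            writer.WriteElementString("Format", QueryNamespace,
                "urn:Microsoft.Search.Response.Document.Document");
            writer.WriteEndElement(); // SupportedFormats
            writer.WriteStartElement("Context", QueryNamespace);
            writer.WriteStartElement("QueryText", QueryNamespace);
            writer.WriteAttributeString("language", "en-US");
            writer.WriteAttributeString("type", queryType);
            writer.WriteString(queryText); // escapes <, >, and & as needed
            writer.WriteEndElement(); // QueryText
            writer.WriteEndElement(); // Context
            writer.WriteEndElement(); // Query
            writer.WriteEndElement(); // QueryPacket
        }
        return buffer.ToString();
    }
}
```

The string returned by Build("sharepoint", "STRING") is the kind of value passed as the query argument to the Query and QueryEx methods.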

The code for executing the queries is as follows.

// Executing with Query method and getting the results in XML format back
string queryResults = searchService.Query(query);

// Executing the query with the QueryEx method and getting an ADO.NET DataSet back
DataSet resultDataset = searchService.QueryEx(query);

With the Query call, you'll receive a ResponsePacket containing the search results in XML format. There is a difference between a query created with the keyword syntax and a query created with a SQL statement. Using the keyword syntax, you do not have the option to specify what columns need to be included in the results. You also do not have the option to make use of the search scopes to limit the results or filter the results based on some kind of condition formulated with a search predicate. Here is the XML that's returned when executing a keyword syntax query.

<ResponsePacket xmlns="urn:Microsoft.Search.Response">
  <Response domain="QDomain">
    <Range>
      <StartAt>1</StartAt>
      <Count>10</Count>
      <TotalAvailable>83</TotalAvailable>
      <Results>
        <Document relevance="893" xmlns="urn:Microsoft.Search.Response.Document">
          <Title>Home</Title>
          <Action>
            <LinkUrl size="0">http://moss</LinkUrl>
          </Action>
          <Description />
          <Date>2006-12-27T16:10:22-08:00</Date>
        </Document>

        <!-- more elements of Document type -->

      </Results>
    </Range>

    <Status>SUCCESS</Status>
  </Response>
</ResponsePacket>
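To pull values out of a packet like this one, the two XML namespaces have to be registered before running XPath queries. The following sketch extracts the title and URL of each Document element. ResponsePacketReader is a hypothetical name, and the XPath expressions assume the structure of the sample above.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper that reads titles and URLs from a ResponsePacket
// returned by the Query method for a keyword-syntax query.
public static class ResponsePacketReader
{
    // Returns one (title, url) pair per Document element.
    public static List<KeyValuePair<string, string>> ReadDocuments(string responseXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(responseXml);

        // The packet and the documents live in different namespaces.
        XmlNamespaceManager ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("r", "urn:Microsoft.Search.Response");
        ns.AddNamespace("d", "urn:Microsoft.Search.Response.Document");

        List<KeyValuePair<string, string>> results =
            new List<KeyValuePair<string, string>>();
        XmlNodeList documents = doc.SelectNodes(
            "/r:ResponsePacket/r:Response/r:Range/r:Results/d:Document", ns);
        foreach (XmlNode document in documents)
        {
            XmlNode title = document.SelectSingleNode("d:Title", ns);
            XmlNode url = document.SelectSingleNode("d:Action/d:LinkUrl", ns);
            results.Add(new KeyValuePair<string, string>(
                title == null ? string.Empty : title.InnerText,
                url == null ? string.Empty : url.InnerText));
        }
        return results;
    }
}
```

Passing the queryResults string from the earlier Query call to ReadDocuments gives a list that can be looped over or bound to a control.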

The QueryEx call returns an ADO.NET DataSet serialized as an XML block. The proxy class immediately wraps this into an ADO.NET DataSet object so that you can bind it, for example, to a GridView control. The ResponsePacket for the QueryEx method has a number of additional properties that are returned, such as the information you'll be able to use for displaying the keywords highlighted in the description. These are described by the Microsoft.Search.Response and the Microsoft.Search.Response.Document schemas. Also, all types of search results are returned, such as the relevant results, keyword matches, and best bets. The XML typically looks like this.

<Results>
  <RelevantResults>
    <WorkId>72</WorkId>
    <Rank>893</Rank>
    <Title>Home</Title>
    <Author>LitwareInc Administrator</Author>
    <Size>0</Size>
    <Path>http://moss</Path>
    <Write>2006-12-27T16:10:22-08:00</Write>
    <SiteName>http://moss</SiteName>
    <CollapsingStatus>0</CollapsingStatus>
    <HitHighlightedSummary>Welcome to Microsoft® 
       &lt;c0&gt;Office&lt;/c0&gt; SharePoint® 
       Server 2007  &lt;ddd/&gt; Get started with the new 
       version of Microsoft 
       &lt;c0&gt;Office&lt;/c0&gt; SharePoint Server 2007:
    </HitHighlightedSummary>
    <HitHighlightedProperties>&lt;HHTitle&gt;Home&lt;/HHTitle&gt;&lt;HHUrl&gt;
        http://moss&lt;/HHUrl&gt;
    </HitHighlightedProperties>
    <ContentClass>STS_Site</ContentClass>
    <IsDocument>0</IsDocument>
  </RelevantResults>
</Results>
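The HitHighlightedSummary value in this sample carries its own small markup: the matched term is wrapped in c0 tags and the fragments are separated by a ddd element. A possible way to turn that into HTML is sketched below. It assumes that the tags follow the pattern c0, c1, and so on (one per query term) and that ddd marks omitted text; neither is stated in this chapter, so treat both as assumptions. HitHighlighting is a hypothetical name, and the input is the summary after XML unescaping.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical converter for the markup found in HitHighlightedSummary.
public static class HitHighlighting
{
    public static string ToHtml(string summary)
    {
        // Assumption: <c0>, <c1>, ... wrap the matched query terms.
        string html = Regex.Replace(summary, @"<c\d+>", "<b>");
        html = Regex.Replace(html, @"</c\d+>", "</b>");
        // Assumption: <ddd/> marks text that was left out of the summary.
        html = Regex.Replace(html, @"<ddd\s*/>", "...");
        return html;
    }
}
```

For the sample above, ToHtml renders each occurrence of Office in bold and replaces the separator with an ellipsis.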

Executing a query using the full SQL syntax allows you to specify exactly what you want to have returned; you also have the option to use search scopes and conditions for filtering the search results. The ResponsePacket for a query executed with the Query method therefore includes—on top of the default elements that are returned—the various properties defined in the SELECT part of the SQL statement. For each of these, there is a Property element containing the name, type, and value of the property. Here is an extract from it.

<ResponsePacket xmlns="urn:Microsoft.Search.Response">
  <Response domain="QDomain">
   <Range>
    <StartAt>1</StartAt>
    <Count>10</Count>
    <TotalAvailable>77</TotalAvailable>
    <Results>
      <Document xmlns="urn:Microsoft.Search.Response.Document">
        <Action>
          <LinkUrl>http://moss</LinkUrl>
        </Action>
        <Properties xmlns="urn:Microsoft.Search.Response.Document.Document">
          <Property>
            <Name>TITLE</Name>
            <Type>String</Type>
            <Value>Home</Value>
          </Property>
          <Property>
            <Name>PATH</Name>
            <Type>String</Type>
            <Value>http://moss</Value>
          </Property>
        </Properties>
      </Document>
    </Results>
   </Range>
   <Status>SUCCESS</Status>
  </Response>
</ResponsePacket>

Executing queries is not the only option you have with the Query Web service. There are two methods for returning metadata information that can be used when creating the query. This one returns the most information.

DataSet results = searchService.GetSearchMetadata();

The GetSearchMetadata method returns an ADO.NET DataSet with two tables in it. One contains the list of search scopes, and the other contains the list of managed properties.

A second method, GetPortalSearchInfo, returns an XML string that only contains the search scopes.

string results = searchService.GetPortalSearchInfo();

Custom Security Trimming

The last topic for this chapter is the creation of custom security trimmers that can act on the search results before they are displayed to the user. By default, a user will not see the search results originating from the SharePoint sites or shared folders for which he or she does not have the appropriate permissions. While crawling, the index engine collects and stores the information from the access control list (ACL) defined for the resource or for the container where the resource is stored. At query time, the search service uses this information to filter the search results.

This is not the case, however, for all of the different types of content sources administrators can create. For example, there is no security trimming for the results coming from normal, non-SharePoint Web sites or for the indexed business data. It might also be a business requirement to do more than what the index engine does by default. In the walkthrough, for example, you'll work out a scenario where administrators can store GUIDs of list instances that they do not want to appear in the search results for a specific user. The list of excluded list instances can be part of the profile of the user. You can think of many other types of specific business rules that might need to be applied before the search results are displayed to the user. That's exactly what a custom security trimmer allows you to do. We'll first have a look at some of the basics and then proceed to the walkthrough.

The ISecurityTrimmer Interface

The ISecurityTrimmer interface is defined in the Microsoft.Office.Server.Search.Query namespace and located in the Microsoft.Office.Server.Search.dll. There are two methods that have to be implemented: the Initialize method and the CheckAccess method.

The Initialize Method

This method is called at the time that the .NET assembly is loaded into the worker process, which means that the code written here is executed only once. It is typically the place where you pick up the values of the configuration properties that are defined when you register the custom security trimmer using the stsadm command-line utility (discussed later). These properties can then be used in the CheckAccess method that is called for the search results. For example, for performance reasons, you might want to limit the number of crawled URLs to be processed by your custom security trimming code. Consider that a search query can return thousands, even millions, of search results. If you apply custom business logic to each of these search results before they get displayed to the user, you risk slowing down the whole process. You might want to let the administrator set a limit on the number of URLs that need to be checked and then use that information before you execute the business logic. If there are too many URLs to check, you might want to notify the user that the search query needs to be refined.
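Reading such a limit from the configuration properties could look like the sketch below. The property name CheckLimit and the default of 200 are assumptions chosen for illustration, and TrimmerSettings is a hypothetical helper. Inside a real trimmer, the NameValueCollection would be the staticProperties argument that Initialize receives.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical helper that reads an administrator-defined limit from the
// configuration properties supplied when the trimmer is registered.
public static class TrimmerSettings
{
    // Assumption: fallback used when no valid value is configured.
    public const int DefaultCheckLimit = 200;

    // "CheckLimit" is an assumed property name, not one defined by SharePoint.
    public static int ReadCheckLimit(NameValueCollection staticProperties)
    {
        string raw = staticProperties == null ? null : staticProperties["CheckLimit"];
        int limit;
        if (int.TryParse(raw, out limit) && limit > 0)
        {
            return limit;
        }
        return DefaultCheckLimit;
    }
}
```

Storing the returned value in a field of the trimmer class makes it available to every later CheckAccess call.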

The CheckAccess Method

This method is where all of the business logic that trims the search results ends up. It is typically called each time a query is performed. All of the crawled URLs are passed as an instance of System.Collections.Generic.IList<string>. The method can be called multiple times per search session, depending on the number of search results returned, which is why you might want to limit the number of results, as discussed previously. The second parameter passed to the method can be used to track information across multiple CheckAccess calls for the same search query.

A typical flow within the method is to first retrieve information about the currently logged-on user, plus any additional information needed, such as the associated user profile values. You then use this information while processing the search results, telling Office SharePoint Server for each result whether or not to display it in the results page. For that reason, the method returns a System.Collections.BitArray that contains a true or false value for every search result. Your job is to set this flag in the loop based on the outcome of your business logic.

Let's see all this in action.

Walkthrough: Building a Custom Security Trimmer

The goal of this walkthrough is to show the steps for building a small custom security trimmer that filters the search results before they are displayed, excluding the results of a list for which the GUID has been defined as part of the profile of the user performing the search. You can start by building a new class library project called CustomSecurityTrimmerDemo in Visual Studio 2005. InsideMOSS.Samples is the namespace I'll be using for all of the types included in the assembly. Next, you'll need a couple of references to .NET assemblies that are required for the execution logic. The first one is the Microsoft.Office.Server.dll, because you're going to access the user profiles programmatically, and the second one is the Microsoft.Office.Server.Search.dll. There are a couple of using statements that are needed in addition to the ones already there for the class in this project.

using System.Collections;
using System.Collections.Specialized;
using System.Security.Principal;
using Microsoft.Office.Server;
using Microsoft.Office.Server.Search.Query;
using Microsoft.Office.Server.Search.Administration;
using Microsoft.Office.Server.UserProfiles;

A custom security trimmer class has to implement the two methods of the ISecurityTrimmer interface: CheckAccess and Initialize. You'll use only the first one, but you have to make sure that the Initialize method returns normally instead of throwing an exception. Here is the skeleton of the class, ready for the business logic that makes up the custom security trimming.

namespace InsideMOSS.Samples
{
    public class CustomTrimmer : ISecurityTrimmer
    {
        public System.Collections.BitArray CheckAccess
            (IList<string> documentCrawlUrls, 
             IDictionary<string, object> sessionProperties)
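        {
            // The trimming logic built up in the rest of this walkthrough goes here.
            // The method must return one flag per entry in documentCrawlUrls.
        }

        public void Initialize(NameValueCollection staticProperties, 
             SearchContext searchContext)
        {
            // Intentionally empty: nothing to initialize, and nothing is thrown.
        }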
    }
}

The CheckAccess method returns an instance of a BitArray type that simply contains a true or false value per search result that is returned by the search service. True means that it will be displayed to the user, while false means, of course, that it will be hidden from display. It is your job now to code your business rule and toggle these flags on or off depending on your scenario. Let's prepare the return value.

BitArray retArray = new BitArray(documentCrawlUrls.Count);
for (int x = 0; x < documentCrawlUrls.Count; x++)
  {
     retArray[x] = true;
  }
return retArray;

At this point, all of the results will be displayed. Inside the for loop, however, you'll first find out who the user is and retrieve his or her user profile. The following code shows how to do this.

string user = WindowsIdentity.GetCurrent().Name;
UserProfileManager profileManager = new UserProfileManager(ServerContext.Current);
UserProfile profile = profileManager.GetUserProfile(user);

The user profile can be extended with one additional custom property called ExcludedListsFromSearchResults that stores one or more GUIDs identifying the lists whose content you want to exclude from that user's search results. You'll have to use the administration site of your Shared Services Provider to create that user profile property. Refer back to Chapter 2, "Collaboration Portals and People Management," for an explanation of how to do this. Once the property is there, you can retrieve the value. If this is a multi-valued property, you'll have to write a small loop to process the array of GUIDs.

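// Declared outside the if block so the later comparison loop can use it.
string[] excludedLists = new string[0];
if (profile["ExcludedListsFromSearchResults"].Value != null)
{
    // Read the stored value (not the collection object) and split it on ';'.
    excludedLists = 
         profile["ExcludedListsFromSearchResults"].Value.ToString().Split(';');
}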
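Because the property value is free text typed by an administrator, it can help to normalize it before comparing. The following is a minimal sketch, not part of the walkthrough's trimmer: ExcludedListParser and Parse are hypothetical names, and the sketch assumes the semicolon-separated format used above. It accepts GUIDs with or without braces and silently skips entries that are not valid GUIDs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a SharePoint type): parses "guid;guid;..." text.
public static class ExcludedListParser
{
    public static List<Guid> Parse(string rawValue)
    {
        List<Guid> result = new List<Guid>();
        if (string.IsNullOrEmpty(rawValue))
        {
            return result;
        }

        foreach (string part in rawValue.Split(';'))
        {
            string candidate = part.Trim();
            if (candidate.Length == 0)
            {
                continue; // ignore empty entries such as a trailing ';'
            }

            try
            {
                // The Guid constructor accepts the value with or without braces.
                result.Add(new Guid(candidate));
            }
            catch (FormatException)
            {
                // Not a valid GUID: skip this entry.
            }
            catch (OverflowException)
            {
                // Older framework versions can report malformed input this way.
            }
        }

        return result;
    }
}
```

Comparing Guid values instead of strings removes any dependency on letter case or on whether the braces were typed into the profile property.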

The search results are processed one by one within the main for loop, and each of them is accessed through the first incoming parameter, called documentCrawlUrls. This parameter contains a list of strings, each of them representing the information concerning one of the search results. For a result indexed with the sts3 protocol handler, that is, the Windows SharePoint Services 3.0 protocol handler, the string resembles the following.

sts3://moss/siteurl=/siteid={86051521-13de-4dac-80d1-231406b51b57}/weburl=/webid={e67908fc-024b-4689-b854-4837d0d32abb}/listid={30dc794a-c5b6-4c86-97f6-2ed65133149d}/viewid={efc86211-96e2-4443-9dfc-5d8e42daf285}

The string is not delimited, but you can retrieve the listID out of the string by using the following lines of code.

if (documentCrawlUrls[x].StartsWith("sts3") && 
    documentCrawlUrls[x].Contains("/listid="))
{
    int pos = documentCrawlUrls[x].IndexOf("/listid=");
    // 8 is the length of "/listid="; 38 is the length of a GUID including braces.
    string listID = documentCrawlUrls[x].Substring(pos + 8, 38);
}
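The fixed offsets work for well-formed crawl URLs, but Substring throws if the string ends early. As a hedged alternative, the extraction can be wrapped in a small helper that checks the length first and returns the ID as a Guid. CrawlUrlParser and TryGetListId are hypothetical names introduced for this sketch; the URL format is assumed to be the one shown above.

```csharp
using System;

// Hypothetical helper (not a SharePoint type): pulls the list ID out of an sts3 crawl URL.
public static class CrawlUrlParser
{
    private const string Marker = "/listid=";
    private const int BracedGuidLength = 38; // "{" + 36 characters + "}"

    public static bool TryGetListId(string crawlUrl, out Guid listId)
    {
        listId = Guid.Empty;
        if (crawlUrl == null ||
            !crawlUrl.StartsWith("sts3", StringComparison.OrdinalIgnoreCase))
        {
            return false;
        }

        int pos = crawlUrl.IndexOf(Marker, StringComparison.OrdinalIgnoreCase);
        if (pos < 0)
        {
            return false; // no list ID in this URL
        }

        int start = pos + Marker.Length;
        if (start + BracedGuidLength > crawlUrl.Length)
        {
            return false; // truncated URL: avoid an out-of-range Substring call
        }

        try
        {
            listId = new Guid(crawlUrl.Substring(start, BracedGuidLength));
            return true;
        }
        catch (FormatException)
        {
            return false; // the text after the marker is not a GUID
        }
        catch (OverflowException)
        {
            return false;
        }
    }
}
```

A caller would leave the result flag untouched when the method returns false, since such a result does not belong to an identifiable list.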

The only task left to finish the job is to check whether the GUID of the list in the listID variable is part of the list of GUIDs in the custom user profile property. Add these lines of code after the listID variable declaration.

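foreach (string item in excludedLists)
{
  // Case-insensitive comparison; both values include the surrounding braces.
  if (string.Equals(item.Trim(), listID, StringComparison.OrdinalIgnoreCase))
  {
    retArray[x] = false;
    break;
  }
}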

If there is a match, you do not want SharePoint to display that search result, so the flag for the result being processed is set to false. In the real world, you'll probably add more meat to this code by wrapping all of it in the proper exception handlers and perhaps adding more business logic to it. But for this walkthrough, we've done enough. Now, let's prepare the .NET assembly for deployment. Since the DLL has to be deployed in the global assembly cache, you'll have to sign it with a strong name. After you've built the assembly, you can add it to the global assembly cache by using the Global Assembly Cache tool (Gacutil.exe).

gacutil /i CustomSecurityTrimmerDemo.dll

You have to register the .NET assembly as a custom security trimmer before SharePoint will take it into consideration. Before doing that, make sure you have a crawl rule defined that covers one or more of the content sources that are indexed. The following is the call to the stsadm command-line utility that registers the trimmer.

stsadm -o registersecuritytrimmer -ssp SharedServices1 -id 1 
  -typeName "InsideMOSS.Samples.CustomTrimmer, CustomSecurityTrimmerDemo, 
Version=1.0.0.0, Culture=Neutral, PublicKeyToken=4c04f619a71116ef" 
-rulepath http://moss/*

Nothing really happens until you restart IIS with a call to iisreset from the command line. Don't forget that you must also run a full crawl of the content sources covered by the crawl rules for which the trimmer has been activated.

Test your trimmer by making sure the user profile property contains the GUID of a SharePoint list, written with its surrounding braces so that it matches the 38-character value extracted from the crawl URL. When searching, none of the search results should contain resources from that list.
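Because a full crawl is needed before the trimmer can be tested on the server, it may be worth checking the filtering logic on its own first. The sketch below mirrors the loop from this walkthrough without any SharePoint types, so it can run in a plain console or unit test project. TrimmerLogicCheck and Trim are hypothetical names, and the excluded list IDs are passed in directly instead of being read from a user profile.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical off-server harness: same decisions as the walkthrough's loop.
public static class TrimmerLogicCheck
{
    public static BitArray Trim(IList<string> documentCrawlUrls, string[] excludedLists)
    {
        // Start with every result visible.
        BitArray retArray = new BitArray(documentCrawlUrls.Count, true);

        for (int x = 0; x < documentCrawlUrls.Count; x++)
        {
            string url = documentCrawlUrls[x];
            if (!url.StartsWith("sts3") || !url.Contains("/listid="))
            {
                continue; // not a list-based result: leave it visible
            }

            int pos = url.IndexOf("/listid=");
            // 8 = length of "/listid="; 38 = GUID length including braces.
            string listID = url.Substring(pos + 8, 38);

            foreach (string item in excludedLists)
            {
                if (string.Equals(item, listID, StringComparison.OrdinalIgnoreCase))
                {
                    retArray[x] = false; // hide results from an excluded list
                    break;
                }
            }
        }

        return retArray;
    }
}
```

Feeding this method a handful of crawl URLs, including one from the excluded list and one that is not an sts3 URL, shows whether the flags come out as expected before the assembly is signed and registered.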

Summary

In this chapter, I've discussed plenty of topics that should have interested you if you enjoy search functionality. First, there was the discussion of the new search administration object model with plenty of opportunities to automate certain tasks that administrators normally perform while managing the Shared Services Provider Web site. You've also learned about options for customizing the Search Center and about the different Web Parts that, together, deliver the search experience to the user. I've also described the new classes to use when you want to programmatically execute the Enterprise search queries either directly on the machine where Office SharePoint Server 2007 is installed or remotely by using the Query Search Web service. In the final part of the chapter, I explained the new additional security trimming layer that enables you to filter the search results based on custom business logic or security restrictions that you want to enable for a specific user or a group of users.

About the Author

Patrick Tisseghem (MVP Office SharePoint Server) is managing partner at U2U, a .NET training and services company based in Brussels, Belgium. He works most of his time as a contracted trainer for Microsoft Corporation and EMEA delivering presentations, workshops, and seminars addressing most of the information worker products and technologies. Patrick is author of a number of articles and whitepapers on SharePoint development aspects on MSDN and is author of a book titled Inside MOSS 2007 for Microsoft Press.