Discardable Properties for Your Web Pages in Internet Explorer 4.0 and 5.0

 

Matt Oshry
Michael Edwards
Microsoft Corporation

Updated June 22, 1998 (including a new section on persistence features introduced for Internet Explorer 5)
Updated December 30, 1997 (including a new discussion of security issues and code signing)
Originally posted November 1997

Contents

Introduction
How Does It Work?
How Do I Use It?
How Do I Make My COM Object Cacheable?
How Does the PropertyCache Sample Work?
Hey, I'm No C++ Programmer, Just Show Me How to Script it!
Just Give Me the Samples Already!
Summary
Internet Explorer 5 Does Internet Explorer 4.0 Discardable Browser Properties (and more)

Introduction

Added June 22, 1998: Microsoft Internet Explorer 5 introduces a new set of persistence features that provide all the Internet Explorer 4.0 discardable-properties functionality discussed in this article. These features are not, of course, provided on Internet Explorer 4.0, so you may still find this article useful for the Internet Explorer 4.0 browser. And if you implement support for Internet Explorer 4.0 discardable browser properties, your efforts will work just fine in Internet Explorer 5. However, if you can target Internet Explorer 5, there are some pretty cool advantages to using its new persistence features (you may even decide to target both browsers). See the section "Internet Explorer 5 Does Internet Explorer 4.0 Discardable Browser Properties (and more)" at the end of this article for a discussion that compares the new persistence features in Internet Explorer 5 with the functionality provided in discardable properties for Internet Explorer 4.0.

Microsoft Internet Explorer 4.0 includes a new feature to save temporary data from one Web page, and access it from any other page, within the same browser window. Multi-page processes (such as registration or online shopping) often require information that was entered by the user on previous pages, but that doesn't need to be available longer than one browser session. The new IDiscardableBrowserProperty interface allows data to be contained in a very lightweight object that is automatically discarded when the browser window is closed, or when the object has not been accessed for over ten minutes.

This article begins by describing how IDiscardableBrowserProperty works, then goes on to explain how you can enhance an existing COM object to support it. After that, for C++ developers we describe some important implementation details for the PropertyCache control we've provided. If you aren't the COM type and just want to use this new feature from script, don't despair. After reading the next section about how this feature works, just skip that scary COM stuff, and learn how to cache data in HTML scripts using our PropertyCache control.

How Does It Work?

Internet Explorer 4.0 includes the new IWebBrowser2 interface for adding Internet browsing functionality to your application. If you've been around the block a couple of times, you might remember that the old Internet Explorer 3.x IWebBrowserApp interface was replaced by IWebBrowser2. IWebBrowserApp included the PutProperty method to cache a property that can be retrieved later with GetProperty. Let's look at their function prototypes right here:

HRESULT PutProperty(BSTR szProperty, VARIANT vtValue);
HRESULT GetProperty(BSTR szProperty, VARIANT FAR* pvtValue);
// szProperty is a caller-allocated buffer
// with a unique property name
// vtValue is a VARIANT value associated with the given property

As you can see, since Internet Explorer 3.0 you've had the ability to store arbitrary data (VARIANT objects) as properties of a browser window. But you may not know (unless you have a pretty good testing staff) that this mechanism is not very useful for ActiveX controls or documents, because the Internet Explorer 3.0 browser does not have the capability to void these properties after a period of time. Therefore, stored properties hang around until the browser window is closed, which might be a long time (while the client's available memory slowly dissipates).

The IWebBrowser2 implementation for GetProperty and PutProperty fixes this problem by making the stored property discardable. All properties that are associated with a browser window are regularly scanned for discarding. A property is considered discardable if the page that stored it has been unloaded, and more than 10 minutes have elapsed since the property was last accessed. Properties associated with a given browser window are not visible to other windows. Of course, all the properties associated with a given browser window are discarded when the window is closed.
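To make the discard rule concrete, here is a small portable C++ model of it. This is only a sketch of the rule as described above, not the browser's code: WindowPropertyModel, CachedProperty, and the minute-based clock are all made-up names for illustration, and we're assuming that a property stored without the discardable interface simply stays until the window closes.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Portable model of the discard rule described above; it is NOT the
// browser's code, and every name here is hypothetical.
struct CachedProperty {
    bool discardable;      // object answered the discardable-interface query
    bool ownerPageLoaded;  // page that stored the property is still loaded
    long lastAccessMin;    // minute of the last Put or Get
};

class WindowPropertyModel {
public:
    void Put(const std::string& name, bool discardable, long nowMin) {
        assert(!name.empty());
        props_[name] = CachedProperty{discardable, true, nowMin};
    }

    // Reading a property counts as an access and restarts its 10-minute clock.
    bool Get(const std::string& name, long nowMin) {
        auto it = props_.find(name);
        if (it == props_.end()) return false;
        it->second.lastAccessMin = nowMin;
        return true;
    }

    // Navigating away unloads the page that stored the current properties.
    void UnloadPage() {
        for (auto& entry : props_) entry.second.ownerPageLoaded = false;
    }

    // The periodic scan: drop discardable properties whose page is gone and
    // that have sat untouched for more than 10 minutes.
    void Scan(long nowMin) {
        for (auto it = props_.begin(); it != props_.end();) {
            const CachedProperty& p = it->second;
            bool expired = p.discardable && !p.ownerPageLoaded &&
                           (nowMin - p.lastAccessMin) > 10;
            if (expired) it = props_.erase(it); else ++it;
        }
    }

    void CloseWindow() { props_.clear(); }  // closing discards everything
    std::size_t Count() const { return props_.size(); }

private:
    std::map<std::string, CachedProperty> props_;
};
```

In this model a property survives any number of scans while its page is loaded, and each Get buys it another 10 minutes once the page is gone.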

To take advantage of this enhanced property cache feature, you need to package the data into a COM object that supports the IDiscardableBrowserProperty interface. That's because IWebBrowser2::PutProperty calls QueryInterface on the passed VARIANT object for an IID_IDiscardableBrowserProperty interface. If the call is successful (the interface is supported), the browser will consider the object to be discardable.
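If the QueryInterface handshake sounds abstract, the following portable sketch may help. It uses stand-in types (UnknownLike, string interface IDs, no reference counting) so it builds without the Windows headers; real code would use IUnknown, REFIID, and IID_IDiscardableBrowserProperty. The point is only that the interface acts as a marker: the browser asks the question, and a "yes" is all it needs.

```cpp
#include <cassert>
#include <cstring>

// Stand-ins for COM types so the sketch builds anywhere (hypothetical names).
using InterfaceId = const char*;
const InterfaceId kIdUnknown = "IUnknown";
const InterfaceId kIdDiscardable = "IDiscardableBrowserProperty";

struct UnknownLike {
    virtual bool QueryInterface(InterfaceId id, void** out) = 0;
    virtual ~UnknownLike() = default;
};

// An ordinary object: it only knows how to answer for its base interface.
struct PlainData : UnknownLike {
    bool QueryInterface(InterfaceId id, void** out) override {
        assert(out != nullptr);
        *out = nullptr;
        if (std::strcmp(id, kIdUnknown) != 0) return false;
        *out = static_cast<UnknownLike*>(this);
        return true;
    }
};

// A discardable object: it also says "yes" to the marker interface and
// hands back the same base pointer.
struct DiscardableData : UnknownLike {
    bool QueryInterface(InterfaceId id, void** out) override {
        assert(out != nullptr);
        *out = nullptr;
        bool known = std::strcmp(id, kIdUnknown) == 0 ||
                     std::strcmp(id, kIdDiscardable) == 0;
        if (!known) return false;
        *out = static_cast<UnknownLike*>(this);
        return true;
    }
};

// Models the check PutProperty is described as making on the stored object.
bool BrowserTreatsAsDiscardable(UnknownLike& object) {
    void* marker = nullptr;
    return object.QueryInterface(kIdDiscardable, &marker);
}
```

Here BrowserTreatsAsDiscardable returns false for a PlainData object and true for a DiscardableData object, which is the whole difference between a property that lingers and one that can be retired.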

How Do I Use It?

If you want to do something as simple as remembering whether the user has logged on, or whether they clicked something on a previous page, some scripting examples are probably what you're most interested in seeing. However, if you have something more complicated in mind, you may need to incorporate discardable functionality directly into objects you want to cache for other pages. So, let's consider both perspectives, starting with the developer who wants to make a COM object cacheable.

How Do I Make My COM Object Cacheable?

Storing any COM object as a discardable property of the browser window is easy. The object needs only to update its implementation of IUnknown to respond to a QueryInterface for IID_IDiscardableBrowserProperty by returning its IUnknown pointer.

We're going to explain how to do this by using Active Template Library (ATL) code examples. If you create a simple COM object using ATL, the interface map in your class declaration (which specifies how ATL should implement your QueryInterface method) looks like this:

BEGIN_COM_MAP(CYourData)
COM_INTERFACE_ENTRY(IYourData)
END_COM_MAP( )

If you want to be able to store an instance of CYourData as a discardable property of the browser window, all you have to do is modify the interface map to handle queries for the IID_IDiscardableBrowserProperty interface:

// shlguid.h defines the IID_IDiscardableBrowserProperty GUID
#include <shlguid.h>

BEGIN_COM_MAP(CYourData)
COM_INTERFACE_ENTRY(IYourData)
// tell ATL to return our IUnknown
// when asked for the discardable interface
COM_INTERFACE_ENTRY_IID(IID_IDiscardableBrowserProperty, CYourData)
END_COM_MAP( )

Then, to cache an instance of CYourData, you just need to create a VARIANT object to point to your object, give it a unique property name, and store it:

#include "stdafx.h"  // VARIANT, V_* and other OLE goodies
#include <exdisp.h>   // IWebBrowserXXX

// a BSTR has to be allocated; it can't just point at a string literal
static BSTR szName = SysAllocString(L"a GUID created uniquely for this property");

HRESULT SaveIt(IYourData * pData, IWebBrowser2 * pWebBrowser) {

 VARIANT vCache;
 LPUNKNOWN pUnknown = NULL;
 HRESULT hr;

 VariantInit(&vCache);

 // get the IUnknown pointer for our object
 hr = pData->QueryInterface(IID_IUnknown, (LPVOID*)&pUnknown);
 if (FAILED(hr))
  return hr;

 // set the VARIANT's type to an IUnknown pointer
 // and make the VARIANT point at our object
 V_VT(&vCache) = VT_UNKNOWN;
 V_UNKNOWN(&vCache) = pUnknown;

 // IWebBrowser2 will QI vCache for the discardable
 // interface, and if successful automatically retire
 // it if it's not accessed for a while
 hr = pWebBrowser->PutProperty(szName, vCache);

 // release our own reference; the browser keeps its own
 VariantClear(&vCache);
 return hr;
}

To get the object back later:

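HRESULT GetItBack(IYourData ** ppData, IWebBrowser2 * pWebBrowser) {

 VARIANT vCache;
 HRESULT hr;

 // the object comes back through ppData, so it must be a pointer
 // to the caller's pointer (the caller releases the object)
 *ppData = NULL;
 VariantInit(&vCache);

 // retrieve the VARIANT we stashed
 hr = pWebBrowser->GetProperty(szName, &vCache);
 if (FAILED(hr))
  return hr;

 // get our object out of it; anything other than an object means
 // the property isn't there (for example, it was discarded)
 if (V_VT(&vCache) == VT_UNKNOWN && V_UNKNOWN(&vCache) != NULL)
  hr = V_UNKNOWN(&vCache)->QueryInterface(IID_IYourData,
   (LPVOID*)ppData);
 else
  hr = E_FAIL;

 VariantClear(&vCache);

 return hr;
}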

OK, we're cheating just a bit here. You cynical types are wondering how the heck you're supposed to come up with a pointer to the IWebBrowser2 object for your browser window. That's a good question. There's an easy answer, too, and we'll use ATL to demonstrate.

In addition to providing objects with their implementation for IUnknown, ATL provides a default implementation of the IObjectWithSite interface, IObjectWithSiteImpl, which provides an object with a pointer to its activation site (in this case, the browser window). To use ATL's built-in implementation of IObjectWithSite, you simply add IObjectWithSiteImpl to the base class list for your object, and add an entry to the interface map to expose the interface via QueryInterface.

For example, if CYourObject is an object that you are loading on your Web page via an <OBJECT> tag, and you want to be able to get a pointer to the IWebBrowser2 object, do the following in your class declaration:

class CYourObject :
 // other base classes for CYourObject go here
 public IObjectWithSiteImpl<CYourObject>
{
...
BEGIN_COM_MAP(CYourObject)
COM_INTERFACE_ENTRY(IYourObject)
// interfaces for other base classes in CYourObject go here
COM_INTERFACE_ENTRY_IMPL(IObjectWithSite)
 END_COM_MAP( )
...
};

After the browser loads your object, IObjectWithSiteImpl provides the m_spUnkSite member variable, which can be used to obtain a pointer to the IWebBrowser2 interface for your browser window:

#include <servprov.h> // IServiceProvider

HRESULT GetWebBrowser(IWebBrowser2 ** ppWebBrowser2)
{
 *ppWebBrowser2 = NULL;

 // no site means the browser hasn't loaded us yet
 if (!m_spUnkSite)
  return E_FAIL;

 // CComQIPtr is an ATL smart pointer template that declares spSP
 // as a pointer to IServiceProvider (and takes care of
 // reference counting for us, too!); go to
 // msdn/sdk/inetsdk/help/compdev/ref_comobj/iserviceprovider.htm
 // for IServiceProvider docs
 CComQIPtr<IServiceProvider, &IID_IServiceProvider> spSP(m_spUnkSite);
 if (!spSP)
  return E_NOINTERFACE;

 // the caller releases the returned IWebBrowser2 pointer
 return spSP->QueryService(IID_IWebBrowserApp, IID_IWebBrowser2,
  (LPVOID*)ppWebBrowser2);
}

Typically, SaveIt, GetItBack, and GetWebBrowser would all be methods in the CYourObject class, and CYourObject would create the instance of CYourData that encapsulated the data you wanted to cache. You could then use scripting events for page exit (onbeforeunload) to save your object, and page enter (onload) to reload the object on a subsequent page that needed the saved information.

If you want to look at some sample code that actually does all this, download the PropertyCache sample that is included with this article.

How Does the PropertyCache Sample Work?

There's nothing like decent sample code to help you get something working, so we included the source code for the PropertyCache control with this article. But before you go off and start changing the code, let's go over some important caveats.

First, to build this project you need the Internet Client SDK installed on your computer. Also, make sure the libs and includes from the Internet Client SDK are searched in front of those supplied elsewhere (in Developer Studio you can do this from Tools, Options, Directories).

If you read the above section on how to modify an ATL object to masquerade as a discardable browser object, then the source code in the PropertyCache project will look very familiar. There are two classes in PropertyCache. CDiscardable is a bare-bones COM object, an implementation of IUnknown that supports the IID_IDiscardableBrowserProperty interface and a single VARIANT property (similar to the CYourData class from the above example code). CPropertyCache is also a bare-bones COM object with methods that accept and return a VARIANT. Since CPropertyCache inherits from IObjectWithSiteImpl, it goes on your Web page (and thus is similar to the CYourObject class from the above example code).

The PropertyCache project builds a COM server named PCACHE.DLL. When PropertyCache is loaded on your Web page (via an <OBJECT> tag), the browser creates a CPropertyCache object. You can invoke its CacheData method to save script data, and RetrieveCachedData to get the data back later. When your script calls CacheData, the control dynamically creates a CDiscardable object, sticks the passed VARIANT item into it, and hands it off to the browser window (via PutProperty on the IWebBrowser2 object). The browser window increments the reference count for the discardable object, so when the PropertyCache control releases its own reference, the browser window is left with the only remaining reference. The browser window keeps track of when the object is accessed (via GetProperty on the IWebBrowser2 object called by RetrieveCachedData), and every time a page completes loading it checks whether any objects can be discarded. If a discardable object has not been accessed for 10 minutes it is discarded. This means that if your script caches some data from one page when it unloads, and accesses it from another page, you'll just need to finish loading the second page within 10 minutes of leaving the first page. You can't modify the 10 minute parameter.
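The reference-count handoff in that paragraph is easy to lose track of, so here is a portable sketch of it. We're using std::shared_ptr as a stand-in for COM reference counting, and BrowserWindowModel, DiscardableHolder, and this free-standing CacheData function are hypothetical names for illustration; the real control calls PutProperty on IWebBrowser2 and releases a COM pointer.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for the CDiscardable object that wraps the cached value.
struct DiscardableHolder {
    std::string value;
};

// Stand-in for the browser window's property store (hypothetical).
class BrowserWindowModel {
public:
    // Taking the pointer by value models the window adding its own reference.
    void PutProperty(const std::string& name,
                     std::shared_ptr<DiscardableHolder> object) {
        assert(object != nullptr);
        props_[name] = std::move(object);
    }

    std::shared_ptr<DiscardableHolder> GetProperty(const std::string& name) const {
        auto it = props_.find(name);
        if (it == props_.end()) return nullptr;
        return it->second;
    }

    long RefCount(const std::string& name) const {
        auto it = props_.find(name);
        if (it == props_.end()) return 0;
        return it->second.use_count();
    }

private:
    std::map<std::string, std::shared_ptr<DiscardableHolder>> props_;
};

// Models the control's CacheData: create the holder, hand it to the window,
// then let go of our own reference when the function returns.
void CacheData(BrowserWindowModel& window, const std::string& name,
               const std::string& value) {
    auto holder = std::make_shared<DiscardableHolder>(DiscardableHolder{value});
    window.PutProperty(name, holder);  // the window now holds a reference too
}   // holder goes away here, so the window holds the only reference
```

After CacheData returns, the window's count for the property is 1, which is why the browser alone decides when the object finally goes away.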

What If I Need to Modify the PropertyCache Sample?

If you plan to modify the PropertyCache source code in order to use a variation of it on your own pages, you will need to make three easy but important changes to make sure your control is unique and secure.

First, you'll need to create your own GUID for the property name. Why? Security. If you don't change the GUID, anyone who reads this article and wants to get at your data can do so, because they already know the GUID that protects your property names. The GUID for the property name is defined in the file PageAcc.CPP:

// The gc_szKey string provides a unique property name.
// This GUID is concatenated to the property name passed
// to the CacheData method to ensure that property names
// are secure.
// Naturally, anybody reusing this sample code will want
// to change this GUID by generating one of their own
// (use the GUIDGEN.exe utility that ships
// with Developer Studio).
const OLECHAR gc_szKey[] = OLESTR("{your_GUID}");

Note this isn't the actual GUID for gc_szKey used in the downloadable, code-signed PropertyCache control included with this article. Using a secret value for that GUID ensures that the property names used in your script can't be viewed by a hacker intent on stealing cached data.
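As a rough illustration of why the secret GUID matters, here is a portable sketch of how a stored property name might be formed. The article's comment says only that the GUID is concatenated to the name passed to CacheData, so putting the GUID first is our assumption, and MakeStoredName and kKey are hypothetical names (the "{your_GUID}" placeholder is left exactly as it appears above).

```cpp
#include <cassert>
#include <string>

// Placeholder only; a real control embeds its own secret GUID here.
const std::wstring kKey = L"{your_GUID}";

// Assumption: the GUID goes in front of the script-supplied name. The source
// says only that the two are concatenated.
std::wstring MakeStoredName(const std::wstring& scriptName) {
    assert(!scriptName.empty());
    return kKey + scriptName;
}
```

A page that knows only the script-level name (say, "cart") can't ask the browser window for the property directly, because the name actually stored includes a key it never sees.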

Second, you need to change the GUIDs used to identify the classes, interfaces, and type library for the PropertyCache control. Why? The browser distinguishes between these control elements by their IDs. If your control uses the same IDs as some other control that is already on the local computer, the browser will get confused. These five GUIDs are identified by the uuid() attributes in the pcache.IDL file, such as this one for the IPropertyCache interface ID:

[
 object,
 uuid(4F157AE1-3F9A-11D1-9E78-00AA00BBF119),
 dual,
 helpstring("IPropertyCache Interface"),
 pointer_default(unique)
]
interface IPropertyCache : IDispatch
{
 [id(1), helpstring("Cache some arbitrary data")]
 HRESULT CacheData([in] BSTR bstrPropName, [in] VARIANT vData,
  [in, optional] long lSecurityLevel);
 [id(2), helpstring("Retrieve some arbitrary data")]
 HRESULT RetrieveCachedData([in] BSTR bstrPropName,
  [out, retval] VARIANT* pvData);
};

What If I Want to Use the PropertyCache Control from C++?

If you want to write C++ code to use the PropertyCache object as is, then you'll need to include the PCACHE.H header file. The pcache.DLL file will be a dependent DLL to your code, so you will have to make sure the pcache.DLL library is part of your download package, and that the PropertyCache object gets registered.

The following ATL code will create a PropertyCache object and store a property with the default security setting:

#include "stdafx.h"
#include "pcache.h"
...
CComPtr<IPropertyCache> pCache;
CComBSTR bstrName(L"aUniquePropertyName");
VARIANT vCache;
// presumably you actually have something to save, but we'll init this to empty
VariantInit(&vCache);
// create a PropertyCache object
HRESULT hr = CoCreateInstance(CLSID_PropertyCache, NULL, CLSCTX_INPROC, IID_IPropertyCache, (void**)&pCache);
if (SUCCEEDED(hr))
{
  // cache the data with default security
  hr = pCache->CacheData(bstrName, vCache, 3);
}
VariantClear(&vCache);

...

Then, to retrieve the property later, do this:

#include "stdafx.h"
#include "pcache.h"
...
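CComPtr<IPropertyCache> pCache;
CComBSTR bstrName(L"aUniquePropertyName");
VARIANT vCache;
VariantInit(&vCache);
// create a PropertyCache object, just as we did to store the property
HRESULT hr = CoCreateInstance(CLSID_PropertyCache, NULL, CLSCTX_INPROC, IID_IPropertyCache, (void**)&pCache);
if (SUCCEEDED(hr))
{
   hr = pCache->RetrieveCachedData(bstrName, &vCache);
}
// RetrieveCachedData() copies the data into vCache
// (so free it with VariantClear() when you're done)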
...

Hey, I'm No C++ Programmer, Just Show Me How to Script it!

If you're like many Web authors, you like to do as much as you can with HTML and script (especially when somebody is offering free sample code). In that case, you'll really appreciate the PropertyCache sample we wrote to give away with this article. With this sample, you can save data stored in script variables on your Web page, and access the data from another page in that browser window. We included the source code for the PropertyCache sample for you C++ programmers, but the rest of you (who just want to script it), can use the version we already built and code-signed. To use the PropertyCache control, insert this HTML in the body of your HTML document:

<!-- The PropertyCache control -->
<OBJECT CLASSID="clsid:68A12883-7584-11d1-A259-00C04FD97350"
   HEIGHT=0 WIDTH=0 ID=oCacher
   CODEBASE="pcache.cab#Version=1,0,0,0">
</OBJECT>

Note the CODEBASE parameter refers to a CABinet (.CAB) file. We code-signed the PropertyCache control with a Microsoft digital signature, so the control is actually inside pcache.CAB. Since Microsoft is code-signing the PropertyCache control, when your viewers are asked whether they should download it, they'll know the control was developed by Microsoft. Hopefully, knowing this will give your customers a profound sense of reassurance! The pcache.CAB file is included in the HTML sample downloads below.

While we are on the subject of security, let's talk about why this control is very secure. First, the PropertyCache control doesn't access any resources on the customer's local computer. Thus, there is no way that a malicious Web author can misuse this control to illegally access your local resources. But in addition to protecting your customer's local resources from hackers, you also need to protect the data that you cache using the PropertyCache control. For example, if you are going to use this control to implement a shopping cart on a commerce page, you probably don't want some other page to be able to peek into that shopping cart and see what items somebody is purchasing. For this reason, you must include a security parameter when you cache data using the PropertyCache control. This parameter indicates how strictly you want to limit access to the data, including whether you will allow the data to be accessed by any page in your domain, by any page in your virtual root, or only by the page that cached the data in the first place (this last, most restrictive setting, is the default). Under no circumstances will a page that is not on your domain be able to access the data.
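Here is one way the three access levels could be checked, written as a portable C++ sketch. It is not the control's actual code: CanRead, ParseUrl, and PageLocation are hypothetical names, the numeric levels follow the values used in the script comments in this article (1 for domain, 2 for vroot, 3 for the caching page only), and treating the first path segment as the virtual root is our simplifying assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum SecurityLevel { kDomain = 1, kVRoot = 2, kPage = 3 };

struct PageLocation {
    std::string domain;  // host part of the URL
    std::string vroot;   // first path segment (assumed to be the vroot)
    std::string path;    // full path, used to identify the page itself
};

// Minimal URL splitter for the sketch; it handles "scheme://host/path" only.
PageLocation ParseUrl(const std::string& url) {
    PageLocation loc;
    std::size_t start = url.find("://");
    start = (start == std::string::npos) ? 0 : start + 3;
    std::size_t hostEnd = url.find('/', start);
    if (hostEnd == std::string::npos) {
        loc.domain = url.substr(start);
        loc.path = "/";
        return loc;
    }
    loc.domain = url.substr(start, hostEnd - start);
    loc.path = url.substr(hostEnd);
    std::size_t vrootEnd = loc.path.find('/', 1);
    if (vrootEnd != std::string::npos)
        loc.vroot = loc.path.substr(1, vrootEnd - 1);
    return loc;
}

// Returns true if the page at readerUrl may read data cached by writerUrl.
bool CanRead(int level, const std::string& writerUrl,
             const std::string& readerUrl) {
    assert(level >= kDomain && level <= kPage);
    PageLocation w = ParseUrl(writerUrl);
    PageLocation r = ParseUrl(readerUrl);
    if (w.domain != r.domain) return false;  // never across domains
    if (level == kDomain) return true;
    if (level == kVRoot) return w.vroot == r.vroot;
    return w.path == r.path;                 // kPage: same page only
}
```

With these rules, a cart cached from /store/cart.htm at level 2 is readable from /store/checkout.htm on the same host, but not from /help/faq.htm, and never from another domain at any level.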

From a performance standpoint, it is more efficient to add a single property to the browser than it is to write many properties. So, if you have a lot of stuff to save, you should put it all in an array variable and cache the array (although you can also cache any JavaScript variable or object). Typically, your script will accumulate data from user interaction with the page, and then cache it on page exit:

<SCRIPT for=window event=onbeforeunload language=javascript>
 // Security == 1 // allow any page on your domain to read the data
 // Security == 2 // allow any page in your vroot to read the data
 // Security == 3 // allow only this page to read the data
 var Security = 2;
 var aList = new Array();
 // populate() is your own function that fills the array
 populate(aList);
 oCacher.CacheData("aUniquePropertyName", aList, Security);
</SCRIPT>

A subsequent page that needs previously cached data would then access it while the page is loading:

<SCRIPT for=window event=onload language=javascript>
 var aList = oCacher.RetrieveCachedData("aUniquePropertyName");
 if (aList == null) {
  // must've been more than 10 minutes since the data was stored
  // or last accessed
 }
</SCRIPT>

CacheData() and RetrieveCachedData() are the only methods for this control, and there are no events or properties. Pretty simple, huh?

Just Give Me the Samples Already!

Download and play with a sample that keeps track of whether a user has already logged in to your Web site. The sample includes the signed PropertyCache control that you can reuse on your own pages.

Or, you can download and play with another sample that uses an array variable to keep track of products a user wants to buy from your store. This sample also uses the Tabular Data Control to manipulate and present shopping items from a comma-delimited list. Remember that you need to be using Internet Explorer 4.0 for these samples to work!

Summary

Even though today's Web pages are often offered in the context of a larger site, they tend to be pretty stand-alone views of information. That is, what a user does on one page in a given site doesn't impact what they see or do in other pages on that site. However, in cutting-edge sites like Microsoft Expedia, you are beginning to see multiple Web pages offered as a single unit that share important information and events. In this context, what a user does on one page has very much to do with what they see and do on other pages in that site. Thus, with the discardable browser property feature, Internet Explorer 4.0 introduces the next generation in data-sharing functionality that is needed to turn an unconnected set of Web pages into something more akin to an integrated application.

Internet Explorer 5 Does Internet Explorer 4.0 Discardable Browser Properties (and more)

Added June 22, 1998:

There are two main ways the discardable browser property feature exposed by Internet Explorer 4.0 is used by people who have read the preceding part of this article. For each of these two uses we are going to explain how you can get that same functionality by using the new persistence features in Internet Explorer 5.

Saving the State of HTML and Script on a Page

The first category is utilized by people who download the free PropertyCache sample that comes with this article in order to use it for persisting state information about the HTML and script on their Web pages. This is useful when you need to retrieve that state information from another page on your domain. This particular use is demonstrated in the two HTML samples included above.

Of course, if page state can be retrieved from any other page, you can also use this feature to save the state of a particular page so that it can be restored if the user returns to that same page. The new Internet Explorer 5 saveHistory behavior is perfect for this functionality. Saving the state of an HTML tag is as simple as adding a "CLASS=saveHistory" attribute to the tag, while script variables or script objects are saved by implementing load- and save-event handlers and defining and setting attribute names for each value or object to save. Like Internet Explorer 4.0 discardable browser properties, this state is persisted only during the current session (that is, while the browser that navigated the persisted page is open).

While page state saved using the saveHistory Behavior can only be retrieved by that same page, state that is saved with the new userData Behavior can be retrieved by any other page on the domain. Unlike Internet Explorer 4.0 discardable properties, which can't persist beyond the duration of the current session, state that is saved with the userData behavior can be persisted beyond the current session. So you can save information about your page for retrieval the next time the user starts up Internet Explorer (or any other container for the WebBrowser control).

Another advantage to using both the saveHistory and userData behaviors is the ability to access the XML object model to create hierarchical data structures to represent your state information. This can be especially useful if you are saving a lot of data.

Finally, here's a really cool advantage to using Internet Explorer 5 persistence behaviors: you don't have to include the PropertyCache ActiveX control on your page since these features are built directly into the WebBrowser control!

Saving the State of ActiveX Controls on a Page

Using Internet Explorer 4.0 discardable browser properties to save the state of ActiveX controls on a page is simple to do, and actually not any different than saving any script object on the page. You merely pass the name you defined in the ID attribute of the OBJECT tag for that control to the CacheData method of the PropertyCache control. Likewise, you can use the saveHistory or userData behaviors to save ActiveX controls on a page (you just need to add the appropriate CLASS= attribute to the OBJECT tag for the ActiveX control you want to persist).

You can also save the state of your ActiveX control using Internet Explorer 4.0 discardable browser properties without using the PropertyCache control. In fact the information you need to do that is included in the rest of this article - you just have to modify your existing ActiveX control to directly incorporate the IDiscardableBrowserProperty interface. If you do that, you must choose whether to continue to use this mechanism on Internet Explorer 5, or reduce size and complexity by removing this interface and using the standard mechanism described above to persist your controls for Internet Explorer 5 browsers.

What If I Can't Author Exclusively for Internet Explorer 5?

If you can't target Internet Explorer 5 exclusively, you may still need to use the discardable-properties feature for Internet Explorer 4.0. Since discardable browser properties are completely supported by Internet Explorer 5, your decision is not muddled by compatibility issues. However, as you can see, the persistence functionality available in Internet Explorer 5 is quite compelling. Plus you don't have to include an ActiveX control on your page to access these persistence features. As Internet Explorer 5 becomes more widely adopted, you may decide to sniff the browser and handle both versions of Internet Explorer.

Is There Anything Internet Explorer 5 Persistence Does Not Provide?

Two things. The first is the ability to directly incorporate the persistence functionality in your custom ActiveX control. This means implementing IDiscardableBrowserProperty yourself, so you don't have to include the PropertyCache control on your page. However, as we already pointed out, you don't need to do this to persist an ActiveX control in Internet Explorer 5. The second has to do with security. Which leads us to...

Is Internet Explorer 5 Persistence as Secure as Internet Explorer 4.0 Discardable-Browser Properties?

As explained earlier in this article, Internet Explorer 4.0 discardable-browser properties are not secure at all! However, accessing these discardable properties through the PropertyCache control is secure because we added the security elements to the sample control itself. This is automatically done for you in Internet Explorer 5—all state information is secured to the domain itself, so only pages located on the same domain as the page which cached the information are allowed to retrieve it. The only difference is that in addition to securing the data to same-domain access, the PropertyCache sample control also implements the ability to secure the data to the vroot of the creating page.

So get busy!