Using the RTC Client API for Scalable Applications

Article
06/30/2006

Microsoft Corporation

October 2003

Applies to:
    Microsoft® Real-Time Communications Client API version 1.2
    Live Communications Server
    Instant Messaging applications

Summary: The Microsoft Real-Time Communications (RTC) Client application programming interface (API) version 1.2 is very efficient under most circumstances. However, when multiple clients are serviced with the same computer, providing optimal scalability is an important consideration in the application's design. This paper discusses techniques you can incorporate into RTC Client API applications to improve their scalability. (18 printed pages)

Introduction
Application Scenarios
RTC Client API Architecture Considerations
Scalability Techniques
Scenarios
Conclusion
Related Links

Introduction

It's critical that organizations planning large deployments of real-time communications applications ensure that those applications can scale to meet the desired goals. The RTC Client API is very efficient for client class applications in which each client runs on its own computer. To build an RTC Client API application that services multiple clients with a single computer, you need to design for scalability from the start.

This paper covers techniques available in the RTC Client API that can improve the scalability of RTC Client applications. It begins by describing the types of applications that you can build by using the RTC Client API and then moves into considerations that you need to take into account when building scalable RTC Client applications. Next, the paper addresses techniques that can be used with the RTC Client API to improve scalability. Finally, the paper details three sample applications in terms of how they implement scalability.

Application Scenarios

There are two classes of applications: client and server. Client class applications have one real-time client per computer, such as the traditional instant messenger (IM) application. Server class applications typically act on behalf of multiple users or communicate with many hundreds of users simultaneously.

Server class applications are often based around intelligent applications that interact with users. These applications are also known as automatic robots or simply bots.

Bot-based applications can be divided into two categories: notification bots that send information to a client and interactive bots that accept and respond to a client. A third type of server class applications, Web-based clients, interacts with users through a Web server.

Notification Bots

Notification bots are real-time applications that send information to multiple clients from a centralized server (see Figure 1). The one-way transmission means that clients cannot communicate directly with the notification bot. Instead, clients must choose which events they wish to receive by using some other technique, such as a Web application.

One example of a notification bot is an application that notifies all of the users of a particular e-mail server that the server is about to go offline. Another useful notification bot would send alerts in cases of severe weather.

Figure 1. A notification bot transmits information to multiple IM clients.

Interactive Bots

Interactive bots are applications that allow multiple clients to communicate with a central server in real-time, as Figure 2 shows. They are different from notification bots in that interactive bots support two-way communications with a client. Using this approach, you can build an application that interacts with users in real-time.

Within this scenario, there are two main sub-scenarios. The first provides a user with information and waits for the user to respond, such as an application that notifies users about changes in stock prices and then gives the user the option to buy or sell. The second waits for the user to request a session with the bot and then responds to requests that the user supplies, such as a calendar application that allows a user to schedule meetings and other events while receiving reminders just prior to the meeting or event.

Figure 2. An interactive bot receives and responds to requests from multiple IM clients.

Web-Based Clients

Web-based clients provide the same basic functionality as the traditional IM client through a Web interface, thus allowing the widest possible audience to use the application, as Figure 3 illustrates. This approach also eliminates the need for a user to download local software, which reduces user concerns about a download containing a virus.

These types of clients are useful to organizations that wish to provide a Web-based front end to their internal IM system. For example, a company might wish to use a Web-based IM client to connect customers with a support group. Doing so maximizes the number of customers that can connect with the support group.

Figure 3. A Web-based client hosts multiple IM clients on a single machine by using a Web server to handle the user interface tasks.

RTC Client API Architecture Considerations

The RTC Client API was originally designed to support single-user workstations. This means that there are some issues that you need to consider when designing server class applications.

Threading Issues

For situations in which a server class application needs to register to the SIP server on behalf of multiple users (for example, a Web-based front end to the RTC Client API), the application should create multiple RTCClient objects scattered across multiple threads.

The number of RTCClient objects that each thread can support depends on the complexity of the application, along with the frequency of the messages to be processed for each client. A good starting point is 10 to 15 RTCClient objects per thread in the application. From that starting point, the value can be adjusted up or down to optimize performance.

For example, suppose that the author of a scalable application is interested in IM functionality only and doesn't plan to implement any presence functionality. That application may be able to host more RTCClient objects per thread than a scalable application that implements both presence and IM functionality for each logged-on user.

As the RTC Client API is apartment threaded, all of the resources needed for that particular RTCClient object should also be created on the same thread. And because the RTC Client API relies heavily on the Microsoft Windows® messages being passed between the RTC Client API and the user program, the user code that receives and processes Windows messages should be in the same thread that creates the RTCClient object.

Blocking Issues

Whenever possible, an RTC Client application should avoid calling any routines that might block the thread. Doing so would prevent the thread from processing work for any other clients on that same thread until the blocking activity has been completed. For example, database requests should be made asynchronously so that the thread isn't blocked while waiting for the database server to complete the request.
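One way to keep slow work off the thread that owns the RTCClient objects is to hand requests to another thread through a queue and collect the results without waiting. The following is a minimal, portable sketch of that idea and is not part of the RTC Client API. The names HandoffQueue, QueryRequest, QueryResponse, and ProcessOneRequest are hypothetical, and the slow database call is represented by a function that the caller supplies.

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical helper: a mutex-guarded FIFO used to hand items
// from one thread to another.
template <typename T>
class HandoffQueue {
public:
    void Push(T item) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_items.push_back(std::move(item));
    }

    // Non-blocking: returns false at once when the queue is empty,
    // so the thread that owns the RTC clients never stalls here.
    bool TryPop(T &out) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_items.empty()) return false;
        out = std::move(m_items.front());
        m_items.pop_front();
        return true;
    }

private:
    std::mutex m_mutex;
    std::deque<T> m_items;
};

struct QueryRequest  { unsigned long sessionId; std::string query; };
struct QueryResponse { unsigned long sessionId; std::string result; };

// Intended to run on a worker thread. 'slowLookup' stands in for the
// blocking call (for example, a database request). Returns true if a
// request was taken from the queue and answered.
bool ProcessOneRequest(
    HandoffQueue<QueryRequest> &requests,
    HandoffQueue<QueryResponse> &responses,
    const std::function<std::string(const std::string &)> &slowLookup) {
    QueryRequest req;
    if (!requests.TryPop(req)) return false;
    responses.Push(QueryResponse{req.sessionId, slowLookup(req.query)});
    return true;
}
```

In this arrangement, the thread that owns the RTC clients only calls Push on the request queue and TryPop on the response queue, so it returns to its message loop immediately even when the lookup is slow.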

Scalability Techniques

There are a number of techniques that can improve application performance when interacting with the RTC Client API. For the most part, these techniques turn off functionality that is not necessary for server class applications.

Disable Media Manager

In general, server class applications send and receive text messages, along with tracking presence information. They generally do not transmit nor do they receive audio or video information. It is therefore possible to save resources by disabling the media manager in the RTCClient object. This saves both memory and threads that the RTC client would normally create in case the application will send or receive media at some future time.

To disable the Media Manager, call the IRTCClient2.InitializeEx method and specify the RTCIF_DISABLE_MEDIA value like this:

    hr = m_pClient->InitializeEx(RTCIF_DISABLE_MEDIA);
    // Implement error handling and clean up code

Disable UPnP based NAT Discovery

Another technique to save resources when the RTCClient object is initialized is to disable Universal Plug and Play (UPnP)-based network address translation (NAT) discovery. If the RTC client computer is located behind a UPnP-enabled NAT, UPnP is used to communicate with the NAT device to determine the appropriate IP address and port number that should be used for the real-time session. If the application doesn't communicate with the Internet through a NAT, disabling this feature will save additional resources.

To disable UPnP-based NAT discovery, call the IRTCClient2.InitializeEx method and specify RTCIF_DISABLE_UPNP.

    hr = m_pClient->InitializeEx(RTCIF_DISABLE_UPNP);
    // Implement error handling and clean up code

Note that InitializeEx can be called only once, so all of the initialization flags must be combined. For example, this code fragment combines the RTCIF_DISABLE_UPNP flag with the RTCIF_DISABLE_MEDIA flag to disable both UPnP and the Media Manager.

    hr = m_pClient->InitializeEx(RTCIF_DISABLE_UPNP | 
       RTCIF_DISABLE_MEDIA);
    // Implement error handling and clean up code

Turn Off Detection and Recovery for IP Address Changes

Another way to improve a server class application's scalability is to disable detection and recovery for IP address changes. RTC Client API has the ability to detect IP address changes for the local host computer where it is running. In a client class application, this ability can be very useful because there are several different circumstances that will force a client to change an IP address. These reasons can range from a dropped dial-up connection to a DHCP initiated address change, so the RTC Client API opens sockets for address change notifications to detect these conditions.

None of these reasons, however, apply to the host of a server class application. Most host computers used as servers are assigned a static IP address that is not likely to change over time.

As it is not likely that the host's IP address will change, additional resources can be saved by turning off detection of IP address changes. Disabling detection and recovery for IP address changes is accomplished by using the RTCIF_ENABLE_SERVER_CLASS flag when initializing the RTCClient object.

    hr = m_pClient->InitializeEx(RTCIF_ENABLE_SERVER_CLASS);
    // Implement error handling and clean up code

Disable Serializing

To avoid sending the server too many presence-related SIP requests at once, the RTC Client API serializes SUBSCRIBE requests and getPresence SERVICE requests. Scalable applications that create hundreds of presence-related requests should turn off this behavior in order to obtain better throughput.

The RTCIF_ENABLE_SERVER_CLASS flag also disables this serialization, which can significantly reduce the number of cycles needed to process these requests.

Care should be taken that the SIP server or any other entity to which these requests are sent is able to handle so many requests simultaneously.

Disabling Firewall Detection

The RTC Client API can detect and traverse through the Internet Connection Firewall (ICF). This functionality is useful for client-side consumer applications, as it minimizes the amount of work it takes to run an RTC Client application with ICF. However, it might not be desirable for server class applications because of the extra resources it takes.

This function is disabled when the RTCIF_ENABLE_SERVER_CLASS flag is passed to the InitializeEx method. This can save some valuable resources on the system, which leads to better scalability.
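Because InitializeEx accepts its flags only once, an application that wants the savings from several of the preceding sections has to build a single mask before the call. The following sketch shows one way to derive that mask from a description of the deployment. The ServerOptions structure and the BuildInitFlags function are hypothetical, and the k* constants are placeholders with illustrative values. A real build would use the RTCIF_* constants from the RTC SDK headers instead.

```cpp
#include <cstdint>

// Placeholder bit values for illustration only. A real build uses the
// RTCIF_* constants defined in the RTC SDK headers.
const std::uint32_t kDisableMedia      = 0x1;  // stand-in for RTCIF_DISABLE_MEDIA
const std::uint32_t kDisableUpnp       = 0x2;  // stand-in for RTCIF_DISABLE_UPNP
const std::uint32_t kEnableServerClass = 0x4;  // stand-in for RTCIF_ENABLE_SERVER_CLASS

// Hypothetical description of what the application needs.
struct ServerOptions {
    bool usesAudioVideo;    // true if the application sends or receives media
    bool behindUpnpNat;     // true if the host reaches the Internet through a UPnP NAT
    bool staticServerHost;  // true if the host keeps a fixed IP address
};

// Builds the single mask that would be passed to InitializeEx.
std::uint32_t BuildInitFlags(const ServerOptions &opts) {
    std::uint32_t flags = 0;
    if (!opts.usesAudioVideo)  flags |= kDisableMedia;
    if (!opts.behindUpnpNat)   flags |= kDisableUpnp;
    if (opts.staticServerHost) flags |= kEnableServerClass;
    return flags;
}
```

With the real constants substituted, the result would be passed to InitializeEx in a single call, in the same way as the combined-flag fragment shown earlier.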

Avoid Unnecessary Events

One way to avoid extra work by the application is to carefully choose which events should be processed. Events associated with disabled features should always be turned off to reduce the amount of work performed by the RTC Client API. Thus, if the media manager is disabled within the RTC Client API, then all of the events related to processing video and audio streams should also be disabled.

Consider an application that responds only to messages received. There is no need to trap IRTCBuddyEvent or IRTCWatcherEvent; only the RTCEF_MESSAGING event is required. The following code shows how to filter out all of the events except for the RTCEF_MESSAGING event.

    // Determine the event filter
    long lFlags = RTCEF_MESSAGING;

    // Set the event filter for the RTC client
    hr = m_pClient->put_EventFilter(lFlags);
    // Implement error handling and clean up code
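When an application enables some features but not others, the filter can be computed by starting from every event and clearing the bits that belong to disabled features. The sketch below illustrates that calculation only. BuildEventFilter is a hypothetical helper, and the kEv* constants are placeholders with illustrative values that are assumed to correspond to the RTCEF_* constants named in the comments. The real values come from the RTC SDK headers.

```cpp
#include <cstdint>

// Placeholder bit values for illustration only.
const std::uint32_t kEvMessaging = 0x01;  // stand-in for RTCEF_MESSAGING
const std::uint32_t kEvBuddy     = 0x02;  // stand-in for RTCEF_BUDDY
const std::uint32_t kEvWatcher   = 0x04;  // stand-in for RTCEF_WATCHER
const std::uint32_t kEvMedia     = 0x08;  // stand-in for RTCEF_MEDIA
const std::uint32_t kEvAll = kEvMessaging | kEvBuddy | kEvWatcher | kEvMedia;

// Starts from every event and clears the bits for features that are off.
std::uint32_t BuildEventFilter(bool presenceEnabled, bool mediaEnabled) {
    std::uint32_t filter = kEvAll;
    if (!presenceEnabled) filter &= ~(kEvBuddy | kEvWatcher);
    if (!mediaEnabled)    filter &= ~kEvMedia;
    return filter;
}
```

A messaging-only bot corresponds to BuildEventFilter(false, false), which leaves only the messaging bit set. An application that also relies on other notifications, such as the session operation events used in Scenario 1, would need to keep those bits as well.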

Avoid Sending Typing Messages Notifications

While it may be obvious, there isn't a need for any server class application to send typing messages notifications. Typing messages primarily exist for one client to let other clients in a session know that they are actively typing text within the instant messaging session. As server class applications are always online, there is no need for a server class application to expose the typing feature.

If the server class application works with a custom client application, then it's highly desirable that the client application not send typing messages either. Even though the server application ignores the messages, there's a certain amount of overhead that can't be avoided. Also, the typing messages increase bandwidth usage, so typing messages should be avoided in order to improve the scalability of the server application.

Adding Server Class Application as Always Online Contact

Another obvious issue is that a server class application is always online, so there's no need to determine the presence status of the server class application. Forcing the client class application to always add the server class application as an "always online" contact reduces the amount of work the RTC client must do to process these requests.

Although this technique doesn't directly apply to the server class application, it can be incorporated into a custom client class application that interacts with the server application, as follows:

    // Add the contact
    IRTCBuddy2 * pBuddy = NULL;
    hr = pPresence->AddBuddyEx(
            bstrURI,
            bstrName,
            NULL,
            VARIANT_TRUE,
            RTCBT_ALWAYS_ONLINE,
            NULL,
            0,
            &pBuddy);

    // release pPresence

Techniques to Improve Scalability in Web-based UI Scenario

Depending on the transport protocol, the RTC Client API uses one or more sockets for each RTC client. When using TCP as a transport, the RTC Client API utilizes three sockets for each registered client. The first socket is the client's listening port, and the second socket is for the new dynamic port that the client creates to connect to the registrar to log on. The third socket is created when the client accepts the connection request on its listening port from the registrar for all notifications. In contrast to TCP, the RTC client uses only one socket when the client is logged on using Transport Level Security (TLS), as there is no listening port in TLS mode.

If an application is running out of available ports for use, the number of available ports can be increased by creating the following registry key and registry values:

HKLM\Software\Policies\Microsoft\Windows\RTC\PortRange
HKLM\Software\Policies\Microsoft\Windows\RTC\PortRange\Enabled (REG_DWORD) set to 1
HKLM\Software\Policies\Microsoft\Windows\RTC\PortRange\MinSipDynamicPort (REG_DWORD) set to 5354
HKLM\Software\Policies\Microsoft\Windows\RTC\PortRange\MaxSipDynamicPort (REG_DWORD) set to 65535

The RTC Client API creates an event window for each socket that it uses, so the number of handles required to support a large number of clients can get rather large. As Windows limits the number of handles available to a single process, it might be necessary to implement one or more of the following techniques to improve scalability.

When the RTC client uses TLS for transport, the number of handles required is only one-third that of those required when using TCP as a transport.
When you split the RTC clients over multiple processes, the number of handles is reduced for each process.
It is possible to increase the number of handles available for each process by changing these two registry values.

HKLM\Software\Microsoft\Windows NT\CurrentVersion\Windows\USERProcessHandleQuota (REG_DWORD) set to 10000
HKLM\Software\Microsoft\Windows NT\CurrentVersion\Windows\USERPostMessageLimit (REG_DWORD) set to 10000
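The socket counts above lend themselves to a quick capacity estimate. The following sketch uses the figures from this section (three sockets per client for TCP, one for TLS, and one event window per socket) to estimate how many event windows a given number of clients needs and how many processes that implies for a given per-process handle limit. The function names are hypothetical, and the estimate is deliberately rough: it ignores every other handle the process uses and assumes that clients divide evenly among processes.

```cpp
#include <cstddef>

enum class Transport { Tcp, Tls };

// Sockets per registered client, as described in the text:
// three for TCP, one for TLS.
std::size_t SocketsPerClient(Transport t) {
    return t == Transport::Tcp ? 3 : 1;
}

// One event window (and so one window handle) per socket.
std::size_t EventWindowsNeeded(std::size_t clients, Transport t) {
    return clients * SocketsPerClient(t);
}

// Rough estimate of the number of processes needed so that the event
// windows fit within 'handlesPerProcess' per process.
// 'handlesPerProcess' must be greater than zero.
std::size_t ProcessesNeeded(std::size_t clients, Transport t,
                            std::size_t handlesPerProcess) {
    const std::size_t windows = EventWindowsNeeded(clients, t);
    return (windows + handlesPerProcess - 1) / handlesPerProcess;  // ceiling
}
```

For example, 5,000 clients need 15,000 event windows over TCP but only 5,000 over TLS. With a limit of 10,000 handles per process, that is two processes for TCP and one for TLS.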

Scenarios

This section illustrates how you might implement each of the three server class application scenarios. Note that in all three scenarios, it is assumed that the appropriate RTC Client API objects have been properly initialized and that any necessary sessions have been established.

Scenario 1: Notification Bot

This application allows clients to register with a notification bot that will send them an instant message each time a particular stock price changes. The following code waits for a stock price change event to occur and then triggers the OnStockPriceChangeEvent routine.

// Wait for a stock price change event
MSG msg;
while(GetMessage(&msg, NULL, 0, 0) > 0)
{
   TranslateMessage(&msg);
   switch(msg.message)
      {
      case WM_EVENT_STOCKPRICE_CHANGE:
         OnStockPriceChangeEvent();
         break;

      default:
         // Pass all other messages on for normal processing
         DispatchMessage(&msg);
         break;
      }   
}

Within the OnStockPriceChangeEvent routine, a while loop is used to iterate through the list of contacts by using the EnumerateBuddies method. Within the while loop, each IRTCBuddy object is checked to see if the associated contact is online. If the contact is online, a single party IM session is created and then the appropriate message is generated and sent to the client.

// Iterate through buddies registered for this event
// Create single party IM sessions with non-offline contacts
// Send IM message

IRTCEnumBuddies * pEnum=NULL;
hr = m_pClientPresence->EnumerateBuddies(&pEnum);
// Handle hr != S_OK error

IRTCBuddy * pBuddy=NULL;
RTC_PRESENCE_STATUS enStatus;
IRTCSession *pSession=NULL;

while(pEnum->Next(1, (IRTCBuddy **)&pBuddy, NULL) == S_OK)
   {
   hr=pBuddy->get_Status (&enStatus);
   // Handle hr != S_OK error

   if(enStatus != RTCXS_PRESENCE_OFFLINE)
      {
      // create single party IM sessions
      hr=m_pClient->CreateSession (RTCST_IM, NULL, NULL,
         0, &pSession);
      // Handle hr != S_OK error

      BSTR bstrURI = NULL;
      hr=pBuddy->get_PresentityURI(&bstrURI);
      // Handle hr != S_OK error

      // Add the buddy as the participant
      hr=pSession->AddParticipant (bstrURI, NULL, NULL);
      // Handle hr != S_OK error

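      DWORD dwCookie = 0; // set the cookie as necessary
      BSTR bstrMsg=::SysAllocString(
         L"Stock price changed..");      

      // Send the message
      hr=pSession->SendMessage (NULL, bstrMsg, dwCookie);
      // Handle hr != S_OK error

      // Free the strings allocated for this contact
      ::SysFreeString(bstrMsg);
      ::SysFreeString(bstrURI);
      }

   // Release the reference returned by Next
   pBuddy->Release();
   pBuddy = NULL;
   }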

Upon receiving the SessionOperationComplete event, the application then terminates the session. A pointer to an IRTCSession object is extracted from the IRTCSessionOperationCompleteEvent object that was passed to the event using the get_Session property. Then the Terminate method is used to terminate the session.

// Terminate the session
IRTCSession *pSession=NULL;
hr=pEvent->get_Session(&pSession);
// Handle hr != S_OK error

// Terminate the session normally
hr=pSession->Terminate(RTCTR_NORMAL);
// Handle hr != S_OK error

// Release all references

Scenario 2: Interactive Bot

In this scenario, an interactive bot waits for clients to connect and then accepts messages from the clients. After the message is decoded, a response to the message is prepared and returned to the client. As this application could have hundreds of simultaneous IM sessions, it is important to distribute the work across multiple threads.

As part of the initialization process, the RTCClient object is set to automatically accept incoming sessions.

// During initialize phase, set the clients to auto 
// answer connections

hr = m_pClient->put_AllowedPorts(RTCTR_TCP, 
   RTCLM_DYNAMIC);
hr = m_pClient->put_AnswerMode(RTCST_IM, 
   RTCAM_AUTOMATICALLY_ACCEPT);
hr = m_pClient->put_AnswerMode(RTCST_MULTIPARTY_IM, 
   RTCAM_AUTOMATICALLY_ACCEPT);
// Handle hr!=S_OK error for each call

When a message arrives from a client, the OnMessageEvent is called. This routine extracts various pieces of information about the client's message including the session object and the client's name, as well as the type and text of the message. If the type of message is RTCMSET_MESSAGE, then the information in the message is parsed to determine how it should be processed. In this example, messages must begin with QUERYDB in order to be processed.

// Accept incoming calls and handle appropriately

HRESULT OnMessageEvent(IDispatch *pDispatch)
   {
   ...
   
   IRTCMessagingEvent* pME = NULL;
   IRTCParticipant* pPart = NULL;
   IRTCSession *pSession = NULL;
   // Declare other local variables used below.

   hr = pDispatch->QueryInterface(IID_IRTCMessagingEvent,
      (LPVOID *)&pME);
   // Handle hr!=S_OK error

   hr = pME->get_Session(&pSession);
   // Handle hr!=S_OK error

   hr = pME->get_Participant(&pPart);
   // Handle hr!=S_OK error

   hr=pPart->get_Name(&bstrName);
   // Handle hr!=S_OK error
      
   hr=pME->get_Message(&bstrMsg);
   // Handle hr!=S_OK error
       
   // Check the messaging event type so we don't process 
   // status messages
   RTC_MESSAGING_EVENT_TYPE enMessageType;
   hr = pME->get_EventType(&enMessageType);
   // Handle hr!=S_OK error
   if (enMessageType == RTCMSET_MESSAGE)
      {

      // wcsncmp returns 0 when the prefix matches
      if( wcsncmp(bstrMsg, L"QUERYDB ", 
         wcslen(L"QUERYDB ")) == 0)
         {
         // We want to store the session and not pass it 
         // across threads. So we can put the session in
         // some sort of list structure and receive an id
         DWORD dwId = m_pList.Add(pSession);
         
         // Post the command to the worker thread. The id
         // travels in wParam and the text in lParam, so
         // the receiver must free bstrMsg when it is done
         PostThreadMessage(g_dwWorkerThreadID, WM_QUERYDB, 
            (WPARAM) dwId, (LPARAM) bstrMsg);
         }
      }

   // Release all references and clean up
   ...
}
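The prefix test in the handler is easy to get backwards, because wcsncmp returns zero when the strings match. One way to make the intent explicit is to isolate the test in a small helper, as in the following portable sketch. TryExtractQuery is a hypothetical name. On Windows, a BSTR can be passed where the helper expects a const wchar_t pointer.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// Returns true and stores the text that follows the "QUERYDB " prefix
// in 'query' when 'message' starts with that prefix. Returns false and
// leaves 'query' unchanged otherwise.
bool TryExtractQuery(const wchar_t *message, std::wstring &query) {
    static const wchar_t kPrefix[] = L"QUERYDB ";
    const std::size_t prefixLen = std::wcslen(kPrefix);
    if (message == nullptr) return false;
    // wcsncmp returns 0 on a match, so a nonzero result means "no prefix".
    if (std::wcsncmp(message, kPrefix, prefixLen) != 0) return false;
    query.assign(message + prefixLen);
    return true;
}
```

The comparison is case-sensitive, so a message that begins with "querydb " is not treated as a query.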

Rather than process messages inline, the information related to the message is added to a list for processing and a command is posted to a worker thread indicating that there is a new message waiting to be processed. This leaves the main thread free to continue to accept RTC requests as they come in, making the entire application more responsive.

The worker thread waits for messages to be sent. When it receives a message, it translates the message and generates a response. Then a command is posted to the main RTC client thread indicating that the response to the message has been generated and it should be sent to the client.

// The worker thread receives the message and generates 
// a response

MSG msg;
while(GetMessage(&msg, NULL, 0, 0) > 0)
   {
   TranslateMessage(&msg);
   switch(msg.message)
      {
      case WM_QUERYDB:
         {
         BSTR bstrMessage = (BSTR) msg.lParam;
         DWORD dwId = (DWORD) msg.wParam;
         BSTR bstrResultString = NULL;

         // Process the bstrMessage query and get the result
         // in bstrResultString 

         // Free the query text if the sender transferred
         // ownership of it to this thread
         ::SysFreeString(bstrMessage);

         // The id travels in wParam and the result in lParam
         PostThreadMessage(g_dwMainThreadID, 
            WM_QUERYRESPONSE, (WPARAM) dwId, 
            (LPARAM) bstrResultString);

         break;
         }
            
      // Any other custom messages you may have
      default:
         DispatchMessage(&msg);
      }
   }

When the main thread receives the command from the worker thread stating that the response is ready to send, the session is retrieved from the list of sessions and the message is returned to the client.

// Main thread returns response to the client

MSG msg;
while(GetMessage(&msg, NULL, 0, 0) > 0)
   {
   TranslateMessage(&msg);
   switch(msg.message)
      {
      case WM_QUERYRESPONSE:
         {
         BSTR bstrResultString = (BSTR) msg.lParam;
         DWORD dwId = (DWORD) msg.wParam;

         // Retrieve the session from our list:
         IRTCSession *pSession = m_pList.GetSession(dwId);
         if (pSession != NULL)
            {
            hr = pSession->SendMessage(NULL, 
               bstrResultString, 0);
            // Handle hr!=S_OK error
            }
         ::SysFreeString(bstrResultString);

         break;
         }
            
      // Any other custom messages you may have
      default:
         DispatchMessage(&msg);
      }
   }
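The samples in this scenario use an m_pList object with Add and GetSession methods but do not define it. The following sketch shows one possible shape for such a structure: a map from a small integer id to a session pointer. SessionRegistry is a hypothetical name, the Remove method is an addition that the samples do not use, and the class is written as a template so that it compiles without the RTC SDK. In the application, the template argument would be IRTCSession.

```cpp
#include <cstdint>
#include <map>

// Hypothetical stand-in for the m_pList object used in the samples.
// Only the thread that owns the sessions should touch it, so it does
// no locking of its own.
template <typename Session>
class SessionRegistry {
public:
    // Stores the pointer and returns a new id. Ids start at 1 so that
    // 0 can be used to mean "no session".
    std::uint32_t Add(Session *session) {
        const std::uint32_t id = m_nextId++;
        m_sessions[id] = session;
        return id;
    }

    // Returns the stored pointer, or nullptr if the id is unknown.
    Session *GetSession(std::uint32_t id) const {
        typename std::map<std::uint32_t, Session *>::const_iterator it =
            m_sessions.find(id);
        return it == m_sessions.end() ? nullptr : it->second;
    }

    // Forgets the id and returns the pointer so the caller can release it.
    Session *Remove(std::uint32_t id) {
        Session *session = GetSession(id);
        m_sessions.erase(id);
        return session;
    }

private:
    std::uint32_t m_nextId = 1;
    std::map<std::uint32_t, Session *> m_sessions;
};
```

Because only the id crosses the thread boundary, the interface pointer itself never leaves the thread that created it. With COM pointers, Add would typically call AddRef on the session, and the caller of Remove would call Release.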

Scenario 3: Web-Based User Interface

When you build a Web-based user interface, it is very important to distribute the RTCClient objects among multiple threads. Distributing the RTCClient objects among multiple threads limits the amount of work that any one thread has to perform, which allows the application to scale and handle large numbers of clients. In the main thread, requests are received and dispatched to a worker thread that will create the RTCClient object.

This routine looks for a worker thread with fewer than 10 RTCClient objects already allocated. If it can't find a worker thread with fewer than 10 RTCClient objects, a new thread will be created. In either case, a message containing a command to create a new client is passed to the thread.

// Main thread functions and callbacks

// When a new user logs on to the service using 
// web interface
HRESULT OnStartClient()
   {
   ...
   // In this sample, we limit the number of clients 
   // to 10 per thread. This value should be adjusted 
   // as necessary to optimize performance
   if (one of the worker threads has < 10 RTCClient 
      objects)
      {
      // Post to the worker thread to create a new 
      // RTCClient Client Object and log it on
      PostThreadMessage(dwWorkerThreadID, WM_CREATECLIENT,
         ...); 
      }
   else
      {
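      // create a new thread that runs the worker message
      // loop, and pass it the UI callback
      unsigned int uThreadID; // Out parameter
      _beginthreadex(NULL, 0, &RTCControllerThreadProc, 
         (void *) &UpdateWebUI, 0, &uThreadID);
      // PostThreadMessage fails until the new thread has
      // created its message queue, so wait for that first
      PostThreadMessage(uThreadID, WM_CREATECLIENT, ...);
      }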
   }
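The condition in OnStartClient is written as pseudocode. The bookkeeping behind it can be as simple as a per-thread client count, as in the following portable sketch. WorkerPool and AssignClient are hypothetical names, and the sketch only tracks counts. Starting the thread and posting WM_CREATECLIENT remain the job of the surrounding application code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical bookkeeping kept by the main thread: one entry per
// worker thread, holding how many RTCClient objects it currently owns.
struct WorkerPool {
    std::size_t maxClientsPerWorker;
    std::vector<std::size_t> clientCounts;
};

// Picks the worker that should host the next client and records the
// assignment. Adds a new worker entry when every existing worker is
// full. Returns the index of the chosen worker.
std::size_t AssignClient(WorkerPool &pool) {
    for (std::size_t i = 0; i < pool.clientCounts.size(); ++i) {
        if (pool.clientCounts[i] < pool.maxClientsPerWorker) {
            ++pool.clientCounts[i];
            return i;
        }
    }
    pool.clientCounts.push_back(1);  // a real application starts a thread here
    return pool.clientCounts.size() - 1;
}
```

Keeping the limit in a variable rather than hard-coding 10 makes it easier to adjust the number of clients per thread while tuning performance.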

The main thread is still responsible for updating the Web user interface, so each of the worker threads uses the UpdateCallBack callback to send information back to the main thread.

// The callback to be called by the worker thread to
// notify about RTC events

typedef void( * UpdateCallBack)( void * );

void UpdateWebUI(void *pUpdateInfo)
   {
   // Update the UI based on the parameter
   }

In the worker thread, messages are received and decoded to determine which function should be performed. For example, a WM_CREATECLIENT message results in the CreateAndLogon routine being called. Other messages trigger other routines within the worker thread. Also a reference to the callback routine is saved so that the worker thread can post commands with relevant information to the main thread and update the Web user interface through the callback routine.

// Worker thread process and message loop

unsigned int __stdcall RTCControllerThreadProc(
   void *pParam )
   { 
   // Store the Callback function as a member:
   m_pCallBack = (UpdateCallBack) pParam;

   MSG msg;
   while(GetMessage(&msg, NULL, 0, 0) > 0)
      {
      TranslateMessage(&msg);

      // Sample Command: Create a new client on this 
      // thread and log it on
      if (msg.message == WM_CREATECLIENT)
         {
         CreateAndLogon( );
         }

      // ... Process other commands
      else
         DispatchMessage(&msg);
      }

   return 0;
   }

The CreateAndLogon routine creates a new instance of the RTCClient object, requests its IRTCClient2 interface, and then performs any other tasks needed to initialize the client session.

// Sample logon code for the worker thread

HRESULT CreateAndLogon()
   {
   // Create Client and Logon

   IRTCClient2 *m_pClient = NULL;

   HRESULT hr = ::CoCreateInstance(CLSID_RTCClient, NULL,
      CLSCTX_INPROC_SERVER, __uuidof(IRTCClient2), 
      (LPVOID *) &m_pClient);
   // Check hr != S_OK

      // Log the client on...

      return S_OK;
   }

As RTC events occur, the following event handler is called. This event handler lives in the worker thread because RTC interface pointers can't be used across threads. Depending on the event that was triggered, the event handler can request that the main thread return information to the client through the callback routine.

// Handle events from the RTC API

void RTCEventHandler(RTC_EVENT enEvent, IDispatch *pDispatch)
   {
   // An RTC Event Occurred. Notify the Web UI. 
   // Here package the information in a separate data 
   // structure. Passing the IRTC* interface pointers 
   // across threads is not allowed
   (*m_pCallBack)(...);
   }
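The handler leaves open what the "separate data structure" looks like. One possibility, sketched below in portable C++, is a plain record that holds copies of the values the Web user interface needs and no interface pointers. The UiUpdate structure and the NotifyUi function are hypothetical. Only the UpdateCallBack typedef is taken from the samples.

```cpp
#include <string>

// Same callback shape as the UpdateCallBack typedef in the samples.
typedef void (*UpdateCallBack)(void *);

// Hypothetical plain-data record: it holds copies of the values the
// Web UI needs and no RTC interface pointers.
struct UiUpdate {
    int eventType;           // numeric event code
    unsigned long clientId;  // application-defined id of the RTC client
    std::wstring text;       // copied message or status text
};

// Copies the event details into a UiUpdate and hands it to the callback.
// The record lives on the caller's stack, so the callback must copy
// anything it wants to keep after it returns.
void NotifyUi(UpdateCallBack callback, int eventType,
              unsigned long clientId, const wchar_t *text) {
    if (callback == nullptr) return;
    UiUpdate update;
    update.eventType = eventType;
    update.clientId = clientId;
    update.text = (text != nullptr) ? text : L"";
    callback(&update);
}
```

A direct call such as this one runs the callback on the calling thread, so the callback is responsible for handing the copied data to the thread that updates the user interface.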

Conclusion

Building a scalable application is often based on looking for facilities that the application doesn't need or use, and then turning them off. Although saving a few bytes of memory here and there might be insignificant to an application servicing one user, the savings can really add up when multiplied by hundreds or thousands of users. Using the techniques described in this paper, you can design scalable RTC applications to meet the needs of any organization.

Related Links

See the following resources for further information: