Using the RTC Client API

 

Microsoft Corporation

November 2003

Applies to:
    Microsoft® Real-Time Communications Client API version 1.2
    Microsoft Office Live Communications Server
    Instant messaging applications

Summary: Microsoft Windows Messenger 5.0 relies on the Real-Time Communications (RTC) Client application programming interface (API) to handle real-time communication. Microsoft has published an updated version of the RTC Client API so that developers can extend their own applications' behavior with real-time communication features. This article provides a brief introduction to some of the key issues related to incorporating the RTC Client API into a C++ application. (45 printed pages)

Contents

Introduction
Basic Concepts
Benefits of the RTC Client API
Session Types to Use with the RTC Client API
Supported Functionality
Security
Programming with the RTC Client API
Programming Considerations
Conclusion
Related Links

Introduction

Real-time communications are becoming a critical part of today's business world. Applications such as Microsoft® Windows® Messenger are becoming commonplace in many organizations. Developers also need tools to either create their own custom real-time applications that can be tailored to a particular business need, or to integrate real-time communications into their existing line-of-business (LOB) applications.

Real-world collaboration application examples include one that assists customers in finding and interacting with support professionals. Another allows corporate employees to collaborate with one another through such RTC Client API features as presence information and instant messaging.

The RTC Client API version 1.2 enhancements help build more secure, versatile, and reliable collaboration and communication applications. This paper explores the fundamentals of using the RTC Client API, beginning with an overview of different areas of functionality that the RTC Client API exposes. The provisions in the RTC Client API that help make for more secure real-time communication are also discussed.

The following sections cover how to use the RTC Client API, including: how to log on to a Session Initiation Protocol (SIP) Server; how to determine presence information of individuals; how to establish a session with another user; and how to create and persist contact lists. This paper also covers side-by-side execution, and using multiple types of devices.

Although this paper focuses on using the RTC Client with C++, any programming language that can access COM components can take advantage of this API. This includes .NET languages, such as Microsoft Visual Basic® .NET and Microsoft Visual C#®, which can access the RTC Client API through COM interoperability.

Basic Concepts

The following section outlines some of the basic concepts that developers need to be aware of before building their applications. By understanding these concepts, the developer can build more effective applications for their needs.

Session

A real-time communications session is an established conversation between two or more real-time users (clients), and involves one or more types of media. The clients might use a Session Initiation Protocol (SIP) server to route the signaling traffic, or they might use a peer-to-peer session in which no SIP server is involved.

SIP Server

A SIP server helps users' real-time communications applications find one another and establish a real-time communications session. The SIP server can also help clients achieve a higher level of security, manage presence information, and persist contact lists for users.

A SIP server can act as a SIP proxy that routes and redirects SIP messages across SIP clients. It can also maintain a database of network locations of SIP clients and ensure that all SIP clients update their location information. Lastly, it can act as a Presence Agent Server (PAS) to maintain presence documents for all registered users, and notify other users whenever presence information changes.
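The location database role described above can be pictured with a small, self-contained C++ sketch. It is only an illustrative model: the LocationDatabase class and its method names are hypothetical, are not part of the RTC Client API or of any SIP server, and ignore details such as registration expiry.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical in-memory model of the location database a registrar keeps.
class LocationDatabase {
public:
    // Records that 'userUri' is reachable at 'location' (one entry per device).
    void registerLocation(const std::string& userUri, const std::string& location) {
        locations_[userUri].insert(location);
    }

    // Removes one location; drops the user entirely when no locations remain.
    void unregisterLocation(const std::string& userUri, const std::string& location) {
        auto it = locations_.find(userUri);
        if (it == locations_.end()) return;
        it->second.erase(location);
        if (it->second.empty()) locations_.erase(it);
    }

    // Returns every known location for the user (empty if not registered).
    std::set<std::string> lookup(const std::string& userUri) const {
        auto it = locations_.find(userUri);
        if (it == locations_.end()) return std::set<std::string>();
        return it->second;
    }

private:
    std::map<std::string, std::set<std::string>> locations_;
};
```

Keeping a set of locations per user, rather than a single value, leaves room for a user who is registered from more than one device at the same time.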

Presence Information

Presence information is an indicator of the availability of a user for real-time communication activities. It is also an indicator of the level of the user's activity for the device publishing the presence information. The presence information can be as simple as whether the user is online or offline, or as complex as the user's meeting schedule.

The user's presence information is an aggregate published by all the communication devices he or she is using.

Contact Lists and Watchers

Each user has a list of other users they wish to communicate with on a regular basis. This list is referred to as a contact list. A user actively tracks the presence information of others in the contact list. Tracking the presence information of a contact is referred to as "watching presence," and the user watching presence is called a "watcher."

Roaming

Users should be able to access information about their communication service account, such as contact lists, authorization rules, and other user preferences, from whatever device they happen to be using. With this information persisting on the server, the user can roam freely from one device to another without losing that information. The features in the RTC Client API that allow an application to keep user account information on the server and allow a user to access it are referred to as "roaming."

Benefits of the RTC Client API

The RTC Client API is much more than a tool to build text-based, instant messaging applications. It allows developers to create truly multimodal communication applications, including PC-to-PC, PC-to-phone, phone-to-phone, and instant messaging. It also supports audio and video calls, text-based instant messaging, application sharing, and whiteboard. The RTC Client API also works with a registrar server to manage presence information, which facilitates communications between individuals.

Building Applications by Using the RTC Client API

In addition to supporting real-time communications between computers, the RTC Client API facilitates voice communications between computers and telephones, and between two or more telephones. Depending on the type of application, a user at one computer can call a user at another computer, or call a telephone. Similarly, a telephone can call another telephone or a computer. All communications occur over an Internet Protocol (IP) network, or through devices attached to the IP network.

Of course, the equipment on both sides of the communication connection must be compatible for the particular type of communication. This means that a computer must have a microphone and speakers in order to facilitate a call. Likewise, if you want to have a two-way video conversation, both computers must include cameras, microphones, and speakers. A compliant server is required for phone-to-phone calls.

There are two classes of applications that can be built using the RTC Client API: general desktop applications aimed at user interaction, and scalable applications that provide a service on behalf of the user. The following sections outline these types of applications.

Desktop Applications

Windows Messenger is typical of the desktop class of applications. It allows two or more parties to communicate in real time by typing messages. Often, instant messaging applications support other communication technologies, such as voice and video. Windows Messenger 5.0 supports these types of communications, and adds some useful features, such as the ability to communicate with telephones and the ability to share applications between two clients.

Another good example of a desktop application is a stock-purchasing system that allows users to manage, buy, and sell orders. From this system, a user can determine which orders are incoming, outstanding, or completed. By integrating real-time communications into this application, a developer can provide the user with the ability to see that an incoming order is invalid and immediately determine if the individual placing the order is online. If the individual is online, the user can, with one simple click, immediately communicate with that individual to clarify the order.

Scalable Applications

The RTC Client API can also be used to build more scalable applications, such as a notification bot. A notification bot tracks the presence information of hundreds of users interested in being notified when a certain event occurs. It sends out a real-time message to these users when the event occurs. Another example is an interactive bot. This type of application accepts incoming real-time session requests from authenticated users and helps them through interactive conversation. It does so by accessing a knowledge base, or by putting them in touch with a customer service representative.

Session Types to Use with the RTC Client API

The RTC Client API can be used to build a rich set of communication sessions. These session types use different types of media in order to bring users together. The media types can be used for person-to-person communications, as well as for application-to-application or application-to-user communications.

Instant Messages

The primary focus of instant messages is person-to-person communications. With this media type, a user is able to type short text messages that one or more participants can share in an instant messaging (IM) session. This is often a more efficient method of communicating than e-mail, since responses are typically much quicker.

Enhanced Instant Messages

The RTC Client API opens the opportunity for performing more advanced types of communications in which one of the parties is a computer application. The application can send notifications of particular events, or can allow a user to interact with a remote computer in real time.

Audio and Video

Even though IM is a non-intrusive form of communication, it lacks some of the richness needed for detailed communication. The RTC Client API includes the ability to start an audio and video session in which each participant can be seen and heard by the other.

Telephony

While the vast majority of telephones are on the public switched telephone network (PSTN), it is becoming increasingly cost-effective for organizations to use IP-based telephone networks. Doing so requires specialized telephones, or specialized adapters that allow conventional telephones to attach directly to the TCP/IP network. Using the RTC Client API, it is possible to establish a session with a conventional telephone that is connected to the PSTN. An Internet Telephony Service Provider (ITSP) provides a gateway between the IP network and the PSTN. The RTC Client API can also connect voice sessions between a PC and IP telephony devices.

Application Sharing

When the RTC Client API application is facilitating PC-to-PC communications, users can share applications. This allows two people to use the same program, which can be any standard Windows application, such as Microsoft Outlook® or Microsoft Internet Explorer.

Application Specific Sessions

These standard forms of media might not be rich enough for some types of communication. There are applications written to communicate additional context information as part of the session, or to use the rendezvous technologies available as part of the RTC Client API to start completely new types of media sessions. For example, users might want to start a new multiparty game with their friends. The participants' application can use the RTC Client APIs to determine the availability of other players and establish a real-time session to control the gaming media.

Supported Functionality

The RTC Client API can be used to create various functionalities around real-time communication scenarios. This section details the most common scenarios supported by the API.

Registration and Provisioning

Each RTC client registers its location with the SIP registrar so that other clients can find its exact location. To create a registration session with the SIP registrar, the application should create an RTC profile and enable it by passing the profile object to the EnableProfileEx method.

The RTC profile, also called the provisioning profile, stores information that allows the client to access services on the network. The provisioning profile sets the capabilities of the user and stipulates which types of sessions the client can initiate, along with the domain and servers that can be accessed.

The provisioning profile also provides the API version, user settings, server settings, and domain settings that the RTC Client API uses to initiate a session. The client can also specify an ITSP, or a third-party, corporate-deployed server to provide telephony services. Multiple servers, such as the SIP proxy or registrar server, or the gateway server for calls routed over the PSTN network, can also be listed in the profile.

The client application is responsible for creating the XML profile that stores the client's provisioning information. Applications can obtain the profile object from the XML string by using the CreateProfile method, or can use the GetProfile method to let the RTC Client API generate the profile.

If the application calls the GetProfile method and passes NULL for the SIP registrar parameter, the transport mechanism used with the RTC Client API will do a Domain Name System (DNS) query to determine that information. Separate profiles can be used for different types of services by using a variety of providers, proxy servers, or gateways. The refresh time of the SIP registration session created as a result of enabling the profile is decided by the server and cannot be changed by the application.

Publishing Presence

Presence information indicates the user's availability for collaboration activities, as well as their current collaboration activities. The RTC Client API allows users to publish their location and online status information, as well as more extensive presence information, such as user-defined data, if the server supports presence roaming.

Users can log on from multiple devices at the same time, and set the online status information for each device separately; they aren't limited to a single presence device. Watchers receive presence information for all the devices that a user is logged on to.

Using the RTC Client API, the client application can act as the Presence Agent (PA) for the user. In this case, the contact list is stored on the local computer, as is the presence information for each active contact. This limits the user to a single presence device that stores all the contact information.

The RTC Client API also supports a mode in which the SIP server acts as the Presence Agent Server (PAS) for the client, if the roaming features for presence are enabled. When the server acts as the PAS, the client supplies presence information to the server and the server distributes this information to the watchers.

For the client to enable the presence functionality and publish presence information, the application must first create a profile and then register it with the server. Before registering, presence must be enabled by using the EnablePresenceEx method.

The RTC Client API supports setting the online status, and some special notes about the user's state. The RTC Client API also supports setting extensible presence information, such as e-mail address, display name, and phone number. The IRTCClientPresence2::put_PresenceProperty method is called to set each of these properties separately.

The RTC Client API also allows the user to set user-defined data, such as global positioning information, user action, or user capabilities. The application calls the IRTCClientPresence2::SetPresenceData method to set the user-defined presence data. An event that exposes the IRTCPresenceDataEvent interface is fired when the operation is complete.

The RTC Client API version 1.2 enables users to retrieve presence status and presence information for several different presence devices at the same time. A presence device is any device that a user can register with the SIP server.

For example, a user can be registered on the server as actively present on a SIP-aware telephone, a laptop, and a mobile computing device. When a user registers on the server from more than one device, they create Multiple Points of Presence (MPOP) for the watchers to view their presence information.

Using MPOP requires that the server supports this feature. The Microsoft Office Live Communications Server, which implements SIP, supports MPOP. The RTC Client API includes interfaces to enumerate the presence devices for a specific client. When a user watches a client who is currently available on a number of presence devices, the user can enumerate through all the presence devices and retrieve information for each device—including enhanced presence information, if available.

The following information is available for each presence device for a contact:

  • Notes. Typically a short note on the device.
  • Status. An enumerated status value, such as online, offline, away, or idle.
  • Presence Property. A standard, enumerated, presence property. This can include presence information, such as e-mail address, phone number, or display name. The PAS usually decides whether to include this information in the presence document.
  • Presence Data. A string of data that contains enhanced presence data. This may be an XML string, and can contain any type of presence information, such as global positioning system (GPS) information or Outlook calendar data.
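As a rough illustration of enumerating a contact's presence devices, the following self-contained C++ sketch models each device with the notes, status, and presence data fields listed above. The PresenceDevice struct, the DeviceStatus values and their ordering, and the mostAvailableDevice function are all hypothetical; they are not the RTC Client API interfaces, and the ranking of idle below away is an arbitrary assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical status values, ordered from least to most available.
// The relative order of Idle and Away is an assumption for this sketch.
enum class DeviceStatus { Offline = 0, Idle = 1, Away = 2, Online = 3 };

// One presence device as a watcher might see it.
struct PresenceDevice {
    std::string notes;         // short note on the device
    DeviceStatus status;       // enumerated status
    std::string presenceData;  // opaque enhanced presence string (may be XML)
};

// Returns the index of the most available device, or -1 if the list is empty.
// When several devices share the best status, the first one wins.
int mostAvailableDevice(const std::vector<PresenceDevice>& devices) {
    int best = -1;
    for (std::size_t i = 0; i < devices.size(); ++i) {
        if (best < 0 || devices[i].status > devices[best].status) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

A watcher application could use a ranking like this to decide which device to show first; the decision about which device actually receives a session invitation remains with the server.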

Usually, registering from multiple devices creates other issues, such as which device to route the SIP session invitation to. These issues need to be handled by the SIP registrar and proxy that the client is working with.

Contact Management

Users need to manage their contact lists. The contacts are tracked for their presence information. The RTC Client API allows the application to add a Uniform Resource Identifier (URI) to the contact list by calling the AddBuddyEx method. The AddBuddyEx method also creates a presence subscription session for the URI, unless the contact is polled, or is marked as always online or always offline.

The contact list is persisted in a file on the user's computer. Contacts can be organized into several contact groups. The RTC Client API allows the applications to create multiple contact groups, and to add contacts to various groups. The same contact can be added to multiple groups.
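The relationship between contacts and groups can be sketched in a few lines of self-contained C++. This is a hypothetical in-memory model, not the RTC Client API: the ContactList class and its methods are invented for illustration, and persistence to a file is left out.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical model of a contact list in which one contact can be in many groups.
class ContactList {
public:
    void addContact(const std::string& uri) { contacts_.insert(uri); }
    void createGroup(const std::string& name) { groups_[name]; }

    // Adds an existing contact to an existing group; returns false otherwise.
    bool addToGroup(const std::string& uri, const std::string& group) {
        auto it = groups_.find(group);
        if (it == groups_.end() || contacts_.count(uri) == 0) return false;
        it->second.insert(uri);
        return true;
    }

    // Lists every group that the contact belongs to.
    std::set<std::string> groupsOf(const std::string& uri) const {
        std::set<std::string> result;
        for (const auto& entry : groups_) {
            if (entry.second.count(uri) != 0) result.insert(entry.first);
        }
        return result;
    }

private:
    std::set<std::string> contacts_;
    std::map<std::string, std::set<std::string>> groups_;
};
```

Storing group membership as sets of URIs, rather than storing a single group on each contact, is what lets the same contact appear in several groups.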

The RTC Client API also supports roaming contact lists. This feature allows the user to store contact lists on the SIP server, thus enabling access to the same contact list, regardless of location.

Roaming may not be available on all SIP servers. However, it is supported on Live Communications Server. If the SIP server supports a roaming contact list, the client application can store contacts in a centralized location, as well as on the local device.

When a registered user logs on, the current client device will be updated with this stored information. The RTC Client API will propagate a user's changes to this information to the SIP server. If the server does not support the storage of roaming data, the information will be stored locally only if local storage was set while enabling this profile.

To receive roaming events, the event filter must be set to allow RTCE_ROAMING events. When an event of this type is handled, the IRTCRoamingEvent interface will be present on the event object, and the type of the event can be returned by calling IRTCRoamingEvent::get_EventType. When a user registers from multiple devices, the updates in the properties of a contact or a group object are also conveyed to the client through the roaming session. An RTCE_BUDDY event is fired to indicate the changes in a contact's properties. An RTCE_BUDDY_GROUP event is fired to indicate the changes in a contact group's properties.

Polling Presence

Adding a contact to the list is a two-step process: adding the contact, and creating a subscription session for the contact. An application might want to retrieve the presence information for a particular URI only upon request; it can choose not to be notified by the server every time the URI's presence information changes.

This can be achieved by creating a contact of type RTCBT_POLL, using the AddBuddyEx method. This method creates a request to the server for the presence information associated with the specified SIP URI. The format of the presence document sent by the server is the same as for a subscribed contact. The application isn't notified when the presence information changes. A new request is sent to the PAS every time the application calls Refresh on the associated IRTCBuddy object to retrieve this user's latest presence information, unless the information is retrieved from the cache.
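The difference between polling and subscribing can be shown with a small, self-contained C++ model. The PolledContact class below is hypothetical and does not use the RTC Client API; the fetch callback stands in for the request that a real client would send to the PAS.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical model of a polled contact: presence is fetched only on request.
class PolledContact {
public:
    // 'fetch' stands in for the request that would be sent to the PAS.
    PolledContact(std::string uri,
                  std::function<std::string(const std::string&)> fetch)
        : uri_(std::move(uri)), fetch_(std::move(fetch)), requests_(0) {}

    // Sends one new request and caches the answer.
    void refresh() {
        cached_ = fetch_(uri_);
        ++requests_;
    }

    // Reads the cached presence without contacting the server.
    const std::string& presence() const { return cached_; }

    // Number of requests sent so far.
    int requestCount() const { return requests_; }

private:
    std::string uri_;
    std::function<std::string(const std::string&)> fetch_;
    std::string cached_;
    int requests_;
};
```

In this model a change on the server is invisible to the application until refresh is called again, which is the trade-off a polled contact makes in exchange for not holding a subscription session open.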

Instant Messaging

Instant messaging is one of the most widely used features of the RTC Client API. The application can create instant messaging sessions with other SIP clients by using the CreateSession method with a session type of RTCST_IM. Apart from sending out instant messages, the application can also send out the activity status of the user in the context of the IM session (for example, that the user is typing) by calling the IRTCSession::SendMessageStatus method.

The RTC Client API also supports multiparty instant messaging (MIM) sessions and branching messages, which are preferred to simple IM sessions. To start MIM, the application needs to create a specific session of type RTCST_MULTIPARTY_IM. Once a multiparty session is created, participants can be added through the IRTCSession interface returned by CreateSession.

To determine if participants can be added to this session, the application should retrieve the IRTCSession::get_CanAddParticipants property. For ideal performance, a limit should be set on the number of participants added to any given session. A good limit is between six and ten participants.
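One way an application might enforce its own participant limit is sketched below in self-contained C++. The MultipartySession class is a hypothetical model rather than the IRTCSession interface, and the limit is whatever value the application chooses.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical multiparty IM session with an application-chosen participant cap.
class MultipartySession {
public:
    explicit MultipartySession(std::size_t maxParticipants)
        : maxParticipants_(maxParticipants) {}

    // Mirrors the idea of checking whether participants can be added first.
    bool canAddParticipants() const {
        return participants_.size() < maxParticipants_;
    }

    // Returns false if the cap is reached or the URI is already in the session.
    bool addParticipant(const std::string& uri) {
        if (!canAddParticipants()) return false;
        return participants_.insert(uri).second;
    }

    std::size_t participantCount() const { return participants_.size(); }

private:
    std::size_t maxParticipants_;
    std::set<std::string> participants_;
};
```

An application following the guidance above might construct the session with a cap somewhere between six and ten.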

The current participants in an active session can be retrieved as an enumerated list by calling IRTCSession::EnumerateParticipants. When a multiparty session is established, messages can be transmitted by calling IRTCSession::SendMessage, and the message will be forked to all participants.

A client can receive an incoming IM session that already has multiple participants. The session state object reflects this. A receiving client can retrieve an IRTCSession interface from the IRTCSessionStateChangeEvent interface passed when the RTCE_SESSION_STATE_CHANGE event is received, and obtain a list or collection of active participants by calling EnumerateParticipants or get_Participants.

When the client receives an incoming MIM session, all participants within the session will have a participant state of RTCPS_INCOMING, indicating that they have been notified of the client's addition to the session. After the session has been successfully established, the participants' states will change accordingly.

Participant state changes are tracked through the participant state change event (RTCE_PARTICIPANT_STATE_CHANGE) that is sent to all other participants in a session. Error codes are supplied for failed invitations, network timeouts, the lack of multiparty support, insufficient security levels, and existing participation. These status codes can be obtained by calling the IRTCParticipantStateChangeEvent::get_StatusCode method.

Multimedia Calls

Applications can use the RTC Client API to create and receive multimedia calls between SIP endpoints. Unlike the IM sessions, these calls are restricted to two participants only.

The application should create a session of type RTCST_PC_TO_PC in order to create a multimedia session. The value of the PreferredMedia property on the RTCClient object is used to decide which media should be added to the new call.

The application can choose to add or remove media streams during the lifetime of the call. The supported media streams include audio, video, and T.120. The T.120 media stream is used for creating an application-sharing, or whiteboard, session. The application can choose to create a media stream in send or receive mode. Only one audio, video, or T.120 call is allowed for a given RTCClient object.

Call Control

The RTC Client API supports call control features that allow traditional PSTN-type call controls, such as forward, hold, and transfer (or refer) on audio or video RTC sessions.

Hold and Unhold

Placing the call on hold stops all media streams associated with the call, with no confirmation needed from the party that is placed on hold.

The Hold method requires a cookie that the RTC Client API uses to track the session placed on hold. Subsequent calls to Unhold use the same cookie to identify which session to place off hold. More than one session at a time can be placed on hold.

The session state of the initiating party changes to RTCSS_HOLD if the operation succeeds. The application for the party that is placed on hold receives notification through the IRTCMediaEvent interface that the media stream for the call has stopped. For the party that is placed on hold, the session state remains CONNECTED while the call is on hold.

The party that is placed on hold does not have the option to allow or disallow placing the call on hold. When the call is taken off hold, an IRTCMediaEvent notification is fired to indicate that the media stream has started again. The application that initiated the hold operation is the only one that can take a call off hold. The session state changes to CONNECTED when Unhold succeeds. Note that a session can be disconnected while it is in the hold state.
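The cookie-based bookkeeping for hold and unhold, as seen by the initiating party, can be modeled with the following self-contained C++ sketch. The HoldTracker class and the SessionState values are hypothetical stand-ins, not the RTC Client API types, and the sketch assumes that the application picks a distinct cookie for each session.

```cpp
#include <cassert>
#include <map>

// Hypothetical session states as seen by the party that initiates the hold.
enum class SessionState { Connected, Hold, Disconnected };

// Tracks sessions by an application-chosen cookie.
class HoldTracker {
public:
    // Registers a connected session under the given cookie.
    void addSession(long cookie) { sessions_[cookie] = SessionState::Connected; }

    // Hold succeeds only for a connected session.
    bool hold(long cookie) {
        return transition(cookie, SessionState::Connected, SessionState::Hold);
    }

    // Unhold succeeds only for a session that is currently on hold.
    bool unhold(long cookie) {
        return transition(cookie, SessionState::Hold, SessionState::Connected);
    }

    // A known session can be disconnected even while it is on hold.
    void disconnect(long cookie) {
        auto it = sessions_.find(cookie);
        if (it != sessions_.end()) it->second = SessionState::Disconnected;
    }

    // Throws std::out_of_range if the cookie is unknown.
    SessionState state(long cookie) const { return sessions_.at(cookie); }

private:
    bool transition(long cookie, SessionState from, SessionState to) {
        auto it = sessions_.find(cookie);
        if (it == sessions_.end() || it->second != from) return false;
        it->second = to;
        return true;
    }

    std::map<long, SessionState> sessions_;
};
```

Because each session is keyed by its own cookie, several sessions can be on hold at once, and an unhold request for a session that was disconnected in the meantime simply fails.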

Forwarding

An RTC Client API application can forward incoming calls for IM, MIM, telephony, and multimedia sessions, if the session state is RTCSS_INCOMING. The call is forwarded to another SIP URI that can be reached with the help of the SIP server. The original session becomes disconnected once the forward operation is completed.

The calling application has two options to choose from when a call that it initiates is forwarded: automatically allow the redirected call to occur, or receive notification of the incoming redirection. If the application chooses to be notified about the redirection, the RTC Client API will not automatically create the new session to the forwarded-to URI.

Transfer (Refer)

Session transfer (or refer) is allowed only on existing sessions that are either voice, video, or MIM with only two participants. Call transfer can be initiated by either party in the session, and each party decides, independently, whether to maintain the call until the transfer is complete.

The recipient of a transferred call can look at the session cookie and the URI of the referring party to determine whether or not to accept the call. The party being transferred is notified of the transfer, and has the option to accept or reject the transfer. The transferred party notifies the transferring party about the state of the new call to the transferred-to endpoint.

To implement a consultative transfer, the transferring party can first connect to the transferred-to party after putting the transferred party on hold. Once the transferred party receives notification about being transferred, the application can then initiate a new call to the transferred-to party.

Session Negotiation

With the RTC Client API, the application has the choice to handle session negotiations for each new session or to let the RTC Client API automatically handle the session negotiations.

If the client application decides to handle negotiations, it can choose to handle outgoing INVITE requests, incoming INVITE requests, or both. The application has access to the body of the SIP INVITE request to handle session description negotiations.

At any time during the session, it is also possible for the client application to renegotiate the session description. This can be Session Description Protocol (SDP) data, or it can be another type of session descriptor known to the application. To handle the media negotiations on its own, the application should create a session of type RTCST_APPLICATION.

The Session Description Protocol (SDP) found in RFC 2327 is commonly used to describe a multimedia session for the purpose of initiating a session. The SDP is used to advertise a multimedia conference and convey information about the media streams. The SDP contained in the SIP INVITE request includes details of the session, such as the session name and purpose.

Although the RTC Client API uses the SDP to negotiate session media streams, the application may decide to provide another type of session-negotiation protocol in the body of the SIP INVITE request. The media-stream negotiations are entirely up to the parties involved.

The RTC Client API application is responsible for forming the body of the SIP INVITE message. The application has an option to disable the RTC media stack globally. If the media stack is disabled globally, the application needs to handle the session negotiations for all incoming or outgoing sessions involving voice, video, or T.120 media streams. The application can also selectively enable or disable the RTC media stack for certain sessions by using IRTCSessionDescriptionManager::EvaluateSessionDescription. If the application enables the RTC media stack, then it can select which incoming media sessions (by calling Accept(mediaTypeSelected)) or outgoing media sessions (by calling put_PreferredMediaType()) to negotiate.

User Search

Searching for users is easily implemented as part of an RTC Client API application. The IRTCUserSearch interface has two methods: IRTCUserSearch::CreateQuery, which generates a query object; and IRTCUserSearch::ExecuteSearch, which takes a query object and performs the search, returning the results through the search results event (RTCE_USERSEARCH).

The RTC Client API application is allowed to set a number of properties on the query object that filter the search query based on the Lightweight Directory Access Protocol (LDAP) user object attributes. These attributes are available in the server user database and can be e-mail addresses, office numbers, user names, and others. Each term also has a corresponding search string that is used as a prefix filter (the supplied string value followed by zero or more characters) for all search terms of a given type.

For example, a search term might be set for the user object attribute with an LDAP name of "telephoneNumber" and a search string of "(425)". This would retrieve all user entries with the phone number area code "425," formatted with parentheses. The search performance is entirely dependent on the server implementation.
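The prefix-filter behavior can be illustrated with a short, self-contained C++ sketch that runs against an in-memory list instead of a server. The UserEntry type and the matchesPrefix and searchByPrefix functions are hypothetical names invented for this example; a real search is performed by the server.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// A user entry as a bag of attribute name/value pairs (hypothetical model).
typedef std::map<std::string, std::string> UserEntry;

// True if 'value' is 'prefix' followed by zero or more characters.
bool matchesPrefix(const std::string& value, const std::string& prefix) {
    return value.compare(0, prefix.size(), prefix) == 0;
}

// Returns the entries whose attribute 'name' matches the prefix filter.
// Entries that lack the attribute never match.
std::vector<UserEntry> searchByPrefix(const std::vector<UserEntry>& users,
                                      const std::string& name,
                                      const std::string& prefix) {
    std::vector<UserEntry> results;
    for (const UserEntry& user : users) {
        UserEntry::const_iterator it = user.find(name);
        if (it != user.end() && matchesPrefix(it->second, prefix)) {
            results.push_back(user);
        }
    }
    return results;
}
```

With this model, searching the "telephoneNumber" attribute for "(425)" returns only entries whose stored number begins with those exact characters, so a number stored without parentheses would not match.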

Some common user object attribute LDAP names are:

  • User display name—"displayName"
  • User first name—"givenName"
  • User last name—"sn"
  • Company name—"company"
  • User street address—"streetAddress"
  • Last time user logged on—"lastLogon"

Apart from the standard search terms supported in the RTC Client API, the application can specify certain search terms specific to the SIP server that is performing the search operation. For example, the Live Communications Server matches the value of the search parameter "msRTCSIP-PrimaryUserAddress" against a user's SIP URI. The RTC Client API allows the application to set a maximum search wait time, as well as a limit on the number of results returned. The default maximum search time is 30 seconds.

Phone Calls

The RTC Client API supports functionality to create and receive phone calls. The application can create a phone call by creating an RTC session of type RTCST_PC_TO_PHONE. The RTC Client API will then choose a PC-to-phone SIP proxy to connect this call.

The application should create the profile with a PC-to-phone proxy in order to create phone calls. The application can also create a phone call by adding a participant with a TEL URI to a PC-to-PC session. If a TEL Uniform Resource Locator (URL) is specified, the RTC Client API creates a SIP INVITE with the to: header set to "to: phone-number".

RFC 2806, URLs for Telephone Calls, defines the formats for valid TEL URLs; however, RTC Client API version 1.2 supports only a subset of those formats. The application is responsible for creating a valid TEL URL, and the number should follow the canonical phone number format. The RTC Client API removes any qualifiers and sends only the canonical number to the gateway, which ultimately determines if the phone number is acceptable. The following is an example of a valid TEL URL: TEL: +1-425-555-0123 (TEL URL with global phone number format).
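
Because the application is responsible for supplying a valid TEL URL, it can be useful to check and normalize the string before passing it to the API. The following is a minimal sketch of one way to do that. The function name NormalizeTelUrl and the rules it applies (accept only a global number that starts with "+", drop visual separators, and discard anything after a ";" qualifier) are assumptions made for illustration; they are not the RTC Client API's documented parsing behavior.

```cpp
#include <cctype>
#include <string>

// Illustrative normalization only; the rules below are assumptions,
// not the RTC Client API's documented parsing behavior.
// Returns an empty string when the input is not a global-format TEL URL.
std::string NormalizeTelUrl(const std::string& url) {
    // Require a case-insensitive "tel:" scheme.
    if (url.size() < 4) return "";
    const char scheme[] = "tel:";
    for (int i = 0; i < 4; ++i) {
        if (std::tolower(static_cast<unsigned char>(url[i])) != scheme[i]) return "";
    }

    std::string number;
    for (std::string::size_type i = 4; i < url.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(url[i]);
        if (c == ';') break;  // drop qualifiers that follow the number
        if (c == '+' && number.empty()) { number += '+'; continue; }
        if (std::isdigit(c)) { number += static_cast<char>(c); continue; }
        // Skip visual separators.
        if (c == '-' || c == '.' || c == '(' || c == ')' || c == ' ') continue;
        return "";  // any other character is rejected
    }

    // A global number must start with '+' and contain at least one digit.
    if (number.size() < 2 || number[0] != '+') return "";
    return number;
}
```

Applied to the example above, "TEL: +1-425-555-0123" yields "+14255550123", while a string without the "+" prefix or with a different scheme is rejected.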

Security

The RTC Client API has enhanced support for communicating with other SIP endpoints in a manner that enhances security. Security settings for authentication and encryption can be set programmatically. Writers of RTC Client API applications should strongly consider enabling the security features available in the RTC Client API.

In order to use these security features, the client needs to log on to a SIP server that supports them. The SIP communication is not secure when the client is working in a peer-to-peer mode.

The following topics outline the enhancements and requirements that enable these security features.

Authentication

A SIP server will challenge the client to obtain authentication information. The RTC Client API supports several password-based authentication mechanisms to authenticate the client to the server, including Kerberos, NTLM, Digest, and Basic. For RTC Client API version 1.2 applications, the allowed authentication methods are specified in the allowedauth property of the IRTCProfile2 interface.

If the IRTCProfile2::put_AllowedAuth method is not called and the "auth" attribute is absent from the <sipsrv /> tag in the profile XML, the RTC Client API allows the NTLM and Kerberos authentication mechanisms by default. When the RTC client is challenged by the server, the most secure authentication method allowed by both the client and the server is used. The priority order for authentication methods is Kerberos, NTLM, Digest, and Basic. The application can also specify that the default Windows NT user logon credentials be used for authentication by specifying logoncred in the allowedauth property flags. The logoncred method cannot be the only method specified in the allowedauth property; at least one other authentication type, such as Kerberos or NTLM, must also be specified.

The server might not challenge the client, in which case the client is never validated. When the Basic authentication scheme is used, the password is sent to the server in plain text, so the transport must always be Transport Layer Security (TLS). This ensures that the password is encrypted as part of the transport layer. However, when the NTLM or Kerberos authentication method is used, the transport can be either TCP or TLS.
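
The selection rules in this section can be summarized in a short standalone model. It does not call the RTC Client API: the flag names and values below are hypothetical placeholders (the real constants come from the API headers and may differ), and the handling of Digest over TCP is an assumption, because the text covers only Basic, NTLM, and Kerberos.

```cpp
// Hypothetical flag values for illustration; the real constants are
// defined by the RTC Client API headers and may differ.
enum AuthFlag : unsigned {
    AUTH_BASIC      = 0x1,
    AUTH_DIGEST     = 0x2,
    AUTH_NTLM       = 0x4,
    AUTH_KERBEROS   = 0x8,
    AUTH_LOGON_CRED = 0x10000  // use the Windows logon credentials
};

enum class Transport { TCP, TLS };

// logoncred may not be the only flag: at least one actual
// authentication mechanism must also be present.
bool IsValidAllowedAuth(unsigned allowed) {
    const unsigned mechanisms = AUTH_BASIC | AUTH_DIGEST | AUTH_NTLM | AUTH_KERBEROS;
    return (allowed & mechanisms) != 0;
}

// Picks the most secure mechanism allowed by both sides, in the priority
// order Kerberos, NTLM, Digest, Basic. Returns 0 when there is no overlap.
unsigned SelectAuth(unsigned clientAllowed, unsigned serverOffered) {
    const unsigned priority[] = { AUTH_KERBEROS, AUTH_NTLM, AUTH_DIGEST, AUTH_BASIC };
    const unsigned common = clientAllowed & serverOffered;
    for (unsigned flag : priority) {
        if (common & flag) return flag;
    }
    return 0;
}

// Basic sends the password in plain text, so it is acceptable only over TLS.
bool IsTransportAcceptable(unsigned selectedAuth, Transport transport) {
    if (selectedAuth == AUTH_BASIC) return transport == Transport::TLS;
    // NTLM and Kerberos may use TCP or TLS. Digest is not covered by the
    // text; treating it the same way is an assumption of this sketch.
    return transport == Transport::TCP || transport == Transport::TLS;
}
```

For example, a client that allows NTLM and Kerberos talking to a server that offers NTLM and Digest ends up with NTLM, the highest-priority mechanism the two have in common.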

The client can also authenticate the identity of the SIP server it is using for real-time communications. The client must specify TLS transport in the GetProfile method in order to validate the SIP server's certificate while establishing a TLS link with the server. Authenticating the identity of the server helps protect the client from a variety of spoofing and impersonation attacks.

Signaling Privacy

The RTC client application should use a SIP server that supports TLS transport in order to encrypt the SIP signaling traffic. All the SIP signaling messages should be routed through this default server. SIP messages are not encrypted if the client is working in peer-to-peer mode.

Media Privacy

The application can choose to encrypt the multimedia traffic that flows peer to peer between the caller and the callee, to help achieve better media privacy. The RTC Client API allows the application to set default encryption levels on the IRTCClient object or on the IRTCSession object separately.

The encryption levels set on the client object are used for all sessions created after the levels are set. The encryption levels set on the session object are used on a per-session basis.

The mechanism for specifying the security levels on the client object is the IRTCClient2::put_PreferredSecurityLevel method. This method specifies both the media type (RTC_SECURITY_TYPE) and the security level (RTC_SECURITY_LEVEL). For each security type, three security levels are supported:

  • No encryption (RTCSECL_UNSUPPORTED).
  • Encrypt only if the remote application requires encryption (RTCSECL_SUPPORTED). The session is always allowed when this encryption level is specified.
  • Encryption required (RTCSECL_REQUIRED). If the remote application does not support encryption, then the session is not allowed.

If the application does not specify the security level for a particular media type, the RTC Client API will default to the RTCSECL_SUPPORTED security level.

Before an application using the RTC Client API receives notification of an incoming session, the security level of the incoming session is checked against the application's required security level. If the security levels don't match, then the RTC Client API will automatically decline the session without notifying the application. This will happen only if one party has specified the RTCSECL_REQUIRED security level and the other party has specified the RTCSECL_UNSUPPORTED security level for the session.
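
The matching rule above reduces to a small decision table, modelled here in standalone C++. The SecurityLevel enumeration is a hypothetical mirror of the three levels, not the API's RTC_SECURITY_LEVEL type, and the rule used by IsSessionEncrypted (an allowed session is encrypted only when at least one side requires it) is an interpretation of the level descriptions rather than documented behavior.

```cpp
// Hypothetical mirror of the three security levels described above.
enum class SecurityLevel { Unsupported, Supported, Required };

// A session is declined only when one side requires encryption and the
// other side does not support it.
bool IsSessionAllowed(SecurityLevel local, SecurityLevel remote) {
    const bool conflict =
        (local == SecurityLevel::Required && remote == SecurityLevel::Unsupported) ||
        (local == SecurityLevel::Unsupported && remote == SecurityLevel::Required);
    return !conflict;
}

// Assumption of this sketch: an allowed session is encrypted only when
// at least one side requires encryption.
bool IsSessionEncrypted(SecurityLevel local, SecurityLevel remote) {
    return IsSessionAllowed(local, remote) &&
           (local == SecurityLevel::Required || remote == SecurityLevel::Required);
}
```

Under this model, an application that leaves the default (supported) level in place never causes a session to be declined, because neither conflicting combination can occur on its side.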

Authorization

The RTC Client API application can control access to the presence information through authorization rules stored and enforced either by the client (if the client is acting as a PA) or by the server (if the server is acting as a PAS). These authorization rules can be created and modified by creating RTCWatcher objects using the RTC Client API. The application can choose to store these authorization rules on the server by enabling roaming for watcher objects. This way, the rules are available to users regardless of which device they log on to.

When the RTC Client API application enables roaming of watchers, the RTC Client API creates two roaming sessions with the server: one to track pending watchers that have not yet been accepted by the user (WPending Roaming), and another to notify the application of current watchers (Watcher Roaming). WPending Roaming and Watcher Roaming are both enabled when the user logs on to the server by calling the IRTCClientProvisioning2::EnableProfileEx method with the RTCRMF_WATCHER_ROAMING flag specified in the lRoamingFlags parameter.

WPending notifies client applications that new watchers want to subscribe to their presence information. The client application may allow, block, or deny the new watcher's access to the user's presence information. If the user is offline when the watcher attempts to access the user's information, the server holds the watcher's subscription until the user logs on to the server with Watcher Roaming enabled. The user then approves, denies, or blocks the watcher.

The application can also create authorization rules to control which incoming sessions it is notified about. The PrivacyMode property on the IRTCClientPresence object can be set either to block all incoming sessions from users who are in the Block list or to block incoming sessions from users who are not in the Allow list.
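
The two privacy behaviors can be illustrated with a small standalone model. The PrivacyMode enumerators, the AccessLists structure, and the function name are hypothetical and do not correspond to API types, and URIs are compared as exact strings, which is a simplification.

```cpp
#include <set>
#include <string>

// Hypothetical names that model the two behaviors described above.
enum class PrivacyMode { BlockListExcluded, AllowListOnly };

// Hypothetical container for the user's Allow and Block lists.
struct AccessLists {
    std::set<std::string> allowed;
    std::set<std::string> blocked;
};

// Returns true when the application should be notified of an incoming
// session from callerUri under the given privacy mode.
bool ShouldNotifyIncomingSession(PrivacyMode mode,
                                 const AccessLists& lists,
                                 const std::string& callerUri) {
    switch (mode) {
        case PrivacyMode::BlockListExcluded:
            // Everyone except users in the Block list gets through.
            return lists.blocked.count(callerUri) == 0;
        case PrivacyMode::AllowListOnly:
            // Only users in the Allow list get through.
            return lists.allowed.count(callerUri) != 0;
    }
    return false;
}
```

The practical difference shows up for unknown callers: they are notified in the first mode and silently blocked in the second.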

Programming with the RTC Client API

The RTC Client API COM-based object model can be accessed by C++ or Visual Basic 6.0. It can also be accessed from Visual C# and Visual Basic .NET through COM interoperability. The RTC Client API was designed to take advantage of the facilities in Windows XP and Windows Server 2003; some of its features are not available on older operating systems.

Object Model Overview

The Real-time Communications Client API consists of six main objects: IRTCClient, IRTCSession, IRTCParticipant, IRTCProfile, IRTCBuddy, and IRTCWatcher.

  • IRTCClient. This object is at the root of the RTC Client API. It determines the session types that are allowed and identifies preferred audio and video devices and media types. It is also used to create or access the other objects in the API.
  • IRTCSession. Represents a real-time session. It supports all session types, including PC-to-PC, PC-to-phone, phone-to-phone, and instant messaging. It also stores information about media types available for this particular session. Each of the other users that are part of this session is represented by a participant object.
  • IRTCParticipant. Contains information such as the participant's name, URI, and current state.
  • IRTCProfile. Provides a way to get information from a profile.
  • IRTCBuddy. Provides a way to get and put information about other registered users on the SIP server.
  • IRTCWatcher. Provides a way to get and put information about how others can see presence information regarding the current client.

Initializing the RTC Client

Initializing the RTC Client object is merely a matter of calling CoCreateInstance with the appropriate parameters to create the RTCClient object, and then calling the Initialize method to properly initialize it. Before creating the RTCClient object, you should ensure that the application is properly initialized for apartment threading, including the necessary Windows message pump. This is illustrated in the following code:

    // Create the RTC client
    HRESULT hr;

    hr = CoCreateInstance(
        __uuidof(RTCClient),
        NULL,
        CLSCTX_INPROC_SERVER,
        __uuidof(IRTCClient2),
        (LPVOID *)&m_pClient
        );

    if (FAILED(hr))
    {
        // CoCreateInstance failed
        DEBUG_PRINT(("CoCreateInstance failed %x", hr ));
        ShowMessageBox(L"RTC Client v1.2 or higher required!");
        return -1;
    }

    // Initialize the RTC client
    hr = m_pClient->Initialize();

    if (FAILED(hr))
    {
        // Initialize failed
        DEBUG_PRINT(("Initialize failed %x", hr ));
        SAFE_RELEASE(m_pClient);
        return -1;
    }

If you wish to disable media streams or Universal Plug and Play (UPnP) on startup, you can call the InitializeEx method with the parameter RTCIF_DISABLE_MEDIA or RTCIF_DISABLE_UPNP.

Note that there are two main interfaces for the client object: IRTCClient and IRTCClient2. IRTCClient2 inherits all of the original properties and methods from IRTCClient and extends the interface with the new properties and methods introduced in later versions of the RTC Client API.

After the RTCClient object has been initialized, the application can listen for incoming session requests from other peer computers. For example, calling put_AllowedPorts specifies which ports will be used for incoming sessions. Then calling put_AnswerMode determines which types of sessions will be accepted.

    //Open TCP port 5060 for incoming session
    hr = pClient->put_AllowedPorts(RTCTR_TCP, RTCLM_BOTH);
    if(FAILED(hr))
    {
        DEBUG_PRINT(("put_AllowedPorts(RTCTR_TCP, RTCLM_BOTH) failed to open TCP ports, hr=0x%x", hr));
        return hr;
    }
    DEBUG_PRINT(("put_AllowedPorts(RTCTR_TCP, RTCLM_BOTH) succeeded!\n"));

    hr = pClient->put_AnswerMode(RTCST_MULTIPARTY_IM, RTCAM_OFFER_SESSION_EVENT);
    if(FAILED(hr))
    {
        DEBUG_PRINT(("put_AnswerMode(RTCST_MULTIPARTY_IM) failed:%x", hr));
        return hr;
    }

    hr = pClient->put_AnswerMode(RTCST_PC_TO_PC, RTCAM_OFFER_SESSION_EVENT);
    if(FAILED(hr))
    {
        DEBUG_PRINT(("put_AnswerMode(RTCST_PC_TO_PC) failed:%x", hr));
        return hr;
    }

Listening on RTC Events

Once the RTCClient object has been created, the EventFilter property should be initialized with the events that should be handled. The following code fragment initializes the event filter with a subset of the available events. If all events will be handled, the constant RTCEF_ALL should be specified. However, it is probably better to determine which events the application really needs and explicitly identify them here so that the RTC Client API can ignore the rest. The complete list of events and their meaning can be found in the RTC Client API SDK Help files.

    // Determine the event filter
    long lFlags = RTCEF_REGISTRATION_STATE_CHANGE |
                  RTCEF_SESSION_STATE_CHANGE |
                  RTCEF_PARTICIPANT_STATE_CHANGE |
                  RTCEF_MESSAGING |
                  RTCEF_MEDIA |
                  RTCEF_INTENSITY |
                  RTCEF_CLIENT |
                  RTCEF_BUDDY |
                  RTCEF_BUDDY2 |
                  RTCEF_WATCHER |
                  RTCEF_WATCHER2 |
                  RTCEF_GROUP |
                  RTCEF_USERSEARCH |
                  RTCEF_ROAMING |
                  RTCEF_PROFILE |
                  RTCEF_PRESENCE_PROPERTY |
                  RTCEF_PRESENCE_DATA | 
                  RTCEF_PRESENCE_STATUS;

    // Set the event filter for the RTC client
    hr = m_pClient->put_EventFilter(lFlags);

    if ( FAILED(hr) )
    {
        // put_EventFilter failed
        DEBUG_PRINT(("put_EventFilter failed %x", hr ));
        SAFE_RELEASE(m_pClient);
        return -1;
    }

Once the event filter has been specified, the event handler needs to be registered. This is accomplished by creating a new event sink using the CRTCEvents class, as shown in the following code fragment:

    // Create the event sink object
    m_pEvents = new CRTCEvents;

    if (!m_pEvents)
    {
        // Out of memory
        DEBUG_PRINT(("Out of memory"));
        SAFE_RELEASE(m_pClient);
        return -1;
    }

    // Advise for events from the RTC client
    hr = m_pEvents->Advise( m_pClient, m_hWnd );

    if ( FAILED(hr) )
    {
        // Advise failed
        DEBUG_PRINT(("Advise failed %x", hr ));
        SAFE_RELEASE(m_pClient);
        return -1;
    }

Each event specified in the event filter has its own case clause that contains the specific code to determine the proper interface and then passes control on to the event handler for processing. The actual events are handled with code similar to the following fragment:

LRESULT CRTCWin::OnRTCEvent(UINT uMsg, WPARAM wParam, LPARAM lParam)
{
    IDispatch * pDisp = (IDispatch *)lParam;
    RTC_EVENT enEvent = (RTC_EVENT)wParam;
    HRESULT hr;

    // Based on the RTC_EVENT type, query for the 
    // appropriate event interface and call a helper
    // method to handle the event

    switch ( enEvent )
    {
        case RTCE_REGISTRATION_STATE_CHANGE:
            {
                IRTCRegistrationStateChangeEvent * pEvent = NULL;
                
                hr = pDisp->QueryInterface( __uuidof(IRTCRegistrationStateChangeEvent),
                                            (void **)&pEvent );

                if (SUCCEEDED(hr))
                {
                    OnRTCRegistrationStateChangeEvent(pEvent);
                    SAFE_RELEASE(pEvent);
                }              
            }
            break;

//
// multiple case clauses omitted
//

    }

    // Release the event
    SAFE_RELEASE(pDisp);

    return 0;
}

Creating a Profile

A profile must be supplied in order to log on to the SIP server. This profile is an XML-formatted document containing the location of the SIP server and the desired transport protocol. It contains all of the information necessary to log on to the SIP server, as well as the user's URI, account, and password. The following code fragment begins by querying the RTCClient object to return a pointer to the IRTCClientProvisioning2 interface:

    // Get the RTC client provisioning interface
    IRTCClientProvisioning2 * pProv = NULL;

    hr = m_pClient->QueryInterface(
            __uuidof(IRTCClientProvisioning2),
            (void **)&pProv);

    if (FAILED(hr))
    {
        // QueryInterface failed
        DEBUG_PRINT(("QueryInterface failed %x", hr ));
        return hr;
    }

    // Get the profile
    hr = pProv->GetProfile(
            NULL,           // bstrUserAccount
            NULL,           // bstrUserPassword
            bstrURI,        // bstrUserURI
            bstrServer,     // bstrServer
            lTransport,     // lTransport
            0               // lCookie
            );

    SAFE_RELEASE(pProv);

    if (FAILED(hr))
    {
        // GetProfile failed
        DEBUG_PRINT(("GetProfile failed %x", hr ));
        return hr;    
    }

The GetProfile method generates the XML profile document based on the supplied parameters. All of the parameters except the user's URI may be omitted from the call to GetProfile. If only the URI is specified, the server is located by using its domain information, and the RTC client attempts to determine the proper transport protocol by checking the DNS SRV entries for the server. It checks TLS first, followed by TCP, and then User Datagram Protocol (UDP), and uses the first protocol found. When the user's account and password are omitted in the call to GetProfile, the Windows NT logon credentials can be used for authentication. However, the application must request this by calling IRTCProfile2::put_AllowedAuth(long lAllowedAuth) with RTCAU_USE_LOGON_CRED combined with at least one other authentication type, such as NTLM or Kerberos.
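
The transport discovery order (TLS, then TCP, then UDP) can be sketched as a simple first-match search. This is a standalone model: the function name is hypothetical, and the set of advertised transports stands in for the result of the DNS SRV lookups, which the sketch does not perform.

```cpp
#include <set>
#include <string>

// Returns the first transport, in the order TLS, TCP, UDP, that appears in
// `advertised`, or an empty string when none is found.
// `advertised` stands in for the result of the DNS SRV lookups; performing
// the lookups themselves is outside the scope of this sketch.
std::string ChooseTransport(const std::set<std::string>& advertised) {
    const char* const order[] = { "TLS", "TCP", "UDP" };
    for (const char* candidate : order) {
        if (advertised.count(candidate) != 0) return candidate;
    }
    return "";
}
```

A server that advertises both TCP and UDP is therefore contacted over TCP, and TLS is used whenever it is advertised at all.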

If the supplied credentials fail, either because they were omitted or were incorrect, a registration state change event will be fired with a registration state of REJECTED/ERROR. Users can then be prompted for their credentials, and the SetCredentials method can be used to update the profile, as in the following code fragment:

    // Update the credentials in the profile
    hr = pProfile->SetCredentials(bstrURI, bstrAccount, bstrPassword);

    if (FAILED(hr))
    {
        // SetCredentials failed
        DEBUG_PRINT(("SetCredentials failed %x", hr ));
        return hr;
    }

The registration process can then be repeated. Assuming that the user supplied the proper account and password, they will be logged on to the SIP server.

Enabling a Profile for Roaming

Roaming is enabled when information about a user's contacts, contact groups, and watchers is stored on the server. This means that the user is free to roam from one computer to another and still have access to this information.

For the most part, the RTC Client API is independent of the user's roaming status. However, when a profile is used to log on to the SIP Server, the application can specify that the profile be stored on the server or kept locally through roaming flags.

The profile is enabled by calling the EnableProfileEx method. This method specifies the profile to be enabled (pProfile), how the client will be registered on the server (lRegisterFlags), and which objects will have roaming enabled (lRoamingFlags). The following code fragment illustrates a call to EnableProfileEx:

    // Enable the RTC profile object
    hr = pProv->EnableProfileEx(pProfile, lRegisterFlags, lRoamingFlags);

    SAFE_RELEASE(pProv);

    if (FAILED(hr))
    {
        // EnableProfileEx failed
        DEBUG_PRINT(("EnableProfileEx failed %x", hr ));
        return hr;    
    }

If lRoamingFlags is zero, then roaming is disabled. Otherwise, roaming can be enabled in any combination for:

  • Contacts (RTCRMF_BUDDY_ROAMING)
  • Presence information (RTCRMF_PRESENCE_ROAMING)
  • Profile (RTCRMF_PROFILE_ROAMING)
  • Watchers (RTCRMF_WATCHER_ROAMING)

Specifying RTCRMF_ALL_ROAMING for lRoamingFlags enables all of the roaming options.
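
Because the roaming options are bit flags, they can be combined with a bitwise OR and tested individually. The sketch below models that behavior in standalone C++. The ROAM_ names and their numeric values are hypothetical placeholders chosen for illustration; real code should use the RTCRMF_ constants from the RTC Client API headers.

```cpp
#include <string>
#include <vector>

// Hypothetical values for illustration; use the constants from the
// RTC Client API headers in real code.
enum RoamingFlag : long {
    ROAM_BUDDY    = 0x1,
    ROAM_WATCHER  = 0x2,
    ROAM_PRESENCE = 0x4,
    ROAM_PROFILE  = 0x8,
    ROAM_ALL      = 0xF
};

// Lists which kinds of data have roaming enabled for a given flag value.
std::vector<std::string> DescribeRoaming(long flags) {
    std::vector<std::string> enabled;
    if (flags & ROAM_BUDDY)    enabled.push_back("contacts");
    if (flags & ROAM_WATCHER)  enabled.push_back("watchers");
    if (flags & ROAM_PRESENCE) enabled.push_back("presence");
    if (flags & ROAM_PROFILE)  enabled.push_back("profile");
    return enabled;
}
```

Passing zero yields an empty list (roaming disabled), and passing the combined value yields all four entries.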

Setting Presence

Before the profile can be registered with the SIP server, the presence information must be set. This involves using the RTCClient object to create an IRTCClientPresence2 object. Then a call to EnablePresenceEx activates the profile contained in pProfile. The varStorage variable contains the name of a local file that will hold the presence information. This is shown in the following code fragment:

    // Enable presence
    hr = pPresence->EnablePresenceEx(pProfile, varStorage, 0);    
    VariantClear(&varStorage);

    if (FAILED(hr))
    {
        // EnablePresenceEx failed
        DEBUG_PRINT(("EnablePresenceEx failed %x", hr ));
        SAFE_RELEASE(pPresence);
        return hr;
    }

Starting a Session

Once the user has been registered with the SIP server, they are able to establish a session with another user. This begins by creating a new IRTCSession object using the CreateSession method from RTCClient. This method allows you to specify the type of session (enType), the TEL URI for phone-to-phone sessions or NULL for all other session types, the profile that should be used or NULL to instruct the API to determine the profile, and option flags. The following code fragment shows how to create a new session object.

    IRTCSession * pSession = NULL;

    // Create a PC-to-PC session
    RTC_SESSION_TYPE enType = RTCST_PC_TO_PC;  

    hr = m_pClient->CreateSession(
        enType,
        NULL,
        NULL,
        0,
        &pSession
        );

    if (FAILED(hr))
    {
        // CreateSession failed
        DEBUG_PRINT(("CreateSession failed %x", hr ));
        return hr;
    }

    // Add the participant to the session
    hr = pSession->AddParticipant(
        bstrURI,
        bstrName,
        NULL
        );

    if (FAILED(hr))
    {
        // AddParticipant failed
        DEBUG_PRINT(("AddParticipant failed %x", hr ));
        SAFE_RELEASE(pSession);
        return hr;
    }

Once the session object is created, the AddParticipant method is used to add another client to the current session. This bstrURI value contains the location of the client and can be a SIP or TEL URI, an e-mail address, an IP address, or a DNS name. The bstrName value contains the displayable name of the client. Adding a participant to an idle session will initiate a call to the participant, who can accept or reject it.

Securing a Session

In order for a session to use encryption, the application must specify the preferred security level before the session is established. This value is set on the IRTCSession2 object by using the put_PreferredSecurityLevel method, which takes two parameters: enSecurityType and enSecurityLevel.

The enSecurityType parameter can be either RTCSECT_AUDIO_VIDEO_MEDIA_ENCRYPTION, which secures audio and video transmissions, or RTCSECT_T120_MEDIA_ENCRYPTION, which encrypts the transmissions associated with application sharing.

The enSecurityLevel parameter can be RTCSECL_UNSUPPORTED if session security is not supported, RTCSECL_SUPPORTED if session security is optional for a session, or RTCSECL_REQUIRED if all sessions must be encrypted.

The following code fragment shows how to enable security for audio and video sessions:

    hr = m_pSession->put_PreferredSecurityLevel(RTCSECT_AUDIO_VIDEO_MEDIA_ENCRYPTION, RTCSECL_REQUIRED); 

    if ( FAILED(hr) )
    {
        // put_PreferredSecurityLevel failed
        DEBUG_PRINT(("put_PreferredSecurityLevel failed %x", hr ));
        SAFE_RELEASE(m_pSession);
        return -1;
    }

Answering a Call

When an incoming call arrives, the RTCSessionStateChangeEvent is fired. Because this event can be fired for several different reasons, get_State is called to determine the current session state. Then get_Session is called to get the session object associated with this event, as illustrated in the following code fragment:

    hr = pEvent->get_State(&enState);

    if (FAILED(hr))
    {
        // get_State failed
        return;
    }

    hr = pEvent->get_Session(&pSession);

    if (FAILED(hr))
    {
        // get_Session failed
        return;
    }

The next code fragment shows how to respond to an incoming call. If the session state is RTCSS_INCOMING, then the event was triggered by an incoming call. The type of session requested can be determined by using get_Type. Because only one audio/video (AV) call is permitted at a time, the Terminate method is called with RTCTR_BUSY when another AV call is already in progress, to indicate that the client is busy and cannot accept this call.

    if (enState == RTCSS_INCOMING)
    {
        // This is a new session
        RTC_SESSION_TYPE enType;

        hr = pSession->get_Type(&enType);

        if (FAILED(hr))
        {
            // get_Type failed
            SAFE_RELEASE(pSession);
            return;
        }            

        if (enType == RTCST_PC_TO_PC || enType == RTCST_PC_TO_PHONE)
        {
            // This is an AV call
            if (CRTCAVSession::m_Singleton != NULL)
            {
                // If another AV call is in progress, then
                // we are already busy.
                pSession->Terminate(RTCTR_BUSY);

                SAFE_RELEASE(pSession);
                return;
            }
        }
    }
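
The decision made by the preceding fragment can be isolated into a small function that is easy to test without a live session. This is a standalone model, not API code: the SessionType and IncomingAction enumerations and both function names are hypothetical.

```cpp
// Hypothetical mirror of the session types checked in the fragment above.
enum class SessionType { PcToPc, PcToPhone, InstantMessage };

// What the application should do with a new incoming session.
enum class IncomingAction { OfferToUser, RejectBusy };

// PC-to-PC and PC-to-phone sessions carry audio/video.
bool IsAudioVideoSession(SessionType type) {
    return type == SessionType::PcToPc || type == SessionType::PcToPhone;
}

// Only one AV call is permitted at a time: a second AV call is rejected
// as busy, while any other session is offered to the user.
IncomingAction DecideIncoming(SessionType type, bool avCallInProgress) {
    if (IsAudioVideoSession(type) && avCallInProgress) {
        return IncomingAction::RejectBusy;
    }
    return IncomingAction::OfferToUser;
}
```

Note that an instant messaging session is still offered to the user while an AV call is in progress; only a second AV call is turned away.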

To find the individual that placed the call, the EnumerateParticipants method is used. The get_UserURI and get_Name methods can be used to gather information about this person. This information can then be displayed, and the user can determine whether to accept the call. The following code fragment illustrates the call to EnumerateParticipants, as well as to get_UserURI and get_Name:

    // Get the participant object
    IRTCEnumParticipants * pEnum = NULL;
    IRTCParticipant * pParticipant = NULL;

    hr = pSession->EnumerateParticipants(&pEnum);

    if (FAILED(hr))
    {
        // EnumerateParticipants failed
        SAFE_RELEASE(pSession);
        return;
    }

    hr = pEnum->Next(1, &pParticipant, NULL);

    SAFE_RELEASE(pEnum);

    if (hr != S_OK)
    {
        // Next failed
        SAFE_RELEASE(pSession);
        return;
    }

    // Get the participant URI
    BSTR bstrURI = NULL;

    hr = pParticipant->get_UserURI(&bstrURI);

    if (FAILED(hr))
    {
        // get_UserURI failed
        SAFE_RELEASE(pSession);
        SAFE_RELEASE(pParticipant);
        return;
    }

    // Get the participant name
    BSTR bstrName = NULL;

    hr = pParticipant->get_Name(&bstrName);

    SAFE_RELEASE(pParticipant);

    if (FAILED(hr) && (hr != RTC_E_NOT_EXIST))
    {
        // get_Name failed
        SAFE_FREE_STRING(bstrURI);
        SAFE_RELEASE(pSession);
        return;
    }

If the user chooses to accept the call, the Answer method is used. If the user chooses to reject the call, the Terminate method is used with the RTCTR_REJECT value. Calls to the Answer and Terminate methods are shown in the following code fragment:

    if (fAccept)
    {
        // Accept the session
        hr = pSession->Answer();

        if (FAILED(hr))
        {
            // Answer failed
            SAFE_RELEASE(pSession);
            return;
        }
    }
    else
    {
        // Reject the session
        pSession->Terminate(RTCTR_REJECT);
        SAFE_RELEASE(pSession);
        return;
    }

Setting Presence Status

A real-time client is either online or offline. However, the RTC Client API allows eight different levels of presence information. The extra six levels indicate that the user is online but may not immediately respond to a message.

Presence is set by using the IRTCClientPresence interface and its SetLocalPresenceInfo method. This method takes two parameters. The first, enStatus, can be one of the following enumerated values:

  • RTCXS_PRESENCE_OFFLINE
  • RTCXS_PRESENCE_ONLINE
  • RTCXS_PRESENCE_AWAY
  • RTCXS_PRESENCE_IDLE
  • RTCXS_PRESENCE_BUSY
  • RTCXS_PRESENCE_BE_RIGHT_BACK
  • RTCXS_PRESENCE_ON_THE_PHONE
  • RTCXS_PRESENCE_OUT_TO_LUNCH

The second parameter, bstrNotes, contains a text message that will be returned to anyone watching this user. The following code illustrates a call to SetLocalPresenceInfo:

    IRTCClientPresence * pPresence = NULL;

    hr = m_pClient->QueryInterface(
            __uuidof(IRTCClientPresence),
            (void **)&pPresence);

    if (FAILED(hr))
    {
        // QueryInterface failed
        DEBUG_PRINT(("QueryInterface failed %x", hr ));
        return hr;
    }

    // Set the local presence status
    hr = pPresence->SetLocalPresenceInfo(enStatus, bstrNotes);

    SAFE_RELEASE(pPresence);
    SAFE_FREE_STRING(bstrNotes);

    if (FAILED(hr))
    {
        // SetLocalPresenceInfo failed
        DEBUG_PRINT(("SetLocalPresenceInfo failed %x", hr ));
        return hr;
    }

The presence information set by the SetLocalPresenceInfo method can be retrieved by a remote client using the get_Status and get_Notes properties in IRTCBuddy. For example:

    // Get the contact's status
    RTC_PRESENCE_STATUS enStatus;

    hr = pBuddy->get_Status(&enStatus);

    if (FAILED(hr))
    {
        // get_Status failed
        DEBUG_PRINT(("get_Status failed %x", hr));
    }
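
Once a contact's status has been retrieved, an application typically maps it to display text and decides whether the contact counts as online. The following standalone sketch models that mapping. The PresenceStatus enumeration is a hypothetical mirror of the eight values listed earlier, not the API's own type, and the label strings are arbitrary choices.

```cpp
#include <string>

// Hypothetical mirror of the eight presence values listed earlier.
enum class PresenceStatus {
    Offline, Online, Away, Idle, Busy, BeRightBack, OnThePhone, OutToLunch
};

// Every status other than Offline means the user is online, although the
// user may not respond immediately.
bool IsOnline(PresenceStatus status) {
    return status != PresenceStatus::Offline;
}

// Maps a status to a label suitable for a contact list.
std::string PresenceLabel(PresenceStatus status) {
    switch (status) {
        case PresenceStatus::Offline:     return "Offline";
        case PresenceStatus::Online:      return "Online";
        case PresenceStatus::Away:        return "Away";
        case PresenceStatus::Idle:        return "Idle";
        case PresenceStatus::Busy:        return "Busy";
        case PresenceStatus::BeRightBack: return "Be Right Back";
        case PresenceStatus::OnThePhone:  return "On the Phone";
        case PresenceStatus::OutToLunch:  return "Out to Lunch";
    }
    return "Unknown";
}
```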

Instant Messaging Sessions

An IM session is started by creating a new session object and specifying a session type of IM. Once the call is accepted, the code to actually drive the process revolves around the developer-supplied user interface.

The code to handle the IM session is divided into two parts. The first part handles messages originated by the local user, while the second part handles messages received from the remote user.

Sending a message to the remote user involves accepting the message from the user interface and then using the SendMessage method from the Session object to transmit the message. Depending on how the user interface is constructed, the message may need to be echoed. The following code fragment illustrates a call to SendMessage:

    // Send the outgoing message
    hr = m_pSession->SendMessage(NULL, bstrMessage, 0);

    SAFE_FREE_STRING(bstrMessage);

    if (FAILED(hr))
    {
        // SendMessage failed
        return hr;
    }

Messages are received through the IRTCMessagingEvent interface. The following code fragment assumes that pEvent contains a pointer to the event object and that pParticipant already contains the participant object associated with the message sender. The get_MessageHeader and get_Message methods retrieve the message's content type and body from the event. These values, along with pParticipant, are passed to a user interface routine that updates the display.

    hr = pEvent->get_MessageHeader(&bstrContentType);

    if (FAILED(hr))
    {
        // get_MessageHeader failed
        SAFE_RELEASE(pParticipant);
        return;
    }

    hr = pEvent->get_Message(&bstrMessage);

    if (FAILED(hr))
    {
        // get_Message failed
        SAFE_RELEASE(pParticipant);
        SAFE_FREE_STRING(bstrContentType);
        return;
    }

    // Deliver the message to the session window
    pSessWindow->DeliverMessage(pParticipant, bstrContentType, bstrMessage);

    SAFE_FREE_STRING(bstrContentType);
    SAFE_FREE_STRING(bstrMessage);

Multimedia Sessions

Multimedia sessions are somewhat more complex to implement than IM sessions, but the power of transmitting audio and video can really enhance the entire real-time experience. The user interface for this process requires only a place to display the video and adjust the speaker and microphone levels.

The session is initiated by placing a call to another client, while specifying PC-to-PC as the session type. When the call is accepted by the other client, an IRTCSessionStateChangeEvent2 event will occur. The audio and video information are handled independently of each other. The code to handle the audio stream is somewhat more complex than the video stream, because the user can control the volume of the microphone and speaker.

While processing the event, the application should use get_MediaCapabilities to determine if the video should be displayed. Then get_IVideoWindow can be used to get the IVideoWindow associated with the device. If the video is to be displayed, then the appropriate properties are set in the IVideoWindow object and the video stream is made visible. This is illustrated in the following code:

    // Get the media capabilities
    hr = pClient->get_MediaCapabilities(&lMediaCaps);

    if (FAILED(hr))
    {
        // get_MediaCapabilities failed
        SAFE_RELEASE(pClient);
        return hr;
    }

    hr = pClient->get_IVideoWindow(
        enDevice, &pVid);

    SAFE_RELEASE(pClient);

    if (FAILED(hr))
    {
        // get_IVideoWindow failed
        return hr;
    }

    // Determine whether to show the receive video        
    fShow = fShow && (lMediaCaps & RTCMT_VIDEO_RECEIVE);
    m_fShowRecv = fShow;
    hWnd = m_hRecvVideoParent;

    if (fShow)
    {
        // Set the video window style
        pVid->put_WindowStyle(WS_CHILD |
                              WS_CLIPCHILDREN |
                              WS_CLIPSIBLINGS);
        
        // Set the parent window for the video window
        pVid->put_Owner((OAHWND)hWnd);

        RECT rc;
        GetClientRect(hWnd, &rc);  
        
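        // Position the video window. SetWindowPosition takes
        // left, top, width, and height rather than a rectangle.
        pVid->SetWindowPosition(
            rc.left,
            rc.top,
            rc.right - rc.left,
            rc.bottom - rc.top
            );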

        // Make the video window visible (OATRUE is defined as -1)
        pVid->put_Visible(OATRUE);
    }       

    SAFE_RELEASE(pVid);  
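
The decision and placement logic in the preceding fragment can be separated from the COM calls. The following is a minimal sketch under stated assumptions: ClientRect, VideoPlacement, kCapVideoReceive, and PlanReceiveVideo are hypothetical stand-ins that are not part of the RTC Client API, and the flag value is a placeholder rather than the real RTCMT_VIDEO_RECEIVE constant. It shows the capability test as a bit-mask check and the conversion of a client rectangle into left, top, width, and height values.

```cpp
// Hypothetical stand-in for the Win32 RECT structure.
struct ClientRect { long left, top, right, bottom; };

// Left, top, width, and height of the video window inside its parent.
struct VideoPlacement { long left, top, width, height; };

// Placeholder capability bit; the article's code tests RTCMT_VIDEO_RECEIVE.
constexpr long kCapVideoReceive = 0x8;

// Returns true and fills *out only when the user wants video AND the
// capability mask reports that video can be received.
bool PlanReceiveVideo(bool userWantsVideo,
                      long mediaCaps,
                      const ClientRect& parent,
                      VideoPlacement* out)
{
    const bool canReceive = (mediaCaps & kCapVideoReceive) != 0;
    if (!userWantsVideo || !canReceive || out == nullptr)
    {
        return false;
    }

    // Width and height are differences, not the right and bottom edges.
    out->left   = parent.left;
    out->top    = parent.top;
    out->width  = parent.right - parent.left;
    out->height = parent.bottom - parent.top;
    return true;
}
```

Keeping this logic in a plain function makes it possible to check the show or hide decision without a live session or a video device.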

Managing the Contact List

While it is possible to call another peer by specifying its URI, using contacts is a better alternative. The list of contacts can be retrieved by querying the RTCClient object for the IRTCClientPresence interface (or for IRTCClientPresence2, which inherits from it). From this interface, the contacts can be enumerated by calling the EnumerateBuddies method. For example:

    // Get the RTC client presence interface
    IRTCClientPresence * pPresence = NULL;

    hr = m_pClient->QueryInterface(
            __uuidof(IRTCClientPresence),
            (void **)&pPresence);

    if (FAILED(hr))
    {
        // QueryInterface failed
        DEBUG_PRINT(("QueryInterface failed %x", hr ));
        return hr;
    }

    // Enumerate buddies and populate list
    IRTCEnumBuddies * pEnum = NULL;
    IRTCBuddy * pBuddy = NULL;

    hr = pPresence->EnumerateBuddies(&pEnum);

    SAFE_RELEASE(pPresence);

    if (FAILED(hr))
    {
        // Enumerate buddies failed
        DEBUG_PRINT(("EnumerateBuddies failed %x", hr ));
        return hr;
    }

    while (pEnum->Next(1, &pBuddy, NULL) == S_OK)
    {
        // Update the contact list entry
        UpdateContactList(pBuddy);

        SAFE_RELEASE(pBuddy);
    }

    SAFE_RELEASE(pEnum);

The information can then be displayed with an application-specific routine, called UpdateContactList in this example. Once a particular contact has been found or selected, the desired information can be extracted from the appropriate property. In the following fragment, an empty name is treated as an error, though this may not be appropriate for all applications.

   hr = pBuddy->get_Name(&bstrName);

   if (SUCCEEDED(hr) && SysStringLen(bstrName) == 0)
   {
       // Treat an empty string as a failure
       SAFE_FREE_STRING(bstrName);
       hr = E_FAIL;
   }
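
The enumeration loop shown earlier pairs every successful Next call with a SAFE_RELEASE. One way to make that pairing automatic is a small owning pointer that releases its reference when it goes out of scope. The following is a minimal sketch, assuming hypothetical stand-in types (FakeBuddy, FakeEnum, ScopedRef, CollectNames) that only imitate COM reference counting; none of them belong to the RTC Client API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a COM object: it counts references the way
// AddRef and Release do, but it is not a real COM interface.
class FakeBuddy
{
public:
    explicit FakeBuddy(const std::string& name) : m_name(name), m_refs(0) {}
    unsigned long AddRef()  { return ++m_refs; }
    unsigned long Release() { return --m_refs; }
    unsigned long RefCount() const { return m_refs; }
    const std::string& Name() const { return m_name; }
private:
    std::string   m_name;
    unsigned long m_refs;
};

// Hypothetical stand-in for an enumerator. Next() hands out one
// AddRef'ed pointer per call and returns false when the list is exhausted.
class FakeEnum
{
public:
    explicit FakeEnum(std::vector<FakeBuddy>& items) : m_items(items), m_pos(0) {}
    bool Next(FakeBuddy** pp)
    {
        if (m_pos >= m_items.size()) { *pp = nullptr; return false; }
        *pp = &m_items[m_pos++];
        (*pp)->AddRef();
        return true;
    }
private:
    std::vector<FakeBuddy>& m_items;
    std::size_t             m_pos;
};

// Minimal owning pointer: releases its reference when it leaves scope.
template <typename T>
class ScopedRef
{
public:
    ScopedRef() : m_p(nullptr) {}
    ~ScopedRef() { Reset(); }
    ScopedRef(const ScopedRef&) = delete;
    ScopedRef& operator=(const ScopedRef&) = delete;

    // Releases any held reference, then exposes the slot as an out-parameter.
    T** Receive() { Reset(); return &m_p; }
    T* operator->() const { return m_p; }
private:
    void Reset() { if (m_p) { m_p->Release(); m_p = nullptr; } }
    T* m_p;
};

// Collects display names; each reference is released automatically,
// either by the next Receive() call or by the destructor.
std::vector<std::string> CollectNames(FakeEnum& e)
{
    std::vector<std::string> names;
    ScopedRef<FakeBuddy> buddy;
    while (e.Next(buddy.Receive()))
    {
        names.push_back(buddy->Name());
    }
    return names;
}
```

In production code, an existing smart pointer such as ATL's CComPtr serves the same purpose, so a hand-written class like ScopedRef is rarely necessary.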

Adding a contact to the list is just as easy. Use the AddBuddyEx method from the IRTCClientPresence2 interface and supply the contact's URI (bstrURI), display name (bstrName), and any optional data. If you specify VARIANT_TRUE for fPersistent, the contact information will be saved in persistent storage; otherwise, it will last only for the duration of the current session. The RTCBT_SUBSCRIBED value indicates that the user is allowed to subscribe to the contact's presence information. The following code, for example, adds a contact to the list and saves the information in persistent storage:

    // Add the contact
    IRTCBuddy2 * pBuddy = NULL;

    hr = pPresence->AddBuddyEx(
            bstrURI,
            bstrName,
            NULL,
            VARIANT_TRUE,
            NULL,
            RTCBT_SUBSCRIBED,
            &pBuddy
            );

    SAFE_RELEASE(pPresence);

Changes in the contact list are reported through an event, which is fired whenever a contact changes. Information about the event is available through the IRTCBuddyEvent2 interface. The changes reported include adding, changing, or deleting a contact. The event is also fired if the contact's presence state has changed or if the contact has roamed.

Within the event handler, the type of event can be determined through the get_EventType property. The get_StatusCode property returns the status code for the event, which indicates whether the underlying operation succeeded, while the get_Buddy property returns the contact object associated with the change. This information can be used to determine how the user interface will be updated. For example:

    hr = pEvent->get_EventType(&enType);

    if (FAILED(hr))
    {
        // get_EventType failed
        DEBUG_PRINT(("get_EventType failed %x", hr ));
        return;
    }

    // Get the status
    hr = pEvent->get_StatusCode(&lStatus);

    if (FAILED(hr))
    {
        // get_StatusCode failed
        DEBUG_PRINT(("get_StatusCode failed %x", hr ));
        return;
    }

    // Get the IRTCBuddy object
    IRTCBuddy * pBuddy = NULL;

    hr = pEvent->get_Buddy(&pBuddy);

    if (FAILED(hr))
    {
        // get_Buddy failed
        DEBUG_PRINT(("get_Buddy failed %x", hr ));
        return;
    }

    switch (enType)
    {
    case RTCBET_BUDDY_ADD:
        {
            DEBUG_PRINT(("RTCBET_BUDDY_ADD [%p] %x", pBuddy, lStatus ));  

            if (SUCCEEDED(lStatus))
            {
                // Update the contact list entry
                UpdateBuddyList(pBuddy);
            }
            else
            {
                // Delete the contact from the list
                ClearBuddyList(pBuddy);
            }
        }
        break;

//
// multiple case clauses omitted
//

    }

    SAFE_RELEASE(pBuddy);
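
The handler above shows only the add case. The following is a minimal sketch of how the remaining cases might update a contact view, with the caveat that everything in it is an assumption: BuddyEventKind, ContactView, and ApplyBuddyEvent are hypothetical names, and the handling of the update, state-change, and remove cases is one plausible policy rather than the behavior of the omitted case clauses.

```cpp
#include <map>
#include <string>

// Hypothetical event kinds; the article's code switches on values such as
// RTCBET_BUDDY_ADD instead.
enum class BuddyEventKind { Add, Remove, Update, StateChange };

// Hypothetical model of the list shown in the UI: contact URI -> display text.
typedef std::map<std::string, std::string> ContactView;

// Applies one event to the view. As in the handler above, a negative status
// code (a failed HRESULT) means the operation did not succeed.
void ApplyBuddyEvent(ContactView& view,
                     BuddyEventKind kind,
                     long statusCode,
                     const std::string& uri,
                     const std::string& displayText)
{
    const bool succeeded = (statusCode >= 0);

    switch (kind)
    {
    case BuddyEventKind::Add:
        if (succeeded)
            view[uri] = displayText;   // show the new contact
        else
            view.erase(uri);           // the add failed: drop the entry
        break;

    case BuddyEventKind::Update:
    case BuddyEventKind::StateChange:
        if (succeeded)
            view[uri] = displayText;   // refresh the existing entry
        break;

    case BuddyEventKind::Remove:
        if (succeeded)
            view.erase(uri);           // the contact is gone
        break;
    }
}
```

Separating the view update from the COM event plumbing keeps the event handler short and lets the update policy be exercised without a running client.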

Managing Contact Groups

Contact groups allow the user to organize contacts into collections. The user can have up to 32 groups. The following code fragment shows how to use the IRTCClientPresence2 interface to retrieve the list of contact groups. After obtaining the IRTCClientPresence2 interface from the client object, the code calls the EnumerateGroups method to iterate through the list of groups for display by the user interface.

    // Get the RTC client presence interface
    IRTCClientPresence2 * pPresence = NULL;

    hr = m_pClient->QueryInterface(
            __uuidof(IRTCClientPresence2),
            (void **)&pPresence);

    if (FAILED(hr))
    {
        // QueryInterface failed
        DEBUG_PRINT(("QueryInterface failed %x", hr ));
        return hr;
    }

    // Enumerate groups and populate list
    IRTCEnumGroups * pEnum = NULL;
    IRTCBuddyGroup * pGroup = NULL;

    hr = pPresence->EnumerateGroups(&pEnum);

    SAFE_RELEASE(pPresence);

    if (FAILED(hr))
    {
        // Enumerate groups failed
        DEBUG_PRINT(("EnumerateGroups failed %x", hr ));
        return hr;
    }

    while (pEnum->Next(1, &pGroup, NULL) == S_OK)
    {
        // Update the group list entry
        UpdateGroupList(pGroup);

        SAFE_RELEASE(pGroup);
    }

    SAFE_RELEASE(pEnum);

Groups can be added using the AddGroup method in the IRTCClientPresence2 interface. Likewise, a group can be removed from the user's profile by using the RemoveGroup method.

Once a particular group has been identified, the group's EnumerateBuddies method can be called to return a list of IRTCBuddy objects, which can be manipulated as described earlier. Because a particular contact may be a member of multiple groups, the EnumerateGroups method in the IRTCBuddy2 interface can be used to find each of the contact's groups.
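
The relationship between groups and contacts can be pictured with a small in-memory model. The following is a minimal sketch, assuming a hypothetical GroupBook class and kMaxGroups constant that are not part of the RTC Client API. It enforces the 32-group limit locally and lets a contact belong to several groups, which mirrors the two directions of enumeration described above (contacts of a group, and groups of a contact).

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

// The limit stated in the text; enforced locally in this model only.
constexpr std::size_t kMaxGroups = 32;

// Hypothetical model: group name -> set of contact URIs.
class GroupBook
{
public:
    // Returns false if the group already exists or the limit is reached.
    bool AddGroup(const std::string& name)
    {
        if (m_groups.size() >= kMaxGroups || m_groups.count(name) != 0)
        {
            return false;
        }
        m_groups[name];  // creates an empty member set
        return true;
    }

    // Returns false if there was no such group.
    bool RemoveGroup(const std::string& name)
    {
        return m_groups.erase(name) != 0;
    }

    // Adds a contact URI to an existing group.
    bool AddMember(const std::string& group, const std::string& uri)
    {
        auto it = m_groups.find(group);
        if (it == m_groups.end())
        {
            return false;
        }
        it->second.insert(uri);
        return true;
    }

    // Every group that contains the contact.
    std::set<std::string> GroupsOf(const std::string& uri) const
    {
        std::set<std::string> result;
        for (const auto& entry : m_groups)
        {
            if (entry.second.count(uri) != 0)
            {
                result.insert(entry.first);
            }
        }
        return result;
    }

private:
    std::map<std::string, std::set<std::string>> m_groups;
};
```

A model like this can back the user interface between events, so that the group list does not have to be enumerated again for every redraw.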

Watchers

A watcher allows you to retrieve information about someone who has added you as a contact. The IRTCWatcher2 object can be obtained through the IRTCClientPresence interface or through the IRTCWatcherEvent2 object. Using the watcher, you can allow or deny access to your presence information.

The following code changes the watcher's state to prompt on the local computer, which means that the local user will be prompted each time a contact requests access to the local user's presence information.

    hr = pWatcher->put_State(RTCWS_PROMPT);
    
    if (FAILED(hr))
    {
        // put_State failed
        DEBUG_PRINT(("put_State failed %x", hr ));
        return hr;
    }

When a contact attempts to determine the status of the local user, an IRTCWatcherEvent2 event will occur, as shown in the next code fragment. After getting the event type and the status code, the handler retrieves the IRTCWatcher object from the IRTCWatcherEvent2 object. When the remote contact attempts to obtain the local user's presence information, the event type will be RTCWET_WATCHER_OFFERING. This event is also triggered when buddies add the local user to, or remove the local user from, their contact lists; when the watcher's properties have been updated; or when the watcher's presence information has roamed.

    hr = pEvent->get_EventType(&enType);

    if (FAILED(hr))
    {
        // get_EventType failed
        DEBUG_PRINT(("get_EventType failed %x", hr ));
        return;
    }

    // Get the status
    hr = pEvent->get_StatusCode(&lStatus);

    if (FAILED(hr))
    {
        // get_StatusCode failed
        DEBUG_PRINT(("get_StatusCode failed %x", hr ));
        return;
    }

    // Get the watcher object
    IRTCWatcher * pWatcher = NULL;

    hr = pEvent->get_Watcher(&pWatcher);

    if (FAILED(hr))
    {
        // get_Watcher failed
        DEBUG_PRINT(("get_Watcher failed %x", hr ));
        return;
    }

    switch (enType)
    {

//
// multiple case clauses omitted
//

    case RTCWET_WATCHER_OFFERING:
        {
            DEBUG_PRINT(("RTCWET_WATCHER_OFFERING [%p] %x", pWatcher, lStatus ));

            // Get the watcher URI
            BSTR bstrURI = NULL;

            hr = pWatcher->get_PresentityURI(&bstrURI);

            if (FAILED(hr))
            {
                // get_PresentityURI failed
                DEBUG_PRINT(("get_PresentityURI failed %x", hr ));
                SAFE_RELEASE(pWatcher);
                return;
            }

            // Get the watcher name
            BSTR bstrName = NULL;

            hr = pWatcher->get_Name(&bstrName);

            if (FAILED(hr) && (hr != RTC_E_NOT_EXIST))
            {
                // get_Name failed
                DEBUG_PRINT(("get_Name failed %x", hr ));
                SAFE_FREE_STRING(bstrURI);
                SAFE_RELEASE(pWatcher);
                return;
            }

            // Show the incoming watcher dialog
            BOOL fAllow, fAddBuddy;

            hr = ShowWatcherDialog(m_hWnd, bstrName, bstrURI, &fAllow, &fAddBuddy);    

            if (FAILED(hr))
            {
                // ShowWatcherDialog failed
                DEBUG_PRINT(("ShowWatcherDialog failed %x", hr ));
                SAFE_FREE_STRING(bstrURI);
                SAFE_FREE_STRING(bstrName);
                SAFE_RELEASE(pWatcher);
                return;
            }

            // Set the watcher to be allowed or blocked
            hr = pWatcher->put_State(fAllow ? RTCWS_ALLOWED : RTCWS_BLOCKED);

            if (FAILED(hr))
            {
                // put_State failed
                DEBUG_PRINT(("put_State failed %x", hr ));
                SAFE_FREE_STRING(bstrURI);
                SAFE_FREE_STRING(bstrName);
                SAFE_RELEASE(pWatcher);
                return;
            }


            SAFE_FREE_STRING(bstrURI);
            SAFE_FREE_STRING(bstrName);
        }
        break;
    }

    SAFE_RELEASE(pWatcher);

In this case, the information about the contact associated with the watcher is retrieved by using the get_PresentityURI and get_Name properties, and displayed to the local user in a dialog box. This dialog box prompts the user to allow or deny the contact's access to the local user's presence information. If the local user denies the request, the contact will know only that the local user is offline. Otherwise, the contact will be informed that the local user is online.
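
The effect of the user's choice can be summarized as a small decision function. The following is a minimal sketch, assuming hypothetical names (WatcherState, Presence, VisiblePresence, StateFromUserChoice) that mirror three of the watcher states mentioned in this section but are not the RTC Client API enumerations. Treating a watcher in the prompt state as not yet allowed is also an assumption of this sketch.

```cpp
// Hypothetical mirror of RTCWS_ALLOWED, RTCWS_BLOCKED, and RTCWS_PROMPT.
enum class WatcherState { Allowed, Blocked, Prompt };

// Simplified presence: the text above distinguishes only these two.
enum class Presence { Offline, Online };

// What a watcher is told about the local user. Only an allowed watcher
// sees the real status; everyone else sees the user as offline.
Presence VisiblePresence(WatcherState state, Presence actual)
{
    return state == WatcherState::Allowed ? actual : Presence::Offline;
}

// Maps the result of the watcher dialog to the state stored on the watcher,
// matching the conditional expression passed to put_State above.
WatcherState StateFromUserChoice(bool allow)
{
    return allow ? WatcherState::Allowed : WatcherState::Blocked;
}
```

For example, a blocked watcher is told that the local user is offline even while the user is online, which is the behavior described in the preceding paragraph.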

Programming Considerations

Consider the following programming issues when building RTC client applications.

The RTC Client API Is a Side-by-Side Assembly

The RTC Client API is a side-by-side assembly, which allows multiple versions of the API to exist on the system at the same time. To specify which version of the DLL should be used, a manifest file must be included in the same directory as the application using the RTC Client API. This manifest must have the same filename as the executable program, with .manifest appended to the end of the filename.

For example, if the application is named RTCSample.exe, the manifest file must be called RTCSample.exe.manifest. Failure to use the proper filename results in the application using the wrong version of the API. If the application displays an error indicating that the wrong version of the API was used, the first place to check is the manifest file. Most likely, the manifest file is either missing or improperly named.
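
The naming rule is simple enough to check in code, for instance in an installer test. The following is a minimal sketch with hypothetical helper names (ManifestNameFor, IsManifestFor); it uses a case-sensitive comparison for brevity, although Windows file names are not case-sensitive.

```cpp
#include <string>

// Builds the manifest name by appending ".manifest" to the full
// executable name, including its .exe extension.
std::string ManifestNameFor(const std::string& exeName)
{
    return exeName + ".manifest";
}

// True when candidate is exactly the manifest name expected for exeName.
// Note: a case-sensitive comparison is a simplification of this sketch.
bool IsManifestFor(const std::string& exeName, const std::string& candidate)
{
    return candidate == ManifestNameFor(exeName);
}
```

A common mistake this catches is replacing the .exe extension instead of appending to it, which would produce RTCSample.manifest rather than RTCSample.exe.manifest.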

The manifest file is an XML file that looks similar to the following sample. The information in the first <assemblyIdentity> node describes the client application program. For the most part, this information is not critical, as long as it is correctly formatted.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Copyright © 1981-2003 Microsoft® Corporation -->
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
<assemblyIdentity
    version="1.0.0.0"
    processorArchitecture="x86"
    name="Microsoft.Windows.Networking.RTCSample"
    type="win32"
/>

<description>RTC Sample</description>

<dependency>
    <dependentAssembly>
         <assemblyIdentity
             type="win32"
             name="Microsoft.Windows.Networking.RtcDll"
             version="5.2.2.1"
             processorArchitecture="X86"
             publicKeyToken="6595b64144ccf1df"
             language="*"
        />
    </dependentAssembly>
</dependency>

</assembly>

The information in the <dependentAssembly> node, on the other hand, is critical and must be entered exactly as shown. It refers to version 1.2 of the RTC Client API run-time library.

RTC Client API Version 1.2 Issues

Version 1.2 of the RTC Client API introduces a large number of new interfaces, and several extensions of older interfaces. These are identified by the number 2 at the end of the interface's name (for example, IRTCClient2). Because these new interfaces inherit all of the properties and methods associated with the originals, it is safe to use the new interfaces in place of the original ones.
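
The substitution works because of ordinary C++ interface inheritance. The following is a minimal sketch using hypothetical interfaces (IContactInfo, IContactInfo2) and a hypothetical ContactInfo class, none of which exist in the RTC Client API. It shows that code written against an original interface accepts an object exposed through its extended counterpart without modification.

```cpp
#include <string>

// Hypothetical original interface (not part of the RTC Client API).
struct IContactInfo
{
    virtual ~IContactInfo() {}
    virtual std::string Name() const = 0;
};

// Hypothetical "2" interface: inherits the original and adds one method.
struct IContactInfo2 : IContactInfo
{
    virtual std::string Notes() const = 0;
};

// One object implements the newer interface, and therefore the older one too.
class ContactInfo : public IContactInfo2
{
public:
    ContactInfo(const std::string& name, const std::string& notes)
        : m_name(name), m_notes(notes) {}
    std::string Name() const override { return m_name; }
    std::string Notes() const override { return m_notes; }
private:
    std::string m_name;
    std::string m_notes;
};

// Written against the original interface; accepts the newer one unchanged.
std::string Describe(const IContactInfo& info)
{
    return "name=" + info.Name();
}
```

The reverse direction is not automatic: code that holds only the original interface has to query for the extended one, as the QueryInterface calls earlier in this article do.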

Conclusion

The Microsoft RTC Client API provides the foundation for building applications that require real-time communications. The RTC Client API includes capabilities required for secure applications, such as the ability to authenticate a user against a Kerberos security server and the ability to use encrypted transmissions to and from the SIP server. The RTC Client API even supports communications between computers and telephones, provided that the appropriate hardware is in place.

Other advantages of the RTC Client API include the ability to identify groups of contacts, to automatically watch them to determine when they are online, and to roam from one device to another while continuing to watch the contacts. Using the RTC Client API, developers can create numerous real-time communications applications to fit any business need.

See the following resources for further information: