Chapter 13

More About ActiveX


Microsoft has combined and enhanced its OLE and OCX technologies and renamed the consolidated standard ActiveX. ActiveX refines the OLE specification for OLE controls, which makes them smaller and more efficient. It also defines new OLE interfaces that improve control over data and property management. Controls generated with the new ActiveX Control class are lightweight and Internet-aware.

Microsoft has recently turned ActiveX over to a steering committee which will oversee the development of the ActiveX standards. You can get more information by opening in your Web browser.

The Component Object Model

The Component Object Model (COM) is a client/server, object-based model designed to allow software components and applications to interact with each other in a uniform, standard way.

The COM standard is partly a specification and partly an implementation. The specification defines mechanisms for object creation and communication between objects. This part of the specification is paper-based and is not dependent on any particular language or operating system. Any language can be used as long as you adhere to the standard.

The implementation part is the COM library, which provides a number of core services to support the binary specification of COM.

A Simple View of How COM Works

When a client COM object wants to use the services of a server COM object, it uses one of the core services in the COM library. The COM library is responsible for creating the server COM object and establishing the initial connection between the client and server. The connection is made when the server returns a pointer to the client. The pointer points to an Interface in the server object. From this point forward, the COM library plays no further part in the process. The two objects are free to communicate with each other directly.

Objects communicate through Interfaces. Interfaces are small sets of related functions that provide some sort of service. An object may have more than one interface. When a client object has a pointer to a server object Interface, the client may invoke any function available through the Interface.

When an object is finished using the services provided by another object, the client informs the server that it is finished and terminates communication.

Please note that software objects may be client objects, server objects, or both.

What Is an Interface?

COM Interfaces are discrete sets of logically or semantically related functions. Interfaces are used to supply a service from a server object to a client object. Client objects never see the internal representation of the server object. An Interface can be thought of as a type of contract between two software components. The contract states that the server supplies the client with one type of service and nothing more.

In a complex system, human readable name clashes are a fact of life. In order to avoid these clashes, Interfaces are given a unique name. It is called a Globally Unique Identifier (GUID). This is a 128-bit number that is almost guaranteed to be unique. (According to the official COM specification, 10 million GUIDs could be generated every second for the next 3,500 years or so and every single one would be unique!) GUIDs are also used in other parts of COM and OLE to assign unique names. Interfaces may also be given human readable names but these are locally scoped to a single machine.
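As a small illustration of the 128-bit identifiers described above, the following Python sketch uses the standard uuid module as a stand-in for the COM library's own GUID generator. The value given for IUnknown is its published identifier; new_iid and registry_form are names chosen for this example only.

```python
import uuid

# Published identifier of the IUnknown interface.
IID_IUNKNOWN = uuid.UUID("00000000-0000-0000-C000-000000000046")

# A freshly generated identifier, as a developer would create for a new
# interface. uuid4() is random-based; it only stands in for the COM generator.
new_iid = uuid.uuid4()


def registry_form(guid):
    """Format a GUID in the braced, uppercase form usually seen in the registry."""
    return "{" + str(guid).upper() + "}"


print(len(IID_IUNKNOWN.bytes) * 8)   # prints 128
print(registry_form(IID_IUNKNOWN))   # prints {00000000-0000-0000-C000-000000000046}
```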

Server objects that supply more than one service implement an Interface for each service they supply. A client may invoke functions in Interfaces only on the server objects for which they have a pointer. They may obtain pointers to other Interfaces through the IUnknown interface. This is a fundamental Interface that all COM objects must support. This Interface has a function called QueryInterface. Since all Interfaces are derived from IUnknown, this function is present in every Interface. This function knows about every Interface in the server object and can give a client a pointer to any of the Interfaces to which it requests access. A client may not know about every interface in the server. The server may have other Interfaces that a client does not know about. This in no way compromises a client's ability to use a server. It uses the interfaces that it knows.

When a client is finished using a server, it informs the server that it is finished. This allows server objects to release themselves from memory when they are no longer servicing any clients.
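The interplay of QueryInterface and the "finished" notification can be modeled in a few lines. The sketch below is a conceptual model in Python, not the real COM binary layout or API: the class ServerObject, its method names, and the interfaces ISpellCheck and IWordCount are all hypothetical.

```python
class ServerObject:
    """Toy model of a COM server object; it is not the real binary layout."""

    def __init__(self, interfaces):
        # Maps an interface name to the functions that interface offers.
        self._interfaces = dict(interfaces)
        self._ref_count = 0
        self.released_from_memory = False

    def query_interface(self, name):
        """Hand out a counted reference to an interface, or None if unsupported."""
        if name not in self._interfaces:
            return None            # real COM reports E_NOINTERFACE here
        self._ref_count += 1
        return self._interfaces[name]

    def release(self):
        """Called by a client once for each interface it has finished with."""
        if self._ref_count == 0:
            raise RuntimeError("release() called more often than query_interface()")
        self._ref_count -= 1
        if self._ref_count == 0:
            self.released_from_memory = True   # no clients are left
        return self._ref_count


# Hypothetical server offering two services.
server = ServerObject({
    "ISpellCheck": {"check": lambda word: word.lower() in {"com", "ole"}},
    "IWordCount": {"count": lambda text: len(text.split())},
})

spell = server.query_interface("ISpellCheck")
print(spell["check"]("COM"))                  # prints True
print(server.query_interface("IThesaurus"))   # prints None
print(server.release())                       # prints 0
```

The client in this model never touches the server's internal state; it sees only the functions reachable through the interface it asked for, and the server frees itself once the last reference is released.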

Advantages of Interfaces

Interfaces allow objects to evolve independently over time. The definition and functionality of an Interface is never changed. If the functionality of an Interface changes or new functionality is added to the object, then a new Interface is added. This allows a server object to continue to be used by clients who know only about the old Interfaces. New clients who know about the new interfaces can use them as well as the old Interfaces.

Interfaces allow objects to be replaced by better objects from a different vendor as long as the Interface definitions do not change.

Interfaces are language independent. Any language that can create structures of pointers, and either explicitly or implicitly call a function through a pointer, can implement COM interfaces. Examples include C, C++, Smalltalk, and Pascal.

Types of COM Objects

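There are three different types of COM server objects:

  • In-process servers, which run in the same process as the client.
  • Local servers, which run in a separate process on the same machine as the client.
  • Remote servers, which run on a different machine from the client.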

COM is designed such that regardless of where a server object is running, a client object always communicates with it in the same way. There is one single programming model for all types of objects. A client object accesses the services of the server object through a pointer to an Interface on the object. If the server is running in process, then the pointer accesses the Interface directly. If the server is a local or a remote server, then the pointer accesses a proxy server running in the same process as the client. This proxy is supplied by COM. Its purpose is to generate a call to the appropriate server, either local or remote.

At the other end, a stub object supplied by COM receives the call and turns it into a call on the Interface. Both client and server therefore always communicate with some piece of in-process code.
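The proxy and stub arrangement can be sketched as follows. This is a conceptual Python model only: the Calculator, Proxy, and Stub classes are hypothetical, and JSON text is used purely as an illustrative wire format (COM supplies its own marshaling).

```python
import json


class Calculator:
    """The real server object and its one interface function."""

    def add(self, a, b):
        return a + b


class Stub:
    """Server-side helper: unpacks a request and calls the real interface."""

    def __init__(self, target):
        self._target = target

    def handle(self, packet):
        request = json.loads(packet)
        method = getattr(self._target, request["method"])
        return json.dumps({"result": method(*request["args"])})


class Proxy:
    """Client-side helper: looks like the interface but forwards every call."""

    def __init__(self, transport):
        self._transport = transport   # any callable that carries text to the stub

    def add(self, a, b):
        packet = json.dumps({"method": "add", "args": [a, b]})
        return json.loads(self._transport(packet))["result"]


def client(calc):
    """Client code is identical whether calc is the object itself or a proxy."""
    return calc.add(2, 3)


in_process = Calculator()
out_of_process = Proxy(Stub(Calculator()).handle)

print(client(in_process), client(out_of_process))   # prints 5 5
```

The point of the design is visible in client(): it is written once and works unchanged against either kind of server.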

Foundation COM Components

Although COM is fundamentally concerned with object creation and communication, it provides the following other system level objects based on the fundamentals:

Figure 13.1 : COM is built in progressively higher layers of technology.

The object management services and the foundation COM components form the bedrock of information management. Microsoft's OLE 2.0 technology is built on this bedrock.

Examining OLE and ActiveX Controls

OCX controls are the standard solution for Windows component software. They are implemented using OLE 2.0 technology and are designed for use on the desktop environment. Most OCX controls today are built using the Control Development Kit, which is supplied as an integrated part of Microsoft Visual C++ version 4.x. The controls built using this kit are excellent for use in the desktop environment. Some of these controls are even able to make the leap to the Internet environment without any modification. However, most will need to be modified to operate more efficiently and cooperatively.

Controls developed with Visual C++ are built using the standard OLE 2.0 interfaces, some of which were mandatory. This means they contain a lot of unnecessary code. They are also dependent on the Microsoft Foundation Classes (MFC) libraries which are several megabytes in size. These controls are therefore relatively big in size which may limit their utility in the slow Internet environment. Also, in order to use these controls on the Internet, the user must first have the MFC libraries on their machine. A one-time download of these libraries for all controls to share may not be too much of a penalty, but MFC is being revised and released approximately every three months. This means users of these controls must download these new libraries every three months or their controls may not work.

Management of data and properties is another potential problem. OCX controls on the desktop operate synchronously. Function calls made to the OLE libraries or a control container do not return until they have completed. While this is not a problem on the desktop, where data and properties are stored locally in files and can be retrieved quickly, it causes problems in a slow environment like the Internet, where large amounts of data have to be retrieved from a remote site and loaded into the control. A 24-bit bitmap file, for instance, can be several megabytes in size and take many minutes to download. It would be unacceptable for the user's browser software to freeze during this process.


Microsoft has consolidated all its OLE and OCX technologies under the heading of ActiveX. ActiveX defines a new specification for OLE controls which allows them to be much smaller and more efficient. New OLE interfaces are also specified that address the problem of data and property management. Controls that are built using the new ActiveX lightweight control class are smaller than their Visual C++ control wizard generated counterparts, and they can use the new interfaces to function efficiently and cooperatively with control containers in the Internet environment.

The OLE Control and Control Container Guidelines V2.0 defines a control as a COM software component that is self-registering and implements the IUnknown interface. In order to support self-registration, the control must export the DllRegisterServer and DllUnregisterServer functions. All the OLE interfaces that were previously mandatory are now optional. Controls are free to implement as many or as few of the standard interfaces as they require, which leads to the first question about ActiveX controls. Previously, a control container could depend on functionality being present in the control because of the mandatory OLE interfaces. If a control implements only the IUnknown interface, how does a control container, such as a browser or authoring tool, know or find out what a control's functionality is? The answer is component categories.

Examining Component Categories

Component categories describe different prescribed areas of functionality. Each component category is identified by a Globally Unique Identifier and each defines a set of standards that a control must meet in order to be a part of that category. Component categories are stored as entries in the system registry with GUIDs and human-readable keys.

Previously, when a control was registered on a client machine, it also registered the keyword Control under its CLSID. This keyword advertised the control's suitability for insertion into container applications such as Access and Visual Basic. The Control keyword is now obsolete, but it remains for the benefit of older applications that do not understand component categories.

Component categories are a natural extension of this process. They allow a control to describe its functionality in far more detail than plain OLE interface signatures. When a control self-registers in the system registry, it adds entries under its CLSID for the GUID for a control, the GUID for each category that it supports, and the GUID of each category that it requires support for from a container in order to function properly. Additionally, it registers its own CLSID under each category registry entry (see Figure 13.2).

Figure 13.2 : Component categories are registered.

Until now, when an application wanted to find out whether a particular control supported a piece of functionality, the application had to instantiate the control and use QueryInterface. If a valid pointer to a new interface was returned, then the application knew that the control supported the desired functionality. This is a very expensive and cumbersome operation.

Using categories and new OLE interfaces in the OLE libraries that allow categories to be registered and unregistered, enumerated, and queried, means that an application does not have to instantiate a control anymore. It can get information about controls from the system registry through these new interfaces in one of two ways. If a control's CLSID is known to the application, then the application can retrieve the category GUIDs under the control's registry entry to find out the functionality of the control. If a specific area of functionality is required, then the application can go to the registry entry for the category and retrieve a list of the controls on the machine that have registered support for that functionality. It can then go to each control's registry entry and determine whether it can host the control. The list can then be presented to the user of the application via an application or system user interface, and the user can choose which control to use.
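The two lookup directions described above can be modeled with plain dictionaries. The sketch below is illustrative Python, not the OLE category interfaces themselves; every identifier in it (the CATID and CLSID strings, the function names) is hypothetical, and real entries would be GUIDs.

```python
# Hypothetical identifiers; real registry entries are GUIDs.
CATID_CHARTING = "{CATID-CHARTING}"
CATID_DATABINDING = "{CATID-DATABINDING}"
CLSID_PIE = "{CLSID-PIE}"
CLSID_BAR = "{CLSID-BAR}"

# Control entries: categories the control implements, and categories it
# requires the container to support.
controls = {
    CLSID_PIE: {"implemented": {CATID_CHARTING}, "required": set()},
    CLSID_BAR: {"implemented": {CATID_CHARTING}, "required": {CATID_DATABINDING}},
}

# Category entries: each lists the CLSIDs registered under it.
categories = {CATID_CHARTING: {CLSID_PIE, CLSID_BAR}}


def categories_of(clsid):
    """First lookup: the CLSID is known, so read its category entries."""
    return controls[clsid]["implemented"]


def hostable_controls(catid, container_supports):
    """Second lookup: start from a category, keep controls the container can host."""
    return sorted(
        clsid
        for clsid in categories.get(catid, ())
        if controls[clsid]["required"] <= container_supports
    )


print(hostable_controls(CATID_CHARTING, set()))   # prints ['{CLSID-PIE}']
```

Neither lookup instantiates a control, which is the saving the text describes.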

Management of Data and Properties

The major difference between controls designed for use on the desktop and controls that are Internet-aware is the management of data and properties. A control may have any or all of the types of data listed in the following table which need to be stored so that they can be easily retrieved by a control container when it re-creates the control. These types do not imply any form of structure or storage location. A control's properties and BLOB data collectively make up its state.

Data Type                       Size              Purpose
Class Identifier (CLSID)        16 bytes          The CLSID of the control class that can read the data that follows
Properties                      Around 10K-30K    Standard and custom property values
Binary Large Objects (BLOBs)    Arbitrary size    Any number of large binary files. These files may be in any format (for example, bitmaps, multimedia files, and so on).

If a control has no persistent state, then none of the above are present in an HTML document. The control container retrieves the CLSID of the control class directly from the CLSID attribute of the HTML <OBJECT> tag or indirectly from the CODE attribute. The control can then be instantiated and no further initialization is required.

When the user of an application that is hosting a control gives the command to save, the control container calls QueryInterface on the control for a persistent storage interface and the control serializes its state through it. Similarly, when a control is re-created, it retrieves its state through a persistent storage interface. Where the application stores the control's state is up to the user of the application that is hosting the control. The control is not concerned. It may be embedded within the HTML document or in a separate file that is linked to the HTML document. This linking and embedding mechanism is familiar territory to anyone with knowledge of OLE compound documents.

Although the control is not concerned with the actual storage of its state, it is concerned with the interfaces through which its state is saved and retrieved. One of the goals Microsoft had when creating the ActiveX specification was to introduce as little new technology as possible. However, the existing persistence interfaces used in the current OLE compound document architecture are potentially unsuitable for Internet-aware controls.

In OLE compound documents, an object and its native data can be stored in two ways (see Figure 13.3). They can be embedded in the document or linked to the document. In the embedding case, the object's CLSID, its native data, and a presentation cache are stored within the document. In the linking case, a moniker and a presentation cache are stored within the document. The moniker points to a file that contains the object's CLSID and native data.

Figure 13.3 : Data can be stored in embedded and linked documents.

The problems with this architecture are as follows:

Persistent Linking and Embedding

The problem of the presentation cache was eliminated for the embedding scenario in the first release of the OLE control specification in 1994. Controls could implement a new lightweight interface called IPersistStreamInit, which can be used in preference to IPersistStorage. IPersistStreamInit allows all properties and BLOBs to be channeled into one stream and stored by the application in the document. This eliminated the cache, because the data could simply and quickly be reloaded into the control when the document was loaded and the control was instantiated and initialized. In the linking case, because the object's data and properties were stored locally in files and could be retrieved quickly, the presentation cache was also eliminated.

Because of the shortcomings of the existing persistence mechanisms, Microsoft developed new persistence mechanisms and monikers that extend the concept of linking and embedding beyond the OLE Compound Document architecture (see Figure 13.4). These new mechanisms also allow for asynchronous retrieval of properties and BLOBs from remote sites, and are as follows:

Figure 13.4 : The new extensions employ the concept of persistent linking and embedding.

The persistent interface mechanisms that can now be used by a control are summarized in the following table:

Use                  Mechanism              Comments
Embedding/linking    IPersistStorage        Standard persistence mechanism used in OLE compound documents. The container supplies an IStorage pointer to a storage object. The control may create any data structure within that object for its state.
Embedding/linking    IPersistStreamInit     This is a lightweight alternative to IPersistStorage. All of the control's state can be serialized into one stream.
Embedding/linking    IPersistMemory         The container defines a fixed size block of memory into which a control saves or retrieves its state. The control must not try to access memory outside the block.
Embedding            IPersistPropertyBag    The container and control exchange property/value pairs in Variant structures.
Linking              IPersistFile           The container gives the control a UNC filename and is told to save or retrieve its state from that file.
Linking              IPersistMoniker        The control is given a moniker. When the control reads or writes its state, it may choose any storage mechanism (IStorage, IStream, ILockBytes, and so on) it wants. If the storage mechanism chosen by the control is asynchronous, then IPersistMoniker must support asynchronous transfer.

For each mechanism that a control implements, the control container must provide the appropriate support.

A control container that wants to support embedding must provide the appropriate support for the persistence interfaces exposed by the control, as in the following table:

Control Persistence Interface    Container Supplied Support
IPersistStreamInit               IStream
IPersistStorage                  IStorage
IPersistMemory                   Memory (void *)
IPersistPropertyBag              IPropertyBag

A control container that wants to implement linking must use a moniker that can supply support for the persistence interfaces exposed by the control. At the time of writing, the only available moniker is the URL moniker.

If a control implements the IPersistMemory or the IPersistFile mechanism, it should also implement one other interface as both of these require that the data be present locally. These mechanisms do not work well with asynchronous downloads of properties and BLOBs.

Controls are free to implement as many of these new persistence mechanisms as the developer of the control sees fit. For maximum flexibility, therefore, a control container should implement support for as many of these interfaces as possible. This ensures that it can work with a wide range of controls that may not implement all of the new persistence mechanisms.

These new persistence mechanisms define the protocol through which the container and the control exchange information. What happens when an application decides to save a control's state (perhaps in response to a user request) depends on the application's (user's) preferences for storage. It can either be embedded in the HTML document or in a separate file and linked to the document.

When embedding is used, the control container chooses which persistence interface to use. The sequence in which a container looks for persistence interfaces is generally up to the designer of the container. However, IPersistMemory and IPersistStreamInit may be given precedence over IPersistPropertyBag and IPersistStorage, as they generally produce the smallest amount of data. It is perfectly acceptable for a container to have a control save its state in one location, and then copy it to another location. All that is required is that the container is able to retrieve the saved state and give it back to the control via the same interface. For example, a container could ask a control to save its state in a memory block. The container may then save the contents of that memory block in a storage location of its choosing. When the control is initialized, the container must retrieve the saved state and give it back to the control via a memory block.
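A container's choice of interface, and the memory-block round trip just described, might be modeled like this. It is a minimal Python sketch under stated assumptions: the preference order is only the one the text suggests, and ToyControl, choose_interface, and the document dictionary are hypothetical stand-ins rather than real OLE objects.

```python
# Order suggested in the text: the more compact mechanisms come first.
PREFERENCE = ("IPersistMemory", "IPersistStreamInit",
              "IPersistPropertyBag", "IPersistStorage")


def choose_interface(control_supports, container_supports, preference=PREFERENCE):
    """Return the first preferred interface that both sides support, else None."""
    for name in preference:
        if name in control_supports and name in container_supports:
            return name
    return None


class ToyControl:
    """Hypothetical control that persists one property through a memory block."""

    supports = {"IPersistMemory", "IPersistPropertyBag"}

    def __init__(self):
        self.caption = ""

    def save_memory(self):
        return self.caption.encode("utf-8")

    def load_memory(self, block):
        self.caption = bytes(block).decode("utf-8")


container_supports = {"IPersistMemory", "IPersistStreamInit", "IPersistStorage"}

original = ToyControl()
original.caption = "Sales"
chosen = choose_interface(original.supports, container_supports)

# The container copies the saved block to a location of its own choosing...
document = {"control-state": original.save_memory()}

# ...and later hands it back through the same kind of interface.
restored = ToyControl()
restored.load_memory(document["control-state"])

print(chosen, restored.caption)   # prints IPersistMemory Sales
```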

When linking is used, the container is not concerned with any of the persistence interfaces. The container must store and interact with a URL moniker. The moniker takes care of all the interface querying. URL monikers query for persistence interfaces in the following order:

In both linking and embedding, the container is responsible for the asynchronous transfer of data from the remote site. For more information on this, see the "Compound Files on the Internet" document supplied as part of the Sweeper SDK. All the persistence interfaces, with the exception of IPersistMoniker, are synchronous in operation. When a control receives a call to the Load member of one of its persistence interfaces, it expects all of the data to be available.

Data paths serve two purposes. They allow a control to store its BLOBs separately from its properties, and they solve the problem of embedded links. Controls may have links to BLOBs buried in their native data that only they know. This prohibits the container from participating in the retrieval of these BLOBs. One solution to this is data path properties. Data path properties are properties that hold text string values. These string values are simply URL file names. Data path properties can be used with either persistent embedding (see Figure 13.5) or persistent linking (see Figure 13.6).

Figure 13.5 : Data paths in a persistently embedded document.

Figure 13.6 : Data paths in a persistently linked document.

In a control's type library, data path properties must be marked as [bindable] and [requestedit]. This allows container applications to update these properties through their own user interface. These properties may also be updated through the control's property sheet. Properties are also tagged with a special custom attribute that identifies them as data path properties. The custom attribute is called GUID_PathProperty. It has its own GUID. Additionally, a control's coclass entry in its type library is also tagged with a special attribute that signifies that it has data path properties. This attribute is called GUID_HasPathProperties, and it, too, has its own GUID.

Applications such as authoring tools and Web site management tools can query controls for data path properties and use them to perform link management or other tasks.

When a control wants to retrieve the file named by a data path property, it gives the URL to the container and asks it to create a moniker for the URL. The moniker is created in the implementation of the IBindHost interface. This interface is supplied as a service by the container site's IServiceProvider implementation. In order for the control to call members of IBindHost, the control must provide a way for the container to pass a pointer that identifies the IServiceProvider interface. A control could implement the IOleObject interface in order to achieve this. This interface has a function called SetClientSite that allows a container to pass the pointer to the control. However, the IOleObject interface is a large interface. All its functionality may not be required by a small control. A smaller interface called IObjectWithSite can be implemented. It has just two member functions, one of which, SetSite, allows the container to pass the required pointer.

Any control that uses data path properties must support a siting mechanism, either IOleObject or IObjectWithSite. This is a requirement of the specification.

In order to get the IBindHost interface, two steps are required. First, the control calls QueryInterface on the site pointer for the IServiceProvider interface. Then, the control calls QueryService on the IServiceProvider interface for the IBindHost interface.

In order to get a moniker for the file that a data path property names, the control calls the ParseDisplayName function of the IBindHost interface. The data path may be either an absolute path name, or a path name relative to the location of the document. Either way, a moniker is returned, which the control can use to retrieve data.
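The two-step lookup and the handling of absolute and relative data paths can be sketched in Python. This is a conceptual model, not the real interfaces: ContainerSite, BindHost, UrlMoniker, and moniker_for are hypothetical classes and functions, the example URLs are invented, and the standard library's urljoin stands in for whatever resolution the container actually performs.

```python
from urllib.parse import urljoin


class UrlMoniker:
    """Minimal stand-in for a moniker that names a URL."""

    def __init__(self, url):
        self.url = url


class BindHost:
    """Models the container service that turns data paths into monikers."""

    def __init__(self, document_url):
        self._document_url = document_url

    def parse_display_name(self, data_path):
        # urljoin leaves absolute URLs untouched and resolves relative ones
        # against the location of the document.
        return UrlMoniker(urljoin(self._document_url, data_path))


class ContainerSite:
    """Models the site pointer that the container passes to the control."""

    def __init__(self, document_url):
        self._bind_host = BindHost(document_url)

    def query_interface(self, name):
        return self if name == "IServiceProvider" else None

    def query_service(self, name):
        return self._bind_host if name == "IBindHost" else None


def moniker_for(site, data_path):
    """Step 1: ask the site for IServiceProvider. Step 2: ask that for IBindHost."""
    provider = site.query_interface("IServiceProvider")
    if provider is None:
        return None
    bind_host = provider.query_service("IBindHost")
    if bind_host is None:
        return None
    return bind_host.parse_display_name(data_path)


site = ContainerSite("http://www.example.com/docs/page.htm")
print(moniker_for(site, "images/logo.bmp").url)
# prints http://www.example.com/docs/images/logo.bmp
```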

When downloading data, the control should be as cooperative as possible with the container and other controls by supporting asynchronous retrieval of data. This allows the user interface to remain active while data trickles down in the background.

Before initiating a retrieval operation, a control should check to see whether the moniker that it is supplied with is an asynchronous one. It does this by calling QueryInterface on the moniker for the IMonikerAsynch interface. If this interface is not present, the moniker is synchronous and the control has to bind directly to the storage identified by the moniker by creating a bind context and calling the BindToStorage member of the moniker.

If the moniker is asynchronous, the control should get its bind context from the container through the GetBindCtx member of IBindHost. By obtaining it this way, the container has a chance to register itself as an interested party in the download process. It can monitor the download and display some sort of progress indicator for the user's convenience, or perhaps allow the user to cancel the download.

Once a control has the bind context, it registers its own FORMATETC enumerator and a pointer to its IBindStatusCallback interface in the bind context. The control initiates an asynchronous download in the Load member of a persistence interface. In this function, another asynchronous stream should be obtained so that the moniker and the bind context can be released. This allows the Load function to return immediately, and execution control can return to the container. When data arrives, the OnDataAvailable member of the IBindStatusCallback interface is called. The control should obtain the data exclusively through this function.

For detailed information on how the control, the container, and the moniker interact in asynchronous downloads, see the "Asynchronous Monikers" specification in the Sweeper SDK.

Data transfer may be aborted by a call to the OnStopBinding member of the IBindStatusCallback interface. If a control receives such a call, there are two possibilities. The first possibility occurs if the control has received all its data. Then the call is merely a notification that the transfer is complete. The second possibility occurs when the data transfer has been aborted for some reason.

A control may abort the data transfer by calling the Abort member of the IBinding interface. The control receives a pointer to this interface through the OnStartBinding member of the IBindStatusCallback interface.
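The callback sequence described in the preceding paragraphs (a start notification that delivers the binding, data notifications, and a final stop notification) can be modeled as below. This is a simplified Python sketch: the deliver function plays the role of the moniker and runs in a simple loop rather than truly in the background, and the Binding and Control classes, the max_bytes limit, and the succeeded flag are assumptions made for the example.

```python
class Binding:
    """Stand-in for the object that lets the control abort a transfer."""

    def __init__(self):
        self.aborted = False

    def abort(self):
        self.aborted = True


class Control:
    """Hypothetical control that collects a BLOB through callbacks."""

    def __init__(self, max_bytes=None):
        self.data = bytearray()
        self.binding = None
        self.complete = None        # set when the stop notification arrives
        self.max_bytes = max_bytes  # optional limit, for illustration only

    def on_start_binding(self, binding):
        self.binding = binding      # keep it so the transfer can be aborted

    def on_data_available(self, chunk):
        self.data.extend(chunk)
        if self.max_bytes is not None and len(self.data) > self.max_bytes:
            self.binding.abort()

    def on_stop_binding(self, succeeded):
        # Either all data has arrived, or the transfer was aborted.
        self.complete = succeeded


def deliver(chunks, callback):
    """Plays the part of the moniker: feeds data to the control's callbacks."""
    binding = Binding()
    callback.on_start_binding(binding)
    for chunk in chunks:
        if binding.aborted:
            break
        callback.on_data_available(chunk)
    callback.on_stop_binding(not binding.aborted)


finished = Control()
deliver([b"ab", b"cd"], finished)

stopped = Control(max_bytes=3)
deliver([b"ab", b"cd", b"ef"], stopped)

print(bytes(finished.data), finished.complete)   # prints b'abcd' True
print(bytes(stopped.data), stopped.complete)     # prints b'abcd' False
```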

Because the container has control and data is trickling down in the background, you may wonder how a container knows when a control is ready to begin full interaction. One way is for the control to return a new code, E_PENDING, from its member functions when it is not yet ready to fully interact. When this code is not returned, the control may be ready to interact. This, however, does not allow for progressive changes in a control's ability to interact with the application or the user. Microsoft solved this problem by defining a new standard property, ReadyState, and a new standard event, OnReadyStateChange. When the control's ready state changes, the new standard event is fired with the value of the ReadyState property to notify the container. The ReadyState property may progressively have the following values:

  • Uninitialized: The control is waiting to be initialized through the Load member of a persistence interface.
  • Loading: The control is synchronously retrieving its properties. Some may not yet be available.
  • Loaded/Can Render: The control has retrieved its properties and is able to draw something through the Draw member of the IViewObject2 interface.
  • Interactive: The control may interact with the user in a limited way. It has not yet received all of its data from the asynchronous download.
  • Complete: The control is completely ready.

The control does not have to support all of the above states. It has to support only as many as it needs.
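The ReadyState property and its change event can be modeled as follows. This Python sketch is illustrative only: the Control class and its advise and advance methods are hypothetical, and the numeric values 0 through 4 are assumed to match the READYSTATE enumeration in the Windows SDK headers.

```python
from enum import IntEnum


class ReadyState(IntEnum):
    # Numeric values are an assumption; see the note above.
    UNINITIALIZED = 0
    LOADING = 1
    LOADED = 2
    INTERACTIVE = 3
    COMPLETE = 4


class Control:
    """Hypothetical control that reports progress through a change event."""

    def __init__(self):
        self.ready_state = ReadyState.UNINITIALIZED
        self._listeners = []

    def advise(self, listener):
        """A container registers a callable to receive the change event."""
        self._listeners.append(listener)

    def advance(self, new_state):
        """Change the state and fire the event with the new value."""
        if new_state == self.ready_state:
            return                      # no change, so no event
        self.ready_state = new_state
        for listener in self._listeners:
            listener(new_state)


seen = []
control = Control()
control.advise(seen.append)

# This control supports only some of the states, which the text allows.
control.advance(ReadyState.LOADING)
control.advance(ReadyState.COMPLETE)

print([state.name for state in seen])   # prints ['LOADING', 'COMPLETE']
```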

When a control is requested to save its state by a call to the Save member of a persistence interface, it saves all of its properties, including data path properties as strings, through the interface, and then saves all the BLOBs referred to by any data path properties. It does this by obtaining a moniker for each data path through the container's IBindHost as described above and synchronously saving the BLOB. When the Save function returns, a control is assumed to have saved all of its state.

Instantiation of a Control

The instantiation and initialization sequence for a control is as follows. The assumption is made that the control is already on the client machine and properly registered.

The application obtains the CLSID of the control from the CLSID attribute of the HTML OBJECT tag and instantiates the control.

The DATA attribute contains either the property data encoded in MIME or a URL that names a file on a remote site that contains the property data.

If the DATA attribute contains the property data, the container obtains a persistence interface on the control and calls the Load member with a stream containing the property data.

If the DATA attribute contains a URL, the container makes a URL moniker and calls the BindToObject member of the IMoniker interface in order to retrieve the property data from a remote site. Inside this function, the URL moniker attempts to get an IPersistMoniker interface on the control. If it succeeds, it passes a pointer to itself to the Load member of this interface. The control then has complete control over retrieval of its properties from the remote site.

Because properties are usually very small amounts of data, measured in hundreds of bytes or so, asynchronous retrieval may not be the best method. Synchronous retrieval may be a better option as it may allow the control to become interactive sooner.

If the URL moniker cannot get the IPersistMoniker interface, it gets another persistence interface on the control. It then retrieves the property data, wraps it up in an IStream object if necessary, and calls the Load member function of the interface with a pointer to the object. The control then retrieves its property data from the IStream object.

Inside the Load member of the persistence interface, the control also initiates any asynchronous download of BLOBs. It asks the container to make URL monikers so that the container may also bind to them and participate in the download process. The control binds to each moniker and registers its IBindStatusCallback interface in order to receive data. Control is then returned to the container.

As BLOBs trickle down in the background, the control changes the value of the ReadyState variable and notifies the container of any change in its state through the OnReadyStateChange event. The ReadyState variable is passed as a parameter of the event.
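The URL branch of the sequence above, in which the moniker prefers IPersistMoniker and otherwise falls back to a stream-based interface, can be sketched in Python. Everything here is a hypothetical model: bind_to_object, the two control classes, fake_fetch, and the example URL are invented for illustration, and io.BytesIO stands in for an IStream object.

```python
import io


def bind_to_object(control, url, fetch):
    """Prefer handing the control the moniker; otherwise load it from a stream."""
    if hasattr(control, "load_from_moniker"):
        control.load_from_moniker(url)           # control drives its own retrieval
        return "IPersistMoniker"
    data = fetch(url)                            # moniker retrieves the property data
    control.load_from_stream(io.BytesIO(data))   # wrapped in a stream object
    return "stream"


class StreamOnlyControl:
    """Hypothetical control that can only read its properties from a stream."""

    def __init__(self):
        self.properties = b""

    def load_from_stream(self, stream):
        self.properties = stream.read()


class MonikerAwareControl:
    """Hypothetical control that accepts a moniker and retrieves data itself."""

    def __init__(self):
        self.source = None

    def load_from_moniker(self, url):
        self.source = url


def fake_fetch(url):
    """Stands in for a download; returns fixed property data."""
    return b"caption=Sales"


simple = StreamOnlyControl()
aware = MonikerAwareControl()
url = "http://www.example.com/props.dat"

print(bind_to_object(simple, url, fake_fetch))   # prints stream
print(bind_to_object(aware, url, fake_fetch))    # prints IPersistMoniker
```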

Summary of Requirements for Internet-Aware Controls

If a control has no data path properties, then a control need implement only as many of the persistence interfaces as the developer sees fit. The more interfaces a control implements, the more flexibility it has for initialization by control containers and URL monikers.

Controls that have data path properties and BLOBs must meet the following requirements:

Additionally, controls should supply a ReadyState property and an OnReadyStateChange event where these are needed.

A control should also support IPersistPropertyBag so that it can be initialized from HTML PARAM attributes.