Windows with C++

Windows Imaging Component Basics

Kenny Kerr

Contents

Getting Started
Decoding Images
Encoding Images
The WIC Imaging Factory
Working with Streams
WIC via WPF
What's Next?

The Microsoft® Windows® Imaging Component (WIC) is an extensible framework for encoding, decoding, and manipulating images. Originally designed for Windows Vista® and Windows Presentation Foundation (WPF), today WIC not only ships with Windows Vista and the Microsoft .NET Framework 3.0 and later, but it is also available as a download for Windows XP and Windows Server® 2003 for use in native applications.

One of several powerful native frameworks that power WPF, WIC in this context is the framework used in the implementation of the System.Windows.Media.Imaging namespace. It is, however, also ideally suited for native applications written in C++ because it provides a simple-yet-powerful API exposed through a set of COM interfaces.

WIC supports different image formats using an extensible set of imaging codecs. Each codec supports a different image format and typically provides both an encoder and decoder. WIC includes a set of built-in codecs for virtually all of the major image formats including PNG, JPEG, GIF, TIFF, HD Photo (HDP), ICO, and of course the Windows BMP.

HDP is the only format you may not have heard about. It was originally called Windows Media Photo and was developed in conjunction with Windows Vista to overcome some of the limitations with existing formats and provide better performance and higher image quality. For more information on HDP, check out the specification at microsoft.com/whdc/xps/wmphoto.mspx. Fortunately, WIC provides great support for this new image format, and applications don't have to know the specifics of any format to use it.

This month I'll show you how to use WIC to encode and decode different image formats and a few things in between. Next time I'll explore some of the more advanced features and show you how to extend WIC with your own imaging codecs.

Getting Started

The WIC API consists of COM interfaces, functions, structures, and error codes, as well as GUIDs identifying various codecs, containers, and formats. All of the declarations you will need are included in the wincodec.h and wincodecsdk.h header files that are provided as part of the Windows SDK (and included with Visual Studio® 2008). You also need to link against the WindowsCodecs.lib library, which provides the various definitions you may need. You can add the following to your project's precompiled header to make it all available:

#include <wincodec.h>
#include <wincodecsdk.h>
#pragma comment(lib, "WindowsCodecs.lib")

Since the WIC API consists primarily of COM interfaces, I use the Active Template Library (ATL) CComPtr class to handle the creation and management of the interface pointers. If you would like to do the same, you'll need to also include the atlbase.h header file that defines the CComPtr template class:

#include <atlbase.h>

The WIC API also uses the COM library, so the CoInitializeEx function must be called on any thread that will use the API.

Finally, the WIC API makes use of HRESULTs to describe errors. The samples in this article use the HR macro to clearly identify where methods return an HRESULT that needs to be checked. You can replace this with your own error-handling strategy—whether that is throwing an exception or returning the HRESULT yourself.
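
The definition of HR is not shown in this article. The following is one possible definition, offered as an assumption rather than the original macro: it returns the failing HRESULT to the caller, which fits functions that themselves return an HRESULT. The non-Windows branch contains stand-in definitions so the sketch compiles without the Windows SDK, and Step and RunSteps are hypothetical names used only for illustration.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins (assumptions) so the sketch builds without the Windows SDK.
typedef int HRESULT;
#define S_OK 0
#define E_FAIL static_cast<HRESULT>(0x80004005u)
#define FAILED(hr) ((hr) < 0)
#endif

// Hypothetical HR definition: evaluate the expression once and
// return the HRESULT from the enclosing function if it failed.
#define HR(expr)                   \
  do {                             \
    const HRESULT hr_ = (expr);    \
    if (FAILED(hr_)) return hr_;   \
  } while (0)

// Hypothetical operation that either succeeds or fails.
HRESULT Step(bool succeed) { return succeed ? S_OK : E_FAIL; }

// Runs two steps and records how many completed before any failure.
HRESULT RunSteps(bool secondSucceeds, int* stepsCompleted) {
  assert(stepsCompleted != 0);
  *stepsCompleted = 0;
  HR(Step(true));
  ++*stepsCompleted;
  HR(Step(secondSucceeds));
  ++*stepsCompleted;
  return S_OK;
}
```

A version that throws an exception instead would follow the same shape, replacing the return statement with a throw.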

Decoding Images

Decoders are represented by the IWICBitmapDecoder interface. WIC provides a few ways to create a decoder object, but at a minimum you can simply use a particular decoder's CLSID to create an instance. The following example creates a decoder for a TIFF image:

CComPtr<IWICBitmapDecoder> decoder;
HR(decoder.CoCreateInstance(CLSID_WICTiffDecoder));

Figure 1 lists the codecs included with WIC and the CLSIDs you can use to create the different decoders.

Figure 1 CLSIDs for Built-In WIC Codecs

Format Decoder Encoder
BMP CLSID_WICBmpDecoder CLSID_WICBmpEncoder
PNG CLSID_WICPngDecoder CLSID_WICPngEncoder
ICO CLSID_WICIcoDecoder Not available
JPEG CLSID_WICJpegDecoder CLSID_WICJpegEncoder
GIF CLSID_WICGifDecoder CLSID_WICGifEncoder
TIFF CLSID_WICTiffDecoder CLSID_WICTiffEncoder
HDP CLSID_WICWmpDecoder CLSID_WICWmpEncoder

Once the decoder is created, it needs to be initialized with a stream containing the pixels and optional metadata in a format understood by the decoder:

CComPtr<IStream> stream;

// Create stream object here
...
 
HR(decoder->Initialize(
  stream,
  WICDecodeMetadataCacheOnDemand));

I'll discuss streams later in this article, but IStream is just the traditional COM stream interface used by many APIs including the XmlLite parser I covered in the April 2007 issue of MSDN® Magazine (msdn.microsoft.com/msdnmag/issues/07/04/Xml).

The second parameter to the Initialize method describes how you would like the decoder to read the image information from the stream. WICDecodeMetadataCacheOnDemand indicates that the decoder should only read the image information from the stream as it is needed. This can be particularly useful if the image format happens to contain or support multiple frames. The alternative is WICDecodeMetadataCacheOnLoad, which indicates that the decoder should cache all image information immediately. All subsequent requests to the decoder would then be fulfilled directly from memory. This has some further implications for managed code that I'll talk about later.

With the decoder initialized, you can freely query the decoder for information. The most common thing you're likely to ask for is the set of frames that make up an image. The frames are the actual bitmaps that contain pixels. You can think of the image format as a container for frames. As I've mentioned, some image formats support multiple frames.

The GetFrameCount method is used to determine the number of frames in the image:

UINT frameCount = 0;
HR(decoder->GetFrameCount(&frameCount));

Given the number of frames, individual frames can be retrieved using the GetFrame method:

for (UINT index = 0; index < frameCount; ++index)
{
    CComPtr<IWICBitmapFrameDecode> frame;

    HR(decoder->GetFrame(index, &frame));
}

The IWICBitmapFrameDecode interface returned by GetFrame derives from the IWICBitmapSource interface that represents a read-only bitmap. IWICBitmapFrameDecode provides information associated with the frame such as metadata and color profiles. IWICBitmapSource provides the bitmap's size and resolution, pixel format, and other optional characteristics such as its color table. IWICBitmapSource also provides the CopyPixels method that can be used to actually read pixels from the bitmap.

You can get the dimensions of the frame expressed in pixels using the GetSize method:

UINT width = 0;
UINT height = 0;
HR(frame->GetSize(&width, &height));

And you can get the resolution of the frame expressed in dots per inch (dpi) using the GetResolution method:

double dpiX = 0;
double dpiY = 0;
HR(frame->GetResolution(&dpiX, &dpiY));

Although the resolution has no impact on the pixels themselves, it does affect how the image can be displayed when using a logical coordinate system such as that used by WPF.
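
To make the effect of resolution concrete, here is a minimal sketch of the arithmetic, assuming a logical coordinate system with 96 units per inch (the unit WPF uses). The function name PixelsToDeviceIndependentUnits is hypothetical.

```cpp
#include <cassert>

// Converts a length in pixels to logical units of 1/96 inch,
// given the bitmap's resolution in dots per inch.
double PixelsToDeviceIndependentUnits(unsigned pixels, double dpi) {
  assert(dpi > 0.0);
  return pixels * 96.0 / dpi;
}
```

Under this assumption, a frame that is 300 pixels wide occupies 300 units at 96 dpi but only 96 units at 300 dpi, even though the pixels are identical.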

The last critical attribute of the frame is the pixel format. The pixel format describes the layout of pixels in memory and also implies the range of colors or color space that it supports. The GetPixelFormat method returns the pixel format:

GUID pixelFormat = { 0 };
HR(frame->GetPixelFormat(&pixelFormat));

The pixel formats are defined as GUIDs whose names pretty clearly describe the memory layout. For example, the GUID_WICPixelFormat24bppRGB format indicates that each pixel uses 24 bits (3 bytes) of storage with 1 byte per color channel. In addition, the ordering of the red (R), green (G), and blue (B) letters indicates the order of the bytes from least significant to most significant. As an example, the GUID_WICPixelFormat32bppBGRA format indicates that each pixel uses 32 bits (4 bytes) of storage with 1 byte per color channel as well as 1 byte for the alpha channel. In this case, the channels are ordered with the blue (B) channel being least significant and the alpha (A) channel being the most.
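
As an illustration of the 32bppBGRA layout, the following sketch reads one pixel from a raw buffer and shows the same four bytes as a single little-endian 32-bit value. The Bgra structure and both function names are hypothetical and not part of WIC; the stride parameter is the number of bytes per row in the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One 32bppBGRA pixel in memory order: blue is the first byte.
struct Bgra {
  std::uint8_t blue, green, red, alpha;
};

// Reads pixel (x, y) from a 32bppBGRA buffer with the given row stride.
Bgra ReadBgraPixel(const std::uint8_t* buffer, std::size_t stride,
                   std::size_t x, std::size_t y) {
  assert(buffer != nullptr);
  const std::uint8_t* p = buffer + y * stride + x * 4;
  return Bgra{p[0], p[1], p[2], p[3]};
}

// Packs the channels into a 32-bit value with blue in the least
// significant byte and alpha in the most significant: 0xAARRGGBB.
std::uint32_t PackBgra(const Bgra& px) {
  return static_cast<std::uint32_t>(px.blue) |
         (static_cast<std::uint32_t>(px.green) << 8) |
         (static_cast<std::uint32_t>(px.red) << 16) |
         (static_cast<std::uint32_t>(px.alpha) << 24);
}
```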

The actual pixels can be retrieved using the CopyPixels method.

HRESULT CopyPixels(
  const WICRect* rect,
  UINT stride,
  UINT bufferSize,
  BYTE* buffer);

The rect parameter specifies a rectangle within the bitmap to copy. Pass a null pointer (0) for this parameter to copy the entire bitmap. I'll talk about the stride in a moment. The buffer and bufferSize parameters indicate where the pixels will be written and how much space is available.

The stride can be one of the more confusing aspects of bitmaps. Stride is the number of bytes from the start of one scanline to the start of the next. Generally speaking, the bits that make up the pixels of a bitmap are packed into rows, and each row must be long enough to store one row of the bitmap's pixels. The stride is the length of a row measured in bytes, rounded up to the nearest DWORD (4 bytes). This allows bitmaps with fewer than 32 bits per pixel (bpp) to consume less memory while still providing good performance. You can use the following function to calculate the stride for a given bitmap:

UINT GetStride(
  const UINT width, // image width in pixels
  const UINT bitCount) { // bits per pixel
  ASSERT(0 == bitCount % 8);

  const UINT byteCount = bitCount / 8;
  const UINT stride = (width * byteCount + 3) & ~3;

  ASSERT(0 == stride % sizeof(DWORD));
  return stride;
}
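
GetStride asserts that the pixel size is a whole number of bytes. For formats with fewer than 8 bits per pixel, the same DWORD rounding can be applied to the row length in bits. The following is a sketch of that generalization; StrideForBits is a hypothetical name, and the sketch ignores overflow for very large widths.

```cpp
#include <cassert>

// Row length in bytes, rounded up to a multiple of 4, for any
// bits-per-pixel value (including 1 and 4 bpp indexed formats).
unsigned StrideForBits(unsigned width, unsigned bitsPerPixel) {
  assert(bitsPerPixel > 0);
  const unsigned bitsPerRow = width * bitsPerPixel;
  return ((bitsPerRow + 31) / 32) * 4;
}
```

For example, a row of 5 pixels at 24 bpp needs 15 bytes of pixel data and is padded to a stride of 16, while a row of 33 pixels at 1 bpp needs 33 bits and is padded to a stride of 8.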

With that out of the way, you can call the CopyPixels method as follows, assuming the frame represents a 32 bpp bitmap:

const UINT stride = GetStride(width, 32);
CAtlArray<BYTE> buffer;
VERIFY(buffer.SetCount(stride * height));

HR(frame->CopyPixels(
  0, // entire bitmap
  stride,
  buffer.GetCount(),
  &buffer[0]));

My example uses the ATL CAtlArray collection class to allocate the buffer, but you can obviously use whatever storage you like. To handle larger bitmaps more efficiently, you can call CopyPixels a number of times to read different portions of the bitmap.
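
One way to plan those repeated calls is to divide the bitmap into horizontal bands and pass each band as the rect parameter. The sketch below only computes the bands; BandRect is a hypothetical structure that mirrors the X, Y, Width, and Height fields of WICRect, and SplitIntoBands is likewise a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Mirrors the fields of WICRect so the sketch builds without the SDK.
struct BandRect {
  int x, y, width, height;
};

// Splits a width-by-height bitmap into full-width bands of at most
// linesPerBand scanlines each. The last band may be shorter.
std::vector<BandRect> SplitIntoBands(int width, int height, int linesPerBand) {
  assert(width > 0 && height > 0 && linesPerBand > 0);
  std::vector<BandRect> bands;
  for (int y = 0; y < height; y += linesPerBand) {
    const int lines = std::min(linesPerBand, height - y);
    bands.push_back(BandRect{0, y, width, lines});
  }
  return bands;
}
```

Each call then needs a buffer of only stride multiplied by the band's height, rather than one large enough for the whole bitmap.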

Encoding Images

Encoders are represented by the IWICBitmapEncoder interface. As with decoders, WIC provides a few ways to create encoders, but at a minimum you can simply use a particular encoder's CLSID to create it. This code, for example, creates an encoder for a PNG image:

CComPtr<IWICBitmapEncoder> encoder;
HR(encoder.CoCreateInstance(CLSID_WICPngEncoder));

Figure 1 lists the CLSIDs you can use to create the different encoders included with WIC. After the encoder is created, it needs to be initialized with a stream that will ultimately receive the encoded pixels and optional metadata:

CComPtr<IStream> stream;

// Create stream object here
...


HR(encoder->Initialize(
  stream,
  WICBitmapEncoderNoCache));

The second parameter to the Initialize method is less interesting, because WICBitmapEncoderNoCache is the only cache option currently supported.

With the encoder initialized, you can now start adding frames. The CreateNewFrame method creates a new frame that you can then configure and write pixels to:

CComPtr<IWICBitmapFrameEncode> frame;
CComPtr<IPropertyBag2> properties;

HR(encoder->CreateNewFrame(
  &frame,
  &properties));

CreateNewFrame returns both an IWICBitmapFrameEncode interface representing the new frame as well as an IPropertyBag2 interface. The latter is optional and can be used to specify any encoder-specific properties such as the image quality for JPEG or the compression algorithm for TIFF. For example, here's how you can set the image quality for a JPEG image:

PROPBAG2 name = { 0 };
name.dwType = PROPBAG2_TYPE_DATA;
name.vt = VT_R4;
name.pstrName = const_cast<LPOLESTR>(L"ImageQuality"); // pstrName is a non-const LPOLESTR

CComVariant value(0.75F);

HR(properties->Write(
  1, // property count
  &name,
  &value));

The value for image quality must fall between 0.0 for the lowest possible quality and 1.0 for highest possible quality.

With the encoder properties set, you need to call the Initialize method before configuring and writing to the frame:

HR(frame->Initialize(properties));

The next step is to set the dimensions and pixel format for the new frame before you can write pixels to it:

HR(frame->SetSize(width, height));
GUID pixelFormat = GUID_WICPixelFormat24bppBGR;
HR(frame->SetPixelFormat(&pixelFormat));
ASSERT(GUID_WICPixelFormat24bppBGR == pixelFormat);

The SetPixelFormat parameter is an [in, out] parameter. On input it specifies the desired pixel format. On output it contains the closest supported pixel format. This usually isn't a problem unless the format is set at run time, perhaps based on the pixel format of another bitmap.

Writing pixels to the frame is achieved using the WritePixels method, as shown here:

HRESULT WritePixels(
  UINT lineCount,
  UINT stride,
  UINT bufferSize,
  BYTE* buffer);

The lineCount parameter specifies how many lines of pixels are to be written. This implies that you can call WritePixels a number of times to write the complete frame. The stride parameter indicates how the pixels in the buffer are packed into rows. I described how you can calculate the stride in the previous section.
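
If your pixels are stored as tightly packed rows with no padding, one option is to copy them into a buffer whose rows match the stride you intend to pass. The following is a minimal sketch of that repacking step; PadRows is a hypothetical helper, and rowBytes is the number of bytes of actual pixel data in each row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies height rows of rowBytes each into a new buffer whose rows
// are stride bytes long. The padding bytes are set to zero.
std::vector<std::uint8_t> PadRows(const std::vector<std::uint8_t>& packed,
                                  std::size_t rowBytes, std::size_t height,
                                  std::size_t stride) {
  assert(stride >= rowBytes);
  assert(packed.size() == rowBytes * height);
  std::vector<std::uint8_t> padded(stride * height, 0);
  for (std::size_t y = 0; y < height; ++y) {
    std::memcpy(padded.data() + y * stride,
                packed.data() + y * rowBytes, rowBytes);
  }
  return padded;
}
```

The resulting buffer can be handed over in one call or a few rows at a time, as long as each call starts on a row boundary.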

After you've called WritePixels one or more times to write the complete frame, you need to tell the encoder that the frame is ready by calling the frame's Commit method. And once you've committed all the frames that make up the image, you need to tell the encoder that the image is ready to be saved by calling the encoder's Commit method.

So far I've covered the basics of encoding and decoding images. Before I move on, I want to drive it home with a simple example. Figure 2 shows a CopyIconToTiff function that demonstrates how to read the individual bitmaps that make up an icon and copy them to a multiframe TIFF image.

Figure 2 CopyIconToTiff

HRESULT CopyIconToTiff(
   IStream* sourceStream,
   IStream* targetStream) {

  // Prepare the ICO decoder

  CComPtr<IWICBitmapDecoder> decoder;
  HR(decoder.CoCreateInstance(CLSID_WICIcoDecoder));

  HR(decoder->Initialize(
    sourceStream,
    WICDecodeMetadataCacheOnDemand));

  // Prepare the TIFF encoder

  CComPtr<IWICBitmapEncoder> encoder;
  HR(encoder.CoCreateInstance(CLSID_WICTiffEncoder));

  HR(encoder->Initialize(
    targetStream,
    WICBitmapEncoderNoCache));

  UINT frameCount = 0;
  HR(decoder->GetFrameCount(&frameCount));

  for (UINT index = 0; index < frameCount; ++index) {
    // Get the source frame info

    CComPtr<IWICBitmapFrameDecode> sourceFrame;

    HR(decoder->GetFrame(index, &sourceFrame));

    UINT width = 0;
    UINT height = 0;
    HR(sourceFrame->GetSize(&width, &height));

    GUID pixelFormat = { 0 };
    HR(sourceFrame->GetPixelFormat(&pixelFormat));

    // Prepare the target frame

    CComPtr<IWICBitmapFrameEncode> targetFrame;

    HR(encoder->CreateNewFrame(
      &targetFrame,
      0)); // no properties

    HR(targetFrame->Initialize(0)); // no properties

    HR(targetFrame->SetSize(width, height));
    HR(targetFrame->SetPixelFormat(&pixelFormat));

    // Copy the pixels and commit frame

    HR(targetFrame->WriteSource(sourceFrame, 0));
    HR(targetFrame->Commit());
  }

  // Commit image to stream

  HR(encoder->Commit());

  return S_OK;
}

In this example, I've simplified things further by taking advantage of an alternative to the WritePixels method. Instead of first copying the pixels from the source frame and then writing them to the target frame, I'm using the WriteSource method that reads the pixels directly from a given IWICBitmapSource interface. Since the IWICBitmapFrameDecode interface derives from IWICBitmapSource, this provides an elegant solution.

The WIC Imaging Factory

WIC provides an imaging factory for creating various WIC-related objects. It is exposed through the IWICImagingFactory interface and can be created as follows:

CComPtr<IWICImagingFactory> factory;
HR(factory.CoCreateInstance(CLSID_WICImagingFactory));

I've already shown how you can create a decoder for a specific image format given a CLSID identifying an implementation. Of course, as you probably can tell, it would be much more useful if you didn't have to specify a particular implementation or even hardcode the format of the image.

Fortunately, WIC provides the solution. Before creating a decoder, WIC can examine a given stream for patterns that might identify the image format. Once a best match is found, it will create the appropriate decoder and initialize it with the same stream. This functionality is provided by the CreateDecoderFromStream method:

CComPtr<IWICBitmapDecoder> decoder;

HR(factory->CreateDecoderFromStream(
  stream,
  0, // vendor
  WICDecodeMetadataCacheOnDemand,
  &decoder));

The second parameter identifies the vendor of the decoder. It is optional but may be useful if you prefer a particular vendor's codec. Keep in mind that it is only a hint, and if the particular vendor does not have a suitable decoder installed, then a decoder will still be chosen regardless of vendor.
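
To give a feel for what matching patterns in a stream means, here is a simplified sketch that recognizes a few formats by their well-known leading bytes. It illustrates the idea only and is not how WIC itself is implemented; GuessContainerFormat and the Signature structure are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Returns a format name based on the first bytes of the data,
// or "unknown" if no signature matches.
std::string GuessContainerFormat(const unsigned char* data, std::size_t size) {
  assert(data != nullptr || size == 0);
  struct Signature {
    const char* name;
    const char* bytes;
    std::size_t length;
  };
  static const Signature signatures[] = {
      {"PNG", "\x89PNG\r\n\x1a\n", 8},
      {"JPEG", "\xFF\xD8\xFF", 3},
      {"GIF", "GIF8", 4},
      {"BMP", "BM", 2},
      {"TIFF", "II*\0", 4},  // little-endian TIFF
      {"TIFF", "MM\0*", 4},  // big-endian TIFF
  };
  for (const Signature& s : signatures) {
    if (size >= s.length && std::memcmp(data, s.bytes, s.length) == 0) {
      return s.name;
    }
  }
  return "unknown";
}
```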

IWICImagingFactory also provides the CreateDecoderFromFilename and CreateDecoderFromFileHandle methods that provide the same functionality given a path to a file and a file handle, respectively. Alternatively, you can create a decoder without specifying a CLSID or a stream but instead indicating the image format. The CreateDecoder method does just that:

CComPtr<IWICBitmapDecoder> decoder;

HR(factory->CreateDecoder(
  GUID_ContainerFormatIco,
  0, // vendor
  &decoder));

HR(decoder->Initialize(
  stream,
  WICDecodeMetadataCacheOnDemand));

Similarly, the CreateEncoder method allows you to create an encoder for a particular image format without regard for its implementation, like so:

CComPtr<IWICBitmapEncoder> encoder;

HR(factory->CreateEncoder(
  GUID_ContainerFormatBmp,
  0, // vendor
  &encoder));

HR(encoder->Initialize(
  stream,
  WICBitmapEncoderNoCache));

Figure 3 lists the GUIDs identifying the implementation-independent image formats, also known as container formats.

Figure 3 Container Format GUIDs

Format GUID
BMP GUID_ContainerFormatBmp
PNG GUID_ContainerFormatPng
ICO GUID_ContainerFormatIco
JPEG GUID_ContainerFormatJpeg
GIF GUID_ContainerFormatGif
TIFF GUID_ContainerFormatTiff
HDP GUID_ContainerFormatWmp

Working with Streams

You are free to provide any valid IStream implementation for use with WIC. You can, for example, use the CreateStreamOnHGlobal or SHCreateStreamOnFile functions that I described in my XmlLite article, or even write your own implementation. WIC also provides a flexible IStream implementation that may come in handy.

Using the imaging factory I introduced in the previous section, you can create an uninitialized stream object as follows:

CComPtr<IWICStream> stream;
HR(factory->CreateStream(&stream));

The IWICStream interface inherits from IStream and provides a few methods for associating the stream with different backing storage. For example, you can use InitializeFromFilename to create a stream backed by a particular file:

HR(stream->InitializeFromFilename(
  L"file path",
  GENERIC_READ));

You can also use InitializeFromIStreamRegion to create a stream as a subset of another stream or use InitializeFromMemory to create a stream over a block of memory.

WIC via WPF

As I've mentioned, WIC provides the framework on which the WPF imaging functionality is based. The various imaging classes are defined in the System.Windows.Media.Imaging namespace. To show you just how easy it is to use from managed code, Figure 4 shows the CopyIconToTiff function from Figure 2 rewritten in C# using the WPF wrapper classes.

Figure 4 CopyIconToTiff Rewritten in C# and WPF

static void CopyIconToTiff(Stream sourceStream,
                           Stream targetStream) {

  IconBitmapDecoder decoder = new IconBitmapDecoder(
    sourceStream,
    BitmapCreateOptions.None,
    BitmapCacheOption.OnDemand);

  TiffBitmapEncoder encoder = new TiffBitmapEncoder();

  foreach (BitmapFrame frame in decoder.Frames) {
    encoder.Frames.Add(frame);
  }

  encoder.Save(targetStream);
}

The BitmapCacheOption.OnDemand value corresponds to the WICDecodeMetadataCacheOnDemand decoder option used in native code. And in a similar way, the alternative BitmapCacheOption.OnLoad value corresponds to the WICDecodeMetadataCacheOnLoad decoder option.

I've already described how these options influence when the decoder reads the image information into memory. There is, however, an additional side effect that you should be aware of when dealing with these options in managed code. Consider what happens when you specify BitmapCacheOption.OnDemand. The decoder will hold onto a reference to the underlying stream and may read from it at some point after the bitmap decoder object has been created. This assumes that the stream is still available. You need to be careful that your application doesn't close the stream prematurely. It's a matter of managing the lifetime of the stream so that it is not closed before the decoder is finished with it.

This doesn't affect native code because the IStream interface is a standard COM interface whose lifetime is controlled by reference counting. Your application may have released all of its references to it, but the decoder will hold onto one as long as is necessary. What's more, the stream is only closed after all interface pointers have been released.
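
The following sketch models that behavior with a toy reference-counted object. It is not a real IStream implementation; CountedStream and Reader are hypothetical classes, with Reader standing in for a decoder that keeps its own reference to the stream.

```cpp
#include <cassert>

// Toy reference-counted object that "closes" when the count hits zero.
class CountedStream {
 public:
  explicit CountedStream(bool* closedFlag) : refs_(1), closed_(closedFlag) {
    assert(closedFlag != nullptr);
  }
  unsigned long AddRef() { return ++refs_; }
  unsigned long Release() {
    const unsigned long remaining = --refs_;
    if (remaining == 0) delete this;
    return remaining;
  }

 private:
  ~CountedStream() { *closed_ = true; }  // runs only on the final Release
  unsigned long refs_;
  bool* closed_;
};

// Stands in for a decoder: holds a reference for as long as it lives.
class Reader {
 public:
  explicit Reader(CountedStream* stream) : stream_(stream) { stream_->AddRef(); }
  ~Reader() { stream_->Release(); }
  Reader(const Reader&) = delete;
  Reader& operator=(const Reader&) = delete;

 private:
  CountedStream* stream_;
};
```

If the application releases its own reference while a Reader still exists, the stream stays open until the Reader is destroyed.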

What's Next?

WIC provides an incredibly powerful and flexible framework on which to base your imaging needs. With a generous set of codecs and a simple API, you can start taking advantage of many of its features in no time.

In my next column I'm going to explore some of the more advanced features offered by WIC. I'll show you how you can develop your own codecs and illustrate the registration and discovery process, including the pattern-matching facilities, in detail. While I'm at it, I'll also correct a limitation in one of the built-in codecs.

Send your questions and comments for Kenny to mmwincpp@microsoft.com.

Kenny Kerr is a software craftsman specializing in software development for Windows. He has a passion for writing and teaching developers about programming and software design. Reach Kenny at weblogs.asp.net/kennykerr.