Background of Non-PCM Support

Several issues prevented earlier versions of Microsoft Windows from supporting non-PCM formats through the waveOut and DirectSound APIs. The sections below describe these issues and how they were resolved.

waveOut API

The software layer that separates waveOut applications from VxD wave drivers is fairly thin. Drivers and applications that support a custom wave format can stream data in that format regardless of whether the operating system understands the format.

However, in Windows 2000 and Windows 98, the WDM audio framework forces all the audio data that is processed by the waveOut API (and DirectShow's waveOut renderer) to pass through the KMixer system driver (Kmixer.sys), which is the kernel audio mixer. A waveOutOpen call succeeds only if KMixer supports the format, regardless of whether the driver supports the format.

KMixer handles WAVE_FORMAT_PCM on all WDM operating systems. Windows 2000 and later, and Windows 98 SE, extend KMixer to support WAVE_FORMAT_IEEE_FLOAT as well as the WAVEFORMATEXTENSIBLE variants of the PCM and IEEE-float formats. KMixer supports no other formats, so an attempt to pass non-PCM data (for example, AC-3) through KMixer fails.
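As a rough illustration of the rule above, the following C sketch models which formats KMixer accepts on Windows 2000 and later and on Windows 98 SE. The format-tag values match those in mmreg.h, but the wave_desc structure and the kmixer_accepts function are hypothetical names invented for this example. In particular, sub_tag is a simplified stand-in for the SubFormat GUID of a WAVEFORMATEXTENSIBLE structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Format tags with the values defined in mmreg.h. */
#define WAVE_FORMAT_PCM        0x0001
#define WAVE_FORMAT_IEEE_FLOAT 0x0003
#define WAVE_FORMAT_EXTENSIBLE 0xFFFE

/* Hypothetical, simplified descriptor. For WAVE_FORMAT_EXTENSIBLE,
   sub_tag stands in for the SubFormat GUID and holds the equivalent
   format tag; for other tags it is ignored. */
typedef struct {
    uint16_t format_tag;
    uint16_t sub_tag;
} wave_desc;

/* Hypothetical helper: returns true if the format is one that the text
   says KMixer supports (PCM or IEEE float, in plain or extensible form). */
bool kmixer_accepts(const wave_desc *fmt)
{
    assert(fmt != NULL);

    uint16_t tag = fmt->format_tag;
    if (tag == WAVE_FORMAT_EXTENSIBLE) {
        /* Look through the extensible wrapper at the underlying format. */
        tag = fmt->sub_tag;
    }
    return tag == WAVE_FORMAT_PCM || tag == WAVE_FORMAT_IEEE_FLOAT;
}
```

Under this model, a descriptor tagged with any other value, such as an AC-3 tag, is rejected, which corresponds to the waveOutOpen failure described above.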

Windows supports non-PCM formats by allowing non-PCM audio data to bypass KMixer. Specifically, waveOut non-PCM data flows directly to PortCls (or USBAudio) instead of first passing through KMixer. Any mixing of non-PCM data must be done in hardware. However, applications that use non-PCM data in a format such as AC-3 or WMA Pro typically do not require mixing, and drivers typically do not support hardware mixing in that format.
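To make the non-PCM case concrete, the sketch below fills in a format descriptor of the kind an application might describe for AC-3 data. It is an illustration only: wave_format_ex is a local mirror of the WAVEFORMATEX fields so that the code builds without the Windows headers, make_ac3_spdif_format is a hypothetical helper, and the two-channel, 16-bit framing is an assumption about how AC-3 is commonly carried over S/PDIF, not something stated in this text.

```c
#include <assert.h>
#include <stdint.h>

#define WAVE_FORMAT_DOLBY_AC3_SPDIF 0x0092 /* value from mmreg.h */

/* Local mirror of the WAVEFORMATEX fields (not the real Windows type). */
typedef struct {
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
    uint16_t cbSize;
} wave_format_ex;

/* Hypothetical helper: describes AC-3 data framed as a 2-channel,
   16-bit stream (an assumed, commonly used S/PDIF framing). */
wave_format_ex make_ac3_spdif_format(uint32_t sample_rate)
{
    assert(sample_rate > 0);

    wave_format_ex wfx;
    wfx.wFormatTag      = WAVE_FORMAT_DOLBY_AC3_SPDIF;
    wfx.nChannels       = 2;
    wfx.wBitsPerSample  = 16;
    wfx.nSamplesPerSec  = sample_rate;
    /* Bytes per frame, then bytes per second, derived from the above. */
    wfx.nBlockAlign     = (uint16_t)(wfx.nChannels * wfx.wBitsPerSample / 8);
    wfx.nAvgBytesPerSec = wfx.nSamplesPerSec * wfx.nBlockAlign;
    wfx.cbSize          = 0; /* no extra format bytes follow */
    return wfx;
}
```

On Windows, an equivalent WAVEFORMATEX would be passed to waveOutOpen. Because the data bypasses KMixer, whether the call succeeds depends on the driver exposing a pin that accepts the format.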

DirectSound API

On legacy waveOut drivers and VxD drivers, DirectSound supports WAVEFORMATEX (but not WAVEFORMATEXTENSIBLE) PCM formats for both primary and secondary buffers, with 8 or 16 bits per sample, one or two channels, and a sampling rate between 100 Hz and 100 kHz. VxD drivers can further limit the formats allowed for primary buffers when the cooperative level is set to DSSCL_WRITEPRIMARY (see the description of the IDirectSoundBuffer::SetFormat method in the DirectX SDK). These limitations have not changed in Windows Me or Windows XP.
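The legacy limits listed above can be expressed as a small validation routine. This is a minimal sketch, not DirectSound's actual implementation: legacy_format and legacy_dsound_accepts are hypothetical names, and treating the 100 Hz and 100 kHz bounds as inclusive is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WAVE_FORMAT_PCM 0x0001 /* value from mmreg.h */

/* Sampling-rate limits taken from the text; inclusive bounds are assumed. */
#define LEGACY_MIN_RATE_HZ 100u
#define LEGACY_MAX_RATE_HZ 100000u

/* Hypothetical descriptor holding only the fields the check needs. */
typedef struct {
    uint16_t format_tag;
    uint16_t channels;
    uint16_t bits_per_sample;
    uint32_t sample_rate_hz;
} legacy_format;

/* Hypothetical helper: returns true if the format falls within the limits
   described for legacy waveOut and VxD drivers. */
bool legacy_dsound_accepts(const legacy_format *fmt)
{
    assert(fmt != NULL);

    if (fmt->format_tag != WAVE_FORMAT_PCM)
        return false; /* WAVEFORMATEX PCM only */
    if (fmt->bits_per_sample != 8 && fmt->bits_per_sample != 16)
        return false; /* 8 or 16 bits per sample */
    if (fmt->channels != 1 && fmt->channels != 2)
        return false; /* mono or stereo */
    return fmt->sample_rate_hz >= LEGACY_MIN_RATE_HZ &&
           fmt->sample_rate_hz <= LEGACY_MAX_RATE_HZ;
}
```

For example, 16-bit stereo PCM at 44.1 kHz passes this check, while a 24-bit or six-channel format does not.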

WDM drivers can support PCM formats in both WAVEFORMATEX and WAVEFORMATEXTENSIBLE form. For Windows 2000 and later, Windows Me, and Windows 98 SE, drivers can also support the WAVE_FORMAT_IEEE_FLOAT format for both primary and secondary DSBCAPS_LOCSOFTWARE buffers (mixed by KMixer) in both WAVEFORMATEX and WAVEFORMATEXTENSIBLE form.

Calling SetFormat to specify the data format of a primary buffer has only an indirect effect on the final mixing format chosen by the sound card. The primary buffer object is used to obtain the IDirectSound3DListener interface and to set the device's global volume and pan, but does not represent the single output stream from the sound card. Instead, KMixer mixes the primary-buffer data in order to allow several DSSCL_WRITEPRIMARY DirectSound clients to run simultaneously.

On Windows 2000 and Windows 98, DirectSound supports only PCM data. (The same is true of DirectShow, which uses DirectSound's renderer.) A call to CreateSoundBuffer with a non-PCM format always fails, even if the driver supports the format. Failure occurs for two reasons. First, whenever DirectSound creates a KS pin, it automatically specifies KSDATAFORMAT_SUBTYPE_PCM instead of deriving the subtype from the wFormatTag member of the WAVEFORMATEX structure that is used to create the IDirectSoundBuffer object. Second, DirectSound requires every data path to have volume and SRC (sample-rate conversion) nodes (KSNODETYPE_VOLUME and KSNODETYPE_SRC), regardless of whether the client requests pan, volume, or frequency controls on the DirectSound buffer. This requirement is met if either the data passes through KMixer or the device performs hardware mixing. For non-PCM formats, however, KMixer is not present in the data path and the drivers themselves typically fail when asked to perform hardware mixing.

Windows XP and later, and Windows Me, remove the limitations that prevented DirectSound applications from using non-PCM formats. DirectSound 8 (and later versions) uses the correct format subtype and no longer automatically requires volume and SRC nodes in every data path.
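The first of the two failure reasons concerns how a KS subtype is chosen for a wave format. The sketch below shows one way a subtype GUID can be derived from wFormatTag, following the pattern that ksmedia.h uses for wave-format subtypes: the tag is placed in the first field of the base GUID {00000000-0000-0010-8000-00AA00389B71}. The ks_guid type and both function names are hypothetical, and the code is not DirectSound's own. It also does not handle WAVE_FORMAT_EXTENSIBLE, which carries its subtype in the SubFormat member.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the Windows GUID layout (not the real Windows type). */
typedef struct {
    uint32_t Data1;
    uint16_t Data2;
    uint16_t Data3;
    uint8_t  Data4[8];
} ks_guid;

/* Base GUID for wave-format subtypes:
   {00000000-0000-0010-8000-00AA00389B71}. */
static const ks_guid SUBTYPE_WAVEFORMATEX_BASE = {
    0x00000000, 0x0000, 0x0010,
    { 0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71 }
};

/* Hypothetical helper: derives the subtype from the format tag rather
   than always returning the PCM subtype. */
ks_guid subtype_from_format_tag(uint16_t format_tag)
{
    ks_guid subtype = SUBTYPE_WAVEFORMATEX_BASE;
    subtype.Data1 = format_tag; /* the tag becomes the first GUID field */
    return subtype;
}

/* Hypothetical helper: field-by-field GUID comparison. */
bool ks_guid_equal(const ks_guid *a, const ks_guid *b)
{
    return a->Data1 == b->Data1 &&
           a->Data2 == b->Data2 &&
           a->Data3 == b->Data3 &&
           memcmp(a->Data4, b->Data4, sizeof a->Data4) == 0;
}
```

With this derivation, tag 0x0001 yields the GUID of KSDATAFORMAT_SUBTYPE_PCM, and any non-PCM tag yields a different subtype. That difference is exactly what is lost when the PCM subtype is specified unconditionally.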