Custom Audio Effects in Windows Vista

June 13, 2012

Abstract

Windows Vista allows third-party audio hardware manufacturers to include custom host-based digital signal processing effects as part of their audio driver's value-added features. These effects are packaged as user-mode System Effect Audio Processing Objects (sAPOs).

There are three insertion points for sAPOs: pre-mix render, post-mix render, and capture. Each logical device’s audio engine supports one instance of a pre-mix render sAPO per stream (render LFX) and one post-mix render sAPO (GFX). The audio engine also supports one instance of a capture sAPO (capture LFX) that is inserted in each capture stream.

This white paper provides guidelines for device driver developers who want to create and install custom sAPOs.

This information applies to the Windows Vista operating system.

Future versions of this preview information will be provided in the Microsoft Windows Driver Kit (WDK).

The current version of this paper is maintained on the Web at:
Custom Audio Effects in Windows Vista

References and resources discussed here are listed at the end of this paper.

Disclaimer: This document is provided “as-is”. Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. You bear the risk of using it.

This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes.

© 2012 Microsoft Corporation. All rights reserved.

Document History

Date / Change
June 13, 2012 / Reworded material in sAPO Initialization paragraph and updated link to the Windows Driver Kit (WDK)
May 10, 2006 / First publication

Contents

Introduction

How Custom sAPOs are Implemented

How Custom sAPOs are Installed

Windows Vista Audio Architecture

Support for Multiple Devices

Audio Sessions

How to Install sAPOs

Registry Settings

INF Files for sAPOs

Enhancements Property Page Replacement

Run-Time Behavior of sAPOs

sAPO Bypass

sAPO Initialization

Endpoint Property Store Settings

sAPO Format Negotiation

Data Formats

The sAPO LockForProcess Method

sAPO Failure Monitoring and Automatic Disabling

Application Control over sAPOs

How to Implement a Custom sAPO

Required Interfaces

The CBaseAudioProcessingObject Class

Signal Processing Requirements

How to Implement a UI for Configuring the Effects

Resources

Introduction

The Microsoft Universal Audio Architecture (UAA) family of class drivers and the associated INF files are included with Windows Vista. These drivers support a set of user-mode System Effect Audio Processing Objects (sAPOs) that use the Windows Vista system effect infrastructure. Windows Vista uses sAPOs to implement several digital signal processing (DSP) audio effects algorithms including:

  • Speaker fill
  • Headphone virtualization
  • Room correction
  • Output encoding for external decoders
  • Automatic gain control
  • Microphone array filters

When users install an audio device driver by using the standard INF file, they automatically have access to the system's sAPOs. Independent hardware vendors (IHVs) and original equipment manufacturers (OEMs) can provide additional custom system effects while still using the Microsoft class drivers. They do so by packaging their DSP algorithms as sAPOs and modifying the standard INF file to insert their sAPOs into the audio engine’s signal processing graph.

This white paper provides guidelines for vendors who want to create and install custom sAPOs.

How Custom sAPOs are Implemented

Custom sAPOs are implemented as in-process COM objects, so they run in user mode and are packaged in a dynamic-link library (DLL). There are three types of sAPO, based on where they are inserted in the signal processing graph. Each logical device can be associated with one sAPO of each type.

  • A local render effect (render LFX) sAPO processes an audio stream from a particular application just before mixing. If multiple applications are involved, each application has one instance of the LFX sAPO per stream.
  • A local capture effect (capture LFX) sAPO processes a capture stream from an input device before the stream reaches a particular application. If multiple applications are involved, each application has one instance of the capture LFX sAPO per stream.
  • A global effect (GFX) sAPO processes the audio stream after mixing. A logical device, such as line out, can have only one instance of the GFX sAPO. If other logical devices such as headphone output or speaker output exist, they get their own instance of the GFX sAPO.
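
The instancing rules in this list can be summarized with a small model. The following Python sketch is illustrative only: the LogicalDevice class, its methods, and the application names are hypothetical and are not part of any Windows API. It simply counts the sAPO instances that the rules imply for render devices.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalDevice:
    """Hypothetical model of one render device and its sAPO instances."""
    name: str
    # Each entry is (application name, stream id) for an open render stream.
    streams: list = field(default_factory=list)

    def open_stream(self, app: str, stream_id: int) -> None:
        # Every new stream gets its own render LFX instance.
        self.streams.append((app, stream_id))

    def lfx_instance_count(self) -> int:
        # One LFX instance per stream, whichever application owns it.
        return len(self.streams)

    def gfx_instance_count(self) -> int:
        # One GFX instance per logical device, after the mix.
        return 1


line_out = LogicalDevice("line out")
line_out.open_stream("player.exe", 0)
line_out.open_stream("player.exe", 1)
line_out.open_stream("game.exe", 0)

headphones = LogicalDevice("headphones")
headphones.open_stream("voip.exe", 0)

for device in (line_out, headphones):
    print(device.name, device.lfx_instance_count(), device.gfx_instance_count())
```

In this model, two applications with three streams on line out imply three LFX instances but still a single GFX instance for that device.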

Note: For convenience, the terminology in this white paper refers mostly to output devices. However, the technology is symmetric and works essentially in reverse for input devices. The primary difference is that there is no way to insert a GFX sAPO into the capture graph.

LFX and GFX sAPOs operate on a single input or output stream. They cannot be used in “full-duplex” mode to process both input and output data. The infrastructure is not designed to accommodate effects such as acoustic echo cancellation (AEC) and microphone array processing. With Windows Vista, these types of processing algorithms are located in the application graph, above the audio subsystem. Windows Vista ships with high-quality implementations of both algorithms in that location.

More details on how to implement custom sAPOs are given later in this paper.

How Custom sAPOs are Installed

Custom sAPOs are installed with the audio device driver and linked to a specific Plug and Play hardware ID. This ensures that a hardware manufacturer can specify the LFX and GFX sAPOs for an audio device and be confident that they will work well together.

  • Each Plug and Play hardware ID can be associated with only one GFX and one LFX sAPO. However, if developers require variable behavior that is based on different stream or device characteristics, an sAPO can contain multiple DSP algorithms internally that can be used exclusively or together.
  • Modern PCs normally have separate logical devices for headphone, line out, Sony/Philips Digital Interface (SPDIF) out, speaker out, and so on. When there are multiple logical devices, each one can have its own LFX and GFX sAPO.
  • As with drivers, custom sAPOs must go through the Windows Logo Program signing process. There are two ways to submit sAPOs to WHQL for signing:
      • An sAPO that is associated with a custom driver should be submitted with the driver package.
      • An sAPO that is linked by an INF to one of the Windows Vista UAA class drivers can be submitted separately.
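
The first bullet in this list notes that a single sAPO can contain several DSP algorithms that are used exclusively or together. A minimal Python sketch of that idea follows. The MultiAlgorithmEffect class and the two toy algorithms are hypothetical stand-ins chosen for illustration; a real sAPO is a COM object and would not look like this.

```python
from typing import Callable, Dict, List

Samples = List[float]


def half_gain(samples: Samples) -> Samples:
    # Toy algorithm: attenuate every sample by half.
    return [s * 0.5 for s in samples]


def invert_phase(samples: Samples) -> Samples:
    # Toy algorithm: flip the sign of every sample.
    return [-s for s in samples]


class MultiAlgorithmEffect:
    """Hypothetical effect that hosts several algorithms in one object."""

    def __init__(self, algorithms: Dict[str, Callable[[Samples], Samples]]):
        self._algorithms = dict(algorithms)
        self._enabled: List[str] = []

    def select(self, *names: str) -> None:
        # Enable one algorithm (exclusive use) or several (chained use).
        unknown = [n for n in names if n not in self._algorithms]
        if unknown:
            raise KeyError(f"unknown algorithms: {unknown}")
        self._enabled = list(names)

    def process(self, samples: Samples) -> Samples:
        out = list(samples)
        for name in self._enabled:
            out = self._algorithms[name](out)
        return out


effect = MultiAlgorithmEffect({"half_gain": half_gain, "invert_phase": invert_phase})
effect.select("half_gain")                  # one algorithm on its own
print(effect.process([1.0, -2.0]))
effect.select("half_gain", "invert_phase")  # both algorithms together
print(effect.process([1.0, -2.0]))
```

The selection could be driven by stream or device characteristics, or by the user through the configuration UI; the sketch leaves that decision to the caller.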

An sAPO commonly provides a user interface (UI) that allows a user to configure the effects. This UI can, for example, allow the user to select from several different signal processing algorithms. Microsoft provides a configuration UI for the standard Windows Vista sAPOs. If a custom sAPO has user-accessible settings, the manufacturer must provide an appropriate configuration UI. The configuration UI is installed with the device driver and is associated with the sAPO by registration.

Note: The Microsoft-supplied Enhancements property page is associated with the Microsoft home theater sAPOs and cannot be modified. Manufacturers can, however, replace this property page with a custom property page that is designed to support their sAPOs. Manufacturers that replace the native Windows Vista sAPOs must either mirror the functionality of the native features or wrap those features into their own offering, so that Windows Vista users do not lose any features.

More details on how to install custom sAPOs are given later in this paper.

Windows Vista Audio Architecture

Figure 1 shows how the sAPOs for a logical device are incorporated into the Windows Vista audio architecture. The audio engine runs in user mode and supports a signal processing graph that processes the audio stream. An sAPO is essentially a plug-in to the signal processing graph. Some notes on the figure include:

  • There is one LFX sAPO for each logical device, but each application that is streaming to the audio engine has its own instance of that sAPO.
  • The GFX sAPO is inserted in the engine’s device graph. Each logical device has only a single instance of the audio engine and GFX sAPO. The GFX sAPO passes the audio stream to the PortCls audio adapter and ultimately to the output device.
  • An sAPO runs as LocalService and cannot access any resources that do not have the appropriate access control list (ACL) setting.
  • An sAPO runs with limited privileges. The exact set depends on which services are running in the audio service's svchost instance. However, sAPOs should not be performing any operations that require privileges.
  • An sAPO should not access the network.

Figure 1. Basic Windows Vista audio architecture

Support for Multiple Devices

Although each logical device can have only one audio engine, one LFX sAPO, and one GFX sAPO, the system can support multiple logical devices. Each logical device has its own audio engine and its own LFX and GFX sAPOs. Figure 2 shows a system with three logical devices.

Figure 2. Windows Vista audio architecture with multiple devices

Audio Sessions

An audio session is a group of related audio streams that a client can manage collectively. Each session represents a subset of the streams that form the global mix that plays through a particular audio device such as headphones. The global mix combines all of the sessions from all of the applications. Clients control the volume level and mute state of each individual session, and the system applies these settings uniformly to all of the streams in the session.

Typically, a session consists of one or more streams from a single process. However, applications can define cross-process sessions that combine streams from two or more processes. Figure 3 shows how sAPOs are integrated into audio sessions. Note that:

  • Each audio stream has its own instance of the LFX sAPO.
  • Streams from different processes can belong to the same session.
  • Multiple streams from the same process can belong to the same or different sessions.
  • The streams for all the device’s sessions pass through a single GFX.

Figure 3. Audio session architecture
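
The relationship between streams, sessions, and the two effect stages can be sketched as a toy render pass. This Python example is an assumption-laden illustration, not the audio engine's implementation: the Session class, the render function, and the placeholder lfx and gfx functions are hypothetical, and the point at which session volume is applied relative to the LFX stage is assumed for simplicity.

```python
from dataclasses import dataclass, field
from typing import List

Samples = List[float]


def lfx(samples: Samples) -> Samples:
    # Placeholder per-stream effect (pass-through in this sketch).
    return list(samples)


def gfx(samples: Samples) -> Samples:
    # Placeholder post-mix effect (pass-through in this sketch).
    return list(samples)


@dataclass
class Session:
    """Hypothetical audio session: shared volume and mute for its streams."""
    volume: float = 1.0
    muted: bool = False
    streams: List[Samples] = field(default_factory=list)


def render(sessions: List[Session], frames: int) -> Samples:
    mix = [0.0] * frames
    for session in sessions:
        # Session settings apply uniformly to every stream in the session.
        gain = 0.0 if session.muted else session.volume
        for stream in session.streams:
            processed = lfx(stream)  # one LFX pass per stream
            for i in range(frames):
                mix[i] += gain * processed[i]
    return gfx(mix)                  # a single GFX pass over the global mix


music = Session(volume=0.5, streams=[[1.0, 1.0], [0.5, 0.5]])
alerts = Session(muted=True, streams=[[1.0, 1.0]])
print(render([music, alerts], frames=2))
```

The muted session contributes nothing to the mix, while both streams of the other session are scaled by the same session volume before the single post-mix stage runs.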

For further information on audio sessions and the core audio API, see the white paper titled Device Finish-Install Actions in Windows Vista.

How to Install sAPOs

Custom sAPOs are installed with the associated device driver by using an INF file. The driver package includes the LFX and GFX sAPOs and any associated configuration UI. The device installation program or a setup program copies the sAPOs and configuration UI to the system and registers them.

A GFX or LFX sAPO is identified by its COM class ID (CLSID). The INF can associate each physical output device’s PnP ID with only one GFX and one LFX. The association is specified with registry directives in the INF file. The audio engine accesses this registry information through the IPropertyStore interfaces on Multimedia Device API (MMDevAPI) objects.

Because an audio adapter can support multiple audio inputs and outputs, an sAPO might not be compatible with all input types. If so, the sAPO must be explicitly associated with the compatible kernel-streaming (KS) node types.

Specifying the compatible KS node types may still not be sufficient for multiplexed capture devices such as microphone arrays, which can have multiple inputs with the same KS node type. In that case, the sAPO must determine whether it can operate on the current endpoint by querying the audio driver or endpoint property store for additional information. If the system effect cannot operate on the endpoint, the sAPO should behave like a copy or pass-through APO. This means that the sAPO should bypass its internal DSP algorithms and not fail initialization.
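
The pass-through requirement can be illustrated with a short Python sketch. Everything here is hypothetical: the EndpointAwareEffect class, the "channels" key, and the dictionary that stands in for information queried from the audio driver or endpoint property store. The point is only that initialization succeeds on an unsupported endpoint and that processing then copies input to output.

```python
from typing import Dict, List

Samples = List[float]


class EndpointAwareEffect:
    """Hypothetical effect that falls back to pass-through when unsupported."""

    def __init__(self, required_channels: int):
        self._required_channels = required_channels
        self.bypass = False

    def initialize(self, endpoint_info: Dict[str, int]) -> bool:
        # endpoint_info stands in for data queried from the driver or the
        # endpoint property store; the "channels" key is an assumption.
        channels = endpoint_info.get("channels")
        self.bypass = channels != self._required_channels
        # Initialization does not fail on an unsupported endpoint.
        return True

    def process(self, samples: Samples) -> Samples:
        if self.bypass:
            return list(samples)            # behave like a copy APO
        return [s * 0.5 for s in samples]   # toy DSP for a supported endpoint


effect = EndpointAwareEffect(required_channels=2)
print(effect.initialize({"channels": 4}), effect.process([1.0, -1.0]))
print(effect.initialize({"channels": 2}), effect.process([1.0, -1.0]))
```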

Note: The audio engine does not normally load an unsigned sAPO. However, signing is not done until the development process is complete and the driver has been submitted to WHQL. For development and test purposes, the signing requirement can be bypassed by setting the DisableProtectedAudioDG registry value to 1, as shown in this example:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Audio

"DisableProtectedAudioDG"=dword:00000001

Registry Settings

The association between LFX or GFX sAPOs and the related device is stored under the registry key for the device interface. The DLL that contains the sAPOs must be self-registering, and the INF file must register it by including a RegisterDlls directive.

The HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\DeviceClasses key has a subkey for each device interface that is named with the interface’s CLSID string. Under each CLSID key is a subkey for each device that exports the interface. For example, ##?#PCI#VEN_1106&DEV_3059&SUBSYS_810A1043&REV_60#3&61AAA01&1&8D#{6994ad04-93ef-11d0-a3cc-00a0c9223196} is the subkey of {6994AD04-93EF-11D0-A3CC-00A0C9223196} that represents the device for the VIA onboard AC'97 audio controller with a RealTek AC'97 codec.

Each device key has subkeys for all the interfaces that it exposes. For example, the device subkey under discussion has subkeys for wave, topology, and Universal Asynchronous Receiver Transmitter (UART) interfaces that are named #Wave, #Topology, and #UART, respectively. The following example shows these settings schematically. Note that the device subkey string has been truncated for readability.

HKEY_LOCAL_MACHINE
  SYSTEM
    CurrentControlSet
      Control
        DeviceClasses
          {6994AD04-93EF-11D0-A3CC-00A0C9223196}
            ##?#PCI#VEN_1106&DEV_3059&SUBSYS_810A1043&REV_60…
              #Topology
              #UART
              #Wave

To register a GFX or LFX sAPO, add a DeviceParameters subkey to the appropriate topology subkey, followed by an FX subkey. The FX key can have one or more system effect subkeys, one for each effect's sAPO. The system effect subkey names must be integers, starting with zero.

The data for the property store is contained in a series of values that are associated with the system effect key. The value names are globally unique identifier (GUID) strings, followed by an ID, much like property store key names. The property names and the associated data are listed in the following table.

Property name / ID / Type / Data
SysFxAssociation / 0 / Multistring / KSNODE_TYPE GUIDs that associate the sAPO with the connection types that the sAPO supports. Use a NULL GUID to associate the sAPO with any connection type. If the connection type does not match the GUID, the system effects are not added.
PreMixEffect / 1 / String / The CLSID that is associated with the data and code that are used to create the LFX sAPO.
PostMixEffect / 2 / String / The CLSID that is associated with the data and code that are used to create the GFX sAPO.
UserInterface / 3 / String / The CLSID that is associated with the data and code that are used to create the configuration UI for the system effects package.
FriendlyName / 4 / String / A friendly name for the system effects package.
Additional entries as appropriate / >4 / User-defined / Initialization data for the sAPO.
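
The matching rule described for SysFxAssociation can be sketched as follows. This Python example is a simplified illustration under stated assumptions: the effects_apply function is hypothetical, and the two non-NULL GUID strings are made-up placeholders, not real KSNODETYPE values.

```python
from typing import Iterable

NULL_GUID = "{00000000-0000-0000-0000-000000000000}"

# Placeholder GUID strings for illustration only; these are NOT real
# KSNODETYPE values.
FAKE_NODE_TYPE_A = "{11111111-1111-1111-1111-111111111111}"
FAKE_NODE_TYPE_B = "{22222222-2222-2222-2222-222222222222}"


def effects_apply(association: Iterable[str], connection_type: str) -> bool:
    """Return True if the system effects would be added for this connection."""
    wanted = {guid.upper() for guid in association}
    # A NULL GUID associates the sAPO with any connection type.
    if NULL_GUID in wanted:
        return True
    # Otherwise the connection type must match one of the listed GUIDs.
    return connection_type.upper() in wanted


print(effects_apply([NULL_GUID], FAKE_NODE_TYPE_A))
print(effects_apply([FAKE_NODE_TYPE_A], FAKE_NODE_TYPE_A))
print(effects_apply([FAKE_NODE_TYPE_A], FAKE_NODE_TYPE_B))
```

The comparison is made case-insensitive here because GUID strings appear in both upper and lower case in this paper; whether the system compares them that way is an assumption of the sketch.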

The value name for each property has the form “{GUID},ID”, where GUID is the standard property key GUID string, {D04E05A6-594B-4fb6-A80D-01AF5EED7D1D}. It is defined in AudioEngineBaseAPO.idl and is the same for all properties. For example, the value for the SysFxAssociation property is named “{D04E05A6-594B-4fb6-A80D-01AF5EED7D1D},0”.
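
As a quick check of the naming scheme, the following Python sketch builds the registry value name for each property in the table. The FxProperty enumeration and the value_name helper are hypothetical names introduced for illustration; the GUID and IDs are the ones given in this paper.

```python
from enum import IntEnum

# Property key GUID shared by all system effect properties (from this paper).
FX_PROPERTY_GUID = "{D04E05A6-594B-4fb6-A80D-01AF5EED7D1D}"


class FxProperty(IntEnum):
    """Property IDs as listed in the table above."""
    SysFxAssociation = 0
    PreMixEffect = 1
    PostMixEffect = 2
    UserInterface = 3
    FriendlyName = 4


def value_name(prop: FxProperty) -> str:
    # Value names have the form "{GUID},ID".
    return f"{FX_PROPERTY_GUID},{int(prop)}"


for prop in FxProperty:
    print(prop.name, value_name(prop))
```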

Figure 4 is a screenshot of RegEdit that shows a typical example of how system effect keys and values appear in the registry. The device has one system effects package that is named Microsoft Audio Home Theater Effects. It works with any connection type, and registers an LFX sAPO, a GFX sAPO, and a configuration UI.

Figure 4. Registry settings for Microsoft Audio Home Theater Effects

INF Files for sAPOs

This section shows examples from an INF that is used to install an sAPO. For readability, it is useful to define friendly names for the property store value names. For example:

; PropertyKeys

PKEY_FX_Association = "{D04E05A6-594B-4fb6-A80D-01AF5EED7D1D},0"

PKEY_FX_PreMixClsid = "{D04E05A6-594B-4fb6-A80D-01AF5EED7D1D},1"

PKEY_FX_PostMixClsid = "{D04E05A6-594B-4fb6-A80D-01AF5EED7D1D},2"

PKEY_FX_UiClsid = "{D04E05A6-594B-4fb6-A80D-01AF5EED7D1D},3"

PKEY_FriendlyName = "{D04E05A6-594B-4fb6-A80D-01AF5EED7D1D},4"

The following example shows the INF text to add the keys and property values to the registry:

HKR,"FX\\0",%PKEY_FriendlyName%,,%FX_FriendlyName%

HKR,"FX\\0",%PKEY_FX_PreMixClsid%,,%FX_PREMIX_CLSID%

HKR,"FX\\0",%PKEY_FX_PostMixClsid%,,%FX_POSTMIX_CLSID%

HKR,"FX\\0",%PKEY_FX_UiClsid%,,%FX_UI_CLSID%

HKR,"FX\\0",%PKEY_FX_Association%,,%KSNODETYPE_ANY%

The final element of each line is a friendly name for the associated property store data. The exact values are defined elsewhere in the INF file, but not shown here.

It is possible to register multiple sAPOs by creating additional subkeys under FX and adding the sAPO's property store data to these keys. However, the names of the subkeys that contain the property store data must be sequential integers, starting with zero. The above example registers a single sAPO, so the property store data goes under FX\0. The code to register a second sAPO would look similar, but the data would go under FX\1, the third under FX\2, and so on.
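
The sequential numbering rule can be made concrete by generating the registry lines for several packages. The Python sketch below is illustrative only: the addreg_lines helper is hypothetical, and the FX2_* string tokens for the second package are invented names that an INF author would have to define. The line format and the first package's tokens follow the example earlier in this section.

```python
from typing import List, Sequence

# Property tokens in the order used by the INF example in this section.
PROPERTY_TOKENS = [
    "PKEY_FriendlyName",
    "PKEY_FX_PreMixClsid",
    "PKEY_FX_PostMixClsid",
    "PKEY_FX_UiClsid",
    "PKEY_FX_Association",
]


def addreg_lines(packages: Sequence[Sequence[str]]) -> List[str]:
    """Emit registry lines, one FX subkey per package, numbered from zero."""
    lines = []
    for index, value_tokens in enumerate(packages):
        for prop_token, value_token in zip(PROPERTY_TOKENS, value_tokens):
            # The doubled backslash matches the form shown in the INF example.
            lines.append(f'HKR,"FX\\\\{index}",%{prop_token}%,,%{value_token}%')
    return lines


first = ["FX_FriendlyName", "FX_PREMIX_CLSID", "FX_POSTMIX_CLSID",
         "FX_UI_CLSID", "KSNODETYPE_ANY"]
# Hypothetical tokens for a second package.
second = ["FX2_FriendlyName", "FX2_PREMIX_CLSID", "FX2_POSTMIX_CLSID",
          "FX2_UI_CLSID", "KSNODETYPE_ANY"]

for line in addreg_lines([first, second]):
    print(line)
```

Because the subkey index comes from the position in the list, the generated names are always sequential integers that start with zero.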