Understanding Unified Messaging Audio Codecs

Microsoft Exchange Server 2007 will reach end of support on April 11, 2017. To stay supported, you will need to upgrade. For more information, see Resources to help you upgrade your Office 2007 servers and clients.

 

Applies to: Exchange Server 2007, Exchange Server 2007 SP1, Exchange Server 2007 SP2, Exchange Server 2007 SP3

In Microsoft Exchange Server 2007 Unified Messaging (UM), a codec is used to store voice mail messages. Another codec is used between an IP gateway or IP Private Branch eXchange (PBX) and a server that is running Exchange 2007 that has the Unified Messaging server role installed. Exchange 2007 Unified Messaging can use any of the following three audio codecs to create and store voice messages:

  • Windows Media Audio (WMA)

  • Global System for Mobile Communications (GSM) 06.10

  • G.711 Pulse Code Modulation (PCM) Linear

However, the G.711 (PCMA and PCMU) and G.723.1 codecs are VoIP codecs that are used between an IP gateway and the Unified Messaging server.

Part of planning your Unified Messaging system involves selecting the correct audio codec based on the needs and requirements of your organization. This topic discusses the audio codecs that Exchange 2007 Unified Messaging can use and will help you plan your UM deployment.

Important

On 64-bit Unified Messaging servers, you must install the Windows Media Encoder if you plan to use the WMA UM dial plan codec. For more information about how to install the Windows Media Encoder, see Availability of the Windows Media Audio 9 Voice codec for x64-based computers or visit the Microsoft Download Center.

Codecs

Two types of codecs are used in Exchange 2007 Unified Messaging: the codec that is used between IP gateways and the Unified Messaging server or between a PBX and IP gateway, depending on the type of PBX, and the codec that is used to encode and store voice messages for users.

The term "codec" is a combination of the words "coding" and "decoding" and is used with digital audio data. A codec is a software program that transforms digital data into an audio file format or audio streaming format. Codecs are used to convert an analog voice signal to a digital version of the voice signal. Codecs can vary in their sound quality, the bandwidth that is required to use them, and the system requirements that are needed to do the encoding.

When you use an ordinary telephone over the Public Switched Telephone Network (PSTN), your voice is transported in an analog format over the telephone line. With Voice over IP (VoIP), however, your voice must be converted into digital signals. This conversion process is known as encoding, and it is performed by a codec. After the digitized voice has reached its destination, it must be decoded back to its original analog format so that the person on the other end of the call can hear and understand the caller.

VoIP Codec

In Exchange 2007 Unified Messaging, three types of codecs can be used between IP gateways and the Unified Messaging server or between a PBX and IP gateway, depending on the type of PBX. Unified Messaging servers can accept the following VoIP codecs from an IP gateway or IP PBX:

  • G.711 µ-law

  • G.711 A-law

  • G.723.1

G.711 is a standard that was developed for use with audio codecs. The standard defines two main algorithms: the µ-law algorithm, which is used in North America and Japan, and the A-law algorithm, which is used in Europe and other countries. The G.723.1 audio codec is mostly used in VoIP applications and requires a license to be used. G.723.1 is a high quality, high compression type of codec.
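The idea behind the µ-law algorithm can be illustrated with the continuous companding formula, which compresses loud samples more than quiet ones before quantization. The following Python code is a minimal sketch of that formula only. The function names are hypothetical, and a real G.711 encoder uses a segmented 8-bit approximation of this curve rather than the floating-point calculation shown here.

```python
import math

MU = 255  # compression parameter of the mu-law variant used in North America and Japan


def mu_law_compress(x: float) -> float:
    """Map a normalized sample in [-1, 1] to a companded value in [-1, 1]."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("sample must be normalized to [-1, 1]")
    # sign(x) * ln(1 + mu*|x|) / ln(1 + mu)
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress: recover the normalized sample."""
    if not -1.0 <= y <= 1.0:
        raise ValueError("companded value must be in [-1, 1]")
    # sign(y) * ((1 + mu)^|y| - 1) / mu
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)
```

Quiet samples receive a disproportionately large share of the output range, which is why an 8-bit companded sample can represent speech acceptably.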

Both a Unified Messaging server and a supported IP gateway or IP PBX can offer both the G.711 and G.723.1 codecs. However, the Unified Messaging server chooses its preferred codec based on the WireCodecList key in the registry. By default, the first codec to be used is G.723.1. If you want to use a codec other than G.723.1 between the Unified Messaging server and the IP gateway or IP PBX, we recommend that you change the configuration on the IP gateway or IP PBX and that you do not add, remove, or change any values for the WireCodecList key in the registry. The Unified Messaging server determines the codec that is being used by the IP gateway or IP PBX and selects the appropriate codec from the list in the registry.
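The preference-based selection described above can be sketched as picking the first entry in an ordered list that the other side also offers. The following Python code is an illustration only. The function name and the codec labels are hypothetical, and they are not the actual values stored in the WireCodecList key.

```python
from typing import Iterable, Optional, Sequence


def choose_codec(preference_order: Sequence[str],
                 offered: Iterable[str]) -> Optional[str]:
    """Return the first codec in preference_order that the peer offers."""
    offered_set = set(offered)
    for codec in preference_order:
        if codec in offered_set:
            return codec
    return None  # no codec in common


# Hypothetical labels; G.723.1 is listed first to mirror the default described above.
server_preference = ["G.723.1", "PCMU", "PCMA"]

# A gateway that is configured to offer only G.711 causes a G.711 variant to be selected.
selected = choose_codec(server_preference, ["PCMA", "PCMU"])
```

This is why changing what the IP gateway or IP PBX offers is enough to change the codec in use, without editing the list on the server.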

The following table summarizes some common VoIP codecs.

VoIP codecs

VoIP codec | Bandwidth (Kbps) | Description
---------- | ---------------- | -----------
G.711      | 64               | This codec requires very little processing. It needs a minimum of 128 kilobits per second (Kbps) for two-way communication.
G.723.1    | 5.3/6.3          | This codec offers high compression and good audio quality for its bit rate. It requires more processing than the G.711 codec. It uses much less bandwidth than G.711, but its audio quality is poorer.
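The bandwidth figures in the table can be turned into a rough capacity estimate. The following Python code is a minimal sketch that doubles the one-way codec rate for two-way communication and multiplies it by the number of concurrent calls. The function name and dictionary keys are hypothetical, and the result covers only the codec payload. Packet header overhead on the network is not included.

```python
# One-way codec bit rates in Kbps, taken from the table above.
CODEC_KBPS = {
    "G.711": 64.0,
    "G.723.1 (low rate)": 5.3,
    "G.723.1 (high rate)": 6.3,
}


def two_way_payload_kbps(codec: str, concurrent_calls: int = 1) -> float:
    """Codec payload bandwidth for both directions of the given number of calls."""
    if concurrent_calls < 0:
        raise ValueError("concurrent_calls must not be negative")
    return CODEC_KBPS[codec] * 2 * concurrent_calls


g711_single_call = two_way_payload_kbps("G.711")  # 64 Kbps in each direction
```

For a single G.711 call, the sketch reproduces the 128 Kbps minimum that is listed in the table.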

UM Voice Message Storage Codec

Unified Messaging dial plans are integral to the operation of Exchange 2007 Unified Messaging. By default, when you create a UM dial plan, the UM dial plan uses the WMA audio codec. However, after you create the UM dial plan, you can configure the UM dial plan to use GSM 06.10 or G.711 PCM Linear audio codecs.

Each audio codec has advantages and disadvantages. The WMA audio codec was selected as the default audio codec because of its sound quality and compression properties. GSM 06.10 and G.711 PCM Linear audio codecs were included as available options because of their ability to support other types of messaging systems.

When you plan for Exchange 2007 Unified Messaging, you must balance the size and the relative quality of the audio file that will be created for voice messages. Generally, the higher the bit rate for an audio file, the higher the quality. However, you must also consider whether the audio file is compressed. The sample size (in bits) and compression properties for each audio codec that is used in Exchange 2007 Unified Messaging are as follows:

UM voice message storage codecs

Voice message storage codec | Bits   | Compressed file?
--------------------------- | ------ | ----------------
WMA                         | 16-bit | Yes
G.711 PCM                   | 16-bit | No
GSM 06.10                   | 8-bit  | Yes

In Exchange 2007 Unified Messaging, the WMA, G.711 PCM Linear, and GSM 06.10 audio codecs are used to create .wma and .wav audio files for voice messages. The file type that is created depends on the audio codec that is used: the WMA audio codec creates .wma audio files, and the GSM 06.10 and G.711 PCM Linear audio codecs create .wav audio files. Whichever kind of audio file is created, it is sent together with the e-mail message to the recipient of the voice message.

Frequently, but not always, coding and decoding the digital data also involves compression or decompression. Audio compression is a form of data compression that reduces the size of audio data files. The audio compression algorithm that is used by the audio codec compresses the .wma or .wav audio files. In Exchange 2007 Unified Messaging, the type of audio compression algorithm that is used is based on the type of audio codec that is selected in the UM dial plan properties. After the audio file is created and compressed, it is attached to the voice message.

Sometimes information from the digital data is lost during compression and decompression. The higher the compression that is used to compress the audio file, the greater the loss of information during the conversion. However, less disk space is used because the size of the audio file is reduced. Conversely, the lower the compression, the lower the loss of information. However, more disk space is used because of the increased size of each audio file.

New in Exchange 2007 SP1

Exchange 2007 Service Pack 1 (SP1) added support for RTAudio wideband or high fidelity audio for recording voice messages. However, high fidelity audio is available only after you have successfully integrated Exchange 2007 Unified Messaging with Office Communications Server 2007. To enable RTAudio, the UM dial plan must be configured as a Session Initiation Protocol (SIP) URI-type dial plan and you must set the call answering codec on the dial plan to WMA.

Important

RTAudio is not available in environments where Office Communications Server 2007 is not deployed. This is because, in these environments, the dial plan is set to Telephone Extension and not SIP URI.

There are two media streams for each incoming call: inbound to a Unified Messaging server and outbound from a Unified Messaging server. When the dial plan type is set to SIP URI and the call-answering codec on the dial plan is set to WMA, a Unified Messaging server tries to select the RTAudio VoIP codec for the inbound media stream. If negotiation is successful, the RTAudio codec for the inbound stream will be used for call answering calls or calls that originate from Office Communicator 2007.

Note

Calls placed by using the Play on Phone feature will not use the RTAudio codec. The inbound stream for calls placed by using Play on Phone will use the G.711 or G.723.1 codec.

When the RTAudio codec is used, the voice message that is recorded will be recorded in high fidelity and will be stored as an audio file that has a .wma extension. When the voice message is played back to the user in Office Outlook 2007 or Outlook Web Access, they will hear the voice message in high fidelity audio. If negotiation is unsuccessful, either the G.711 or G.723.1 codec will be used. Both the G.711 and the G.723.1 codecs are narrowband codecs. When they are used as the VoIP codec, the voice message is recorded and stored as a narrowband audio file that has a .wma extension.

The outbound media stream will always be negotiated by using either the G.711 or G.723.1 codec. This means that callers will always hear narrowband audio over the telephone. This also applies to situations when a call is placed by using Office Communicator.
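The conditions described in this section can be summarized as a small decision function. The following Python code is a sketch of the rules as stated in the text, not of the Unified Messaging server's implementation. The class, field, and function names are hypothetical.

```python
from dataclasses import dataclass

NARROWBAND = "G.711 or G.723.1 (narrowband)"
WIDEBAND = "RTAudio (wideband)"


@dataclass
class Call:
    dial_plan_type: str        # "SIP URI" or "Telephone Extension"
    storage_codec: str         # "WMA", "GSM", or "PCM"
    play_on_phone: bool        # True if the call was placed by using Play on Phone
    rtaudio_negotiated: bool   # True if the RTAudio negotiation succeeds


def describe_streams(call: Call) -> dict:
    """Return the codec family expected on each media stream of a call."""
    inbound_rtaudio = (
        call.dial_plan_type == "SIP URI"
        and call.storage_codec == "WMA"
        and not call.play_on_phone
        and call.rtaudio_negotiated
    )
    return {
        "inbound": WIDEBAND if inbound_rtaudio else NARROWBAND,
        # The outbound stream is always negotiated with G.711 or G.723.1.
        "outbound": NARROWBAND,
    }
```

In every case the outbound stream stays narrowband, which matches the statement that callers always hear narrowband audio over the telephone.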

RTAudio processing by a Unified Messaging server consumes more CPU cycles than processing of either the G.711 or G.723.1 codec. If you have successfully integrated Office Communications Server 2007 but you want to turn off RTAudio to reduce the number of CPU cycles that are used, you can do one of the following:

  • Set the dial plan call answering codec or storage codec to GSM or PCM.

  • Disable the setting in the registry. The registry key is: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Speech Server\2.0\WireCodecList. To disable RTAudio, remove RTAudio16KHz and RTAudio8KHz from the list of codecs in the registry key.

    Important

    Any other additions, modifications, or deletions to the other values or keys in the WireCodecList key are not supported.

Note

Incorrectly editing the registry can cause serious problems that may require you to reinstall your operating system. Problems resulting from editing the registry incorrectly may not be able to be resolved. Before editing the registry, back up any valuable data.
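The registry change described above amounts to removing two entries from a list while leaving every other entry, and the order of the entries, unchanged. The following Python code is a sketch of that list operation only. It does not read or write the registry, the function name is hypothetical, and the other codec names in the example are placeholders because this topic does not state what else the list contains or how the value is stored.

```python
RTAUDIO_ENTRIES = {"RTAudio16KHz", "RTAudio8KHz"}


def without_rtaudio(codec_names):
    """Return the codec list with only the RTAudio entries removed.

    All other entries are kept in their original order, because other
    changes to the list are not supported.
    """
    return [name for name in codec_names if name not in RTAUDIO_ENTRIES]


# "CodecA" and "CodecB" are placeholders, not real WireCodecList values.
example = without_rtaudio(["RTAudio16KHz", "CodecA", "RTAudio8KHz", "CodecB"])
```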

UM Message Sizing

You can configure UM to use one of the following three audio codecs for creating voice messages: WMA, GSM 06.10, or G.711 PCM Linear. Audio that is encoded by using the WMA audio codec is always stored in the Windows Media format, and the attachment is a file that has a .wma file name extension. Audio that is encoded by using the GSM or G.711 PCM Linear audio codecs is always stored in RIFF/WAV format, and the attachment is a file that has a .wav file name extension.

The size of Unified Messaging voice messages depends on the size of the attachment that holds the voice data. In turn, the size of the attachment depends on the following factors:

  • The duration of the voice mail recording

  • The audio codec that is used

  • The audio file storage format

The following figure illustrates how the size of the audio file depends on the duration of the voice mail recording for the three audio codecs that you can use in UM.

Note

In this figure, the average length of a call-answered voice message is approximately 30 seconds.

[Figure: Audio file size (UM_Message_Sizing)]

WMA

WMA is the most highly compressed of the three audio codecs. It produces approximately 11,000 bytes for each 10 seconds of audio. However, the .wma file format has a much larger header section than the .wav file format. The .wma file header section is approximately 7 kilobytes (KB), whereas the header section for the .wav file is less than 100 bytes. As a result, very short WMA recordings are larger than the equivalent GSM recordings, but when WMA audio recordings are longer than approximately 15 seconds, they become smaller than GSM audio recordings. Therefore, for the smallest audio files that still have high audio quality, use the WMA audio codec.

G.711 PCM Linear

The G.711 PCM Linear audio codec creates .wav audio files that are not compressed. Therefore, G.711 PCM Linear .wav audio files occupy the most space for any given duration when they are compared to files that are created by using the GSM and WMA audio codecs. G.711 PCM Linear .wav audio files occupy just over 160,000 bytes for each 10 seconds of audio. G.711 PCM Linear .wav audio files have the highest audio quality of the three audio codecs that are used by Exchange 2007 Unified Messaging. However, the quality of comparable audio files that are created by using the WMA and GSM audio codecs is acceptable to most users who listen to voice messages.
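The figure of 160,000 bytes for each 10 seconds follows from the size of uncompressed samples. The following Python code is a minimal sketch of that arithmetic. It assumes a single channel sampled at 8,000 samples per second, which is the usual rate for narrowband telephone audio but is not stated in this topic. The 16-bit sample size comes from the storage codec table. The function name is hypothetical.

```python
def pcm_payload_bytes(duration_s: int,
                      sample_rate_hz: int = 8000,   # assumed narrowband telephony rate
                      bits_per_sample: int = 16,    # from the storage codec table
                      channels: int = 1) -> int:    # assumed mono
    """Size of uncompressed PCM audio data, excluding the .wav header."""
    return duration_s * sample_rate_hz * (bits_per_sample // 8) * channels


ten_seconds = pcm_payload_bytes(10)
```

Adding the small .wav header to this payload gives the "just over 160,000 bytes" that is quoted above.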

GSM

The GSM audio codec creates .wav audio files that are compressed. GSM .wav audio files are just over 16,000 bytes for each 10 seconds of audio. However, GSM creates an audio file that is larger than the audio file that is created by the WMA audio codec. Therefore, when you are balancing the quality of the voice message and the size, this may not be the best choice.
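The sizes quoted for the three audio codecs can be combined into a rough estimate of the attachment size for a message of any duration. The following Python code is a sketch under stated assumptions: the per-10-second figures are treated as audio data that is added to a fixed header, the .wma header is taken as 7 KB, and the .wav header is taken as 100 bytes, which is the upper bound given above. The names are hypothetical, and real file sizes will differ somewhat.

```python
# codec name -> (header bytes, audio bytes per second), from the figures in this topic
CODECS = {
    "WMA": (7 * 1024, 11_000 / 10),
    "GSM 06.10": (100, 16_000 / 10),
    "G.711 PCM Linear": (100, 160_000 / 10),
}


def estimated_size_bytes(codec: str, duration_s: float) -> float:
    """Approximate attachment size: fixed header plus a linear audio payload."""
    header, bytes_per_second = CODECS[codec]
    return header + bytes_per_second * duration_s


def wma_gsm_break_even_s() -> float:
    """Duration at which the WMA and GSM estimates are equal."""
    (wma_header, wma_rate) = CODECS["WMA"]
    (gsm_header, gsm_rate) = CODECS["GSM 06.10"]
    return (wma_header - gsm_header) / (gsm_rate - wma_rate)


# Estimates for a 30-second message, the average length noted for the figure.
sizes_30s = {name: estimated_size_bytes(name, 30) for name in CODECS}
```

Under these assumptions, a 30-second message is roughly 40,000 bytes in WMA, 48,000 bytes in GSM, and 480,000 bytes in G.711 PCM Linear, and the WMA and GSM estimates cross at about 14 seconds, which is consistent with the approximate 15-second figure given for WMA.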

For More Information

For more information about UM dial plans, see Understanding Unified Messaging Dial Plans.

For more information about how to configure the audio codec on a UM dial plan, see How to Modify a Unified Messaging Dial Plan.