Audio input and output formats

We've had some questions about what audio formats the XNA content pipeline can take as input, and what formats they are converted to. Now that the 3.0 beta has shipped, I figure it's a good time to get some of that information out there. So, to be clear, this information is all true for the 3.0 beta, and will likely be true for the final release of 3.0 as well.

In v3.0, there are three main ways to have your game make some noise. Two of these are new to v3.0. First up is XACT, and the AudioEngine, WaveBank, SoundBank, and Cue types.

XACT:

The version of XACT used by XNA Game Studio only accepts .WAV files containing PCM data as input. By default, the XACT build process will keep your data in PCM format as it is built. This means your data does not go through any kind of re-encoding.

If you want, you can enable compression in XACT, which will change your assets to XMA data when built for Xbox 360, and ADPCM when built for Windows. Mitch has more information here.

MediaPlayer:

The MediaPlayer & Song API, which is new for v3.0, is designed for music playback in your game. To use it, drag a WMA or MP3 file into Solution Explorer, then use Content.Load to load it as a Song. Once you have a Song, you can use MediaPlayer.Play to play your music.

WMA, MP3, and WAV files can all be built as Songs. This is the default setting for WMA and MP3 files, but not WAVs: if you want to load a WAV as a Song, you have to change its processor to SongProcessor. Regardless of the input format, the SongProcessor will convert your audio assets to WMA files for use at runtime, so your music will go through some re-encoding here. The bitrate of the WMA file depends on the Conversion Quality you select, as you can see below:

Conversion Quality    Bitrate
Best (default)        192 kbps
Medium                128 kbps
Low                   96 kbps
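
To get a feel for what those bitrates mean for the size of your game, here is a rough back-of-the-envelope sketch in plain Python (it is not XNA code, and the function name is made up for illustration). It assumes "k" means 1000 bits per second and that the bitrate is constant, and it ignores any container overhead, so treat the numbers as ballpark figures only.

```python
# Bitrates from the table above, in kilobits per second.
SONG_BITRATE_KBPS = {"Best": 192, "Medium": 128, "Low": 96}


def estimated_wma_bytes(duration_seconds, quality="Best"):
    """Rough size of the encoded audio, ignoring container overhead.

    Assumes a constant bitrate and 1 kbps = 1000 bits per second.
    """
    bits = SONG_BITRATE_KBPS[quality] * 1000 * duration_seconds
    return bits // 8


if __name__ == "__main__":
    # Hypothetical 3-minute music track.
    for quality in ("Best", "Medium", "Low"):
        size_mb = estimated_wma_bytes(180, quality) / 1_000_000
        print(f"{quality}: about {size_mb:.2f} MB")
```

Under those assumptions, dropping from Best to Low halves the size of each Song, which may matter if your game ships with a lot of music.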

SoundEffect:

The third way to play audio in v3.0 of the framework is via the SoundEffect and SoundEffectInstance types, which are (duh) designed for sound effect playback. These are the counterpart of the MediaPlayer & Song API. To use them, drag a WAV file into Solution Explorer, and then use Content.Load to load the file as a SoundEffect. Once the SoundEffect has been loaded, use SoundEffect.Play to play it.

WMA, MP3, and WAV files can all be built as SoundEffects. This is the default setting for WAV files, but not MP3 and WMA files. (Remember, their default setting is to be built as Songs.) If you want an MP3 or WMA file to be built as a SoundEffect, just change its processor from SongProcessor to SoundEffectProcessor.
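
If it helps to see the defaults side by side, the rules from this post can be written down as a small lookup. This is just an illustrative Python sketch, not part of the content pipeline: the dictionary and both function names are invented here, and only the processor names come from XNA Game Studio.

```python
import os

# Default processor for each input type, as described in this post.
DEFAULT_PROCESSOR = {
    ".wav": "SoundEffectProcessor",
    ".mp3": "SongProcessor",
    ".wma": "SongProcessor",
}


def default_processor(filename):
    """Return the processor an audio file gets if you change nothing."""
    extension = os.path.splitext(filename)[1].lower()
    return DEFAULT_PROCESSOR[extension]


def needs_processor_change(filename, wanted_processor):
    """True if you have to change the processor setting by hand."""
    return default_processor(filename) != wanted_processor


if __name__ == "__main__":
    # Hypothetical asset names.
    print(needs_processor_change("explosion.wav", "SoundEffectProcessor"))
    print(needs_processor_change("theme.wav", "SongProcessor"))
    print(needs_processor_change("jingle.mp3", "SoundEffectProcessor"))
```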

The SoundEffectProcessor’s output format depends on two things: the target platform and the quality you select. It breaks down like this:

Conversion Quality    Windows       Xbox 360    Zune
Best (default)        PCM           XMA 60      PCM
Medium                ADPCM         XMA 40      PCM 3/4*
Low                   ADPCM 1/2*    XMA 20      PCM 1/2*

* The audio asset is down-sampled to either 3/4 or 1/2 of the original sample rate, with a lower limit of 8 kHz.
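
Here is the table and its footnote worked through as a small Python sketch, so you can see what a given asset would end up as. The names are invented for illustration and this is not how the SoundEffectProcessor is actually implemented. One assumption to flag: I'm reading the 8 kHz lower limit as a floor on the down-sampled rate, and assuming a source that is already below 8 kHz is left at its original rate rather than being up-sampled.

```python
from fractions import Fraction

# (output format, sample-rate factor) per quality and platform,
# copied from the table above. A factor of 1 means no down-sampling.
SOUND_EFFECT_OUTPUT = {
    "Best": {
        "Windows": ("PCM", 1),
        "Xbox 360": ("XMA 60", 1),
        "Zune": ("PCM", 1),
    },
    "Medium": {
        "Windows": ("ADPCM", 1),
        "Xbox 360": ("XMA 40", 1),
        "Zune": ("PCM", Fraction(3, 4)),
    },
    "Low": {
        "Windows": ("ADPCM", Fraction(1, 2)),
        "Xbox 360": ("XMA 20", 1),
        "Zune": ("PCM", Fraction(1, 2)),
    },
}

MIN_RATE_HZ = 8000  # lower limit from the footnote


def output_format(quality, platform, source_rate_hz):
    """Return (format, sample rate in Hz) for a sound effect asset."""
    fmt, factor = SOUND_EFFECT_OUTPUT[quality][platform]
    if factor == 1:
        return fmt, source_rate_hz
    target = int(source_rate_hz * factor)
    # Assumption: never go below 8 kHz, and never raise a source
    # rate that was already under that limit.
    floor = min(source_rate_hz, MIN_RATE_HZ)
    return fmt, max(target, floor)


if __name__ == "__main__":
    print(output_format("Medium", "Zune", 44100))
    print(output_format("Low", "Windows", 44100))
    print(output_format("Low", "Zune", 11025))
```

Note how an 11,025 Hz source at Low quality on Zune lands on 8,000 Hz rather than half its original rate, because of the lower limit.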

XMA is an internally developed sound format which has better compression ratios and sound quality than ADPCM. The Xbox 360 can decode XMA data in hardware, making XMA a great choice for a sound effect data format. I’m not privy to the details of the XMA encoding algorithm, so I can’t tell you what the quality ratings of 60, 40, and 20 mean, exactly. They’re not bitrates; they’re more of a hint to the XMA encoder as to how aggressively it should compress. In practice, almost no one needs a quality setting higher than 60, so we’ve set that as our upper bound. You should definitely experiment with the quality settings on all platforms, to see how much compression you can get away with before the artifacts get too objectionable.