[C#] [UWP] AudioGraph / AudioFileInputNode does not start at specified position in milliseconds (mp3)

Manuel Kurtz 21 Reputation points
2021-02-12T09:18:13.143+00:00

Hello,

I'm writing a small application that needs to start audio file playback at a specified position in milliseconds. To accomplish this, I use this code:

/// <summary>
/// Initializes the AudioGraph
/// </summary>
/// <param name="renderDevice">The audio render device (speaker)</param>
/// <returns>true on success</returns>
private async Task<bool> InitAudioGraph(DeviceInformation renderDevice = null)
{
    if (audioGraph != null) return true;

    var audioGraphSettings = new AudioGraphSettings(AudioRenderCategory.Media)
    {
        PrimaryRenderDevice = renderDevice
    };

    var result = await AudioGraph.CreateAsync(audioGraphSettings);

    if (result.Status == AudioGraphCreationStatus.Success)
    {
        this.audioGraph = result.Graph;

        return true;
    }

    return false;
}



/// <summary>
/// Plays the given audio file.
/// </summary>
/// <param name="file">The audio file to play</param>
/// <param name="startMillis">If provided, playback will be started at the appropriate position in the file</param>
/// <param name="endMillis">If provided, playback will be stopped at the appropriate position in the file</param>
/// <returns>true if playback was started</returns>
public async Task<bool> PlayFile(StorageFile file, float startMillis = 0, float endMillis = 0)
{
    if (await InitAudioGraph())
    {
        this.playFile = file;
        this.playFileEndMillis = endMillis;

        // prepare and playback recorded audio data
        var deviceResult = await audioGraph.CreateDeviceOutputNodeAsync();
        if (deviceResult.Status != AudioDeviceNodeCreationStatus.Success) return false;

        var outputNode = deviceResult.DeviceOutputNode;
        var playback = await audioGraph.CreateFileInputNodeAsync(file);

        if (playback.Status != AudioFileNodeCreationStatus.Success) return false;

        var fileInputNode = playback.FileInputNode;
        fileInputNode.AddOutgoingConnection(outputNode);

        // set the start time
        if (startMillis > 0)
        {
            var startTime = TimeSpan.FromMilliseconds(startMillis);
            if (startTime < fileInputNode.Duration)
            {
                fileInputNode.Seek(startTime);
            }
        }

        // set the end time
        if (endMillis > 0)
        {
            var endTime = TimeSpan.FromMilliseconds(endMillis);
            if (endTime <= fileInputNode.Duration)
            {
                fileInputNode.EndTime = endTime;
            }
        }

        // start playback
        audioGraph.Start();

        return true;
    }

    return false;
}

This works fine with uncompressed audio files such as *.wav. With *.mp3, however, playback begins at varying positions somewhere around the specified startMillis. In my test case, the offset could be about 1 s from the specified startMillis. I verified this with an audio editing tool.

How can I ensure that an mp3 file starts at exactly the specified position in the audio file? I already tried replacing fileInputNode.Seek with fileInputNode.StartTime = startTime, which gives the same result as fileInputNode.Seek.

Thanks in advance,
Manuel Kurtz

Developer technologies | Universal Windows Platform (UWP)

Answer accepted by question author

Ivanich 306 Reputation points
2021-02-14T09:40:32.827+00:00

AudioGraph uses Media Foundation to decode files. When you call Seek, the call ends up in IMFSourceReader::SetCurrentPosition, which does not guarantee exact seeking:

The SetCurrentPosition method does not guarantee exact seeking. The accuracy of the seek depends on the media content. If the media content contains a video stream, the SetCurrentPosition method typically seeks to the nearest key frame before the desired position. The distance between key frames depends on several factors, including the encoder implementation, the video content, and the particular encoding settings used to encode the content. The distance between key frames can vary within a single video file (for example, depending on scene complexity).

Like video files, compressed mp3 files consist of frames. The frame duration depends on the encoding; an MPEG-1 Layer III frame holds 1152 samples, which is about 26 milliseconds at 44.1 kHz.

So, if you want to have exact seeking, you have to handle it yourself.
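
One possible way to handle it yourself, as a rough sketch: decode the mp3 to uncompressed PCM first (the question notes that seeking works for *.wav) and compute the start position in samples rather than relying on the decoder's seek. The PcmSeek helper below is hypothetical and not part of the AudioGraph API. It only shows the arithmetic, assuming interleaved integer PCM with a constant sample rate.

```csharp
using System;

// Hypothetical helper for illustration; not part of the AudioGraph API.
public static class PcmSeek
{
    // Index of the sample frame nearest to the requested position.
    public static long FrameIndex(double startMillis, int sampleRate)
    {
        if (startMillis < 0) throw new ArgumentOutOfRangeException(nameof(startMillis));
        if (sampleRate <= 0) throw new ArgumentOutOfRangeException(nameof(sampleRate));
        return (long)Math.Round(startMillis * sampleRate / 1000.0);
    }

    // Byte offset of that sample frame inside interleaved integer PCM data,
    // measured from the start of the sample data (after any file header).
    public static long ByteOffset(double startMillis, int sampleRate, int channels, int bitsPerSample)
    {
        if (channels <= 0) throw new ArgumentOutOfRangeException(nameof(channels));
        if (bitsPerSample <= 0 || bitsPerSample % 8 != 0)
            throw new ArgumentOutOfRangeException(nameof(bitsPerSample));

        int blockAlign = channels * (bitsPerSample / 8); // bytes per sample frame
        return FrameIndex(startMillis, sampleRate) * blockAlign;
    }
}
```

For example, PcmSeek.ByteOffset(1500, 44100, 2, 16) returns 264600: sample frame 66150 at 4 bytes per frame. Multiplying by the block size keeps the offset on a whole sample frame, so the channels stay aligned.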


  1. Manuel Kurtz 21 Reputation points
    2021-02-15T10:07:40.127+00:00

    Thanks for the answer.

    It would be nice if this information was added to the documentation of the functions / properties (https://learn.microsoft.com/en-us/uwp/api/Windows.Media.Audio.AudioFileInputNode?view=winrt-19041).

