Transforms - Create Or Update
Create or Update Transform
Creates or updates a Transform.
PUT https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Media/mediaServices/{accountName}/transforms/{transformName}?api-version=2022-07-01
URI Parameters

Name | In | Required | Type | Description |
---|---|---|---|---|
accountName | path | True | string | The Media Services account name. |
resourceGroupName | path | True | string | The name of the resource group within the Azure subscription. |
subscriptionId | path | True | string | The unique identifier for a Microsoft Azure subscription. |
transformName | path | True | string | The Transform name. |
api-version | query | True | string | The version of the API to be used with the client request. |
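The path and query parameters above can be assembled into the request URL like this (a minimal sketch; the helper name and all parameter values are illustrative, not part of the API):

```python
# Build the ARM request URL for Transforms - Create Or Update.
# build_transform_url is a hypothetical helper; the placeholder GUID,
# resource group, account, and transform names are illustrative only.
def build_transform_url(subscription_id: str, resource_group: str,
                        account_name: str, transform_name: str,
                        api_version: str = "2022-07-01") -> str:
    return (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.Media"
        f"/mediaServices/{account_name}"
        f"/transforms/{transform_name}"
        f"?api-version={api_version}"
    )

url = build_transform_url("00000000-0000-0000-0000-000000000000",
                          "contosoresources", "contosomedia",
                          "createdTransform")
```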
Request Body

Name | Required | Type | Description |
---|---|---|---|
properties.outputs | True | TransformOutput[] | An array of one or more TransformOutputs that the Transform should generate. |
properties.description | | string | An optional verbose description of the Transform. |
Responses

Name | Type | Description |
---|---|---|
200 OK | Transform | OK |
201 Created | Transform | Created |
Other Status Codes | ErrorResponse | Detailed error information. |
Examples
Create or update a Transform
Sample request
PUT https://management.azure.com/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/contosoresources/providers/Microsoft.Media/mediaServices/contosomedia/transforms/createdTransform?api-version=2022-07-01
```json
{
  "properties": {
    "description": "Example Transform to illustrate create and update.",
    "outputs": [
      {
        "preset": {
          "@odata.type": "#Microsoft.Media.BuiltInStandardEncoderPreset",
          "presetName": "AdaptiveStreaming"
        }
      }
    ]
  }
}
```
Sample response

Status code: 200

```json
{
  "name": "createdTransform",
  "id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/contosoresources/providers/Microsoft.Media/mediaservices/contosomedia/transforms/createdTransform",
  "type": "Microsoft.Media/mediaservices/transforms",
  "properties": {
    "created": "2022-10-17T23:14:31.7664818Z",
    "description": "Example Transform to illustrate create and update.",
    "lastModified": "2022-10-17T23:14:31.7664818Z",
    "outputs": [
      {
        "onError": "StopProcessingJob",
        "relativePriority": "Normal",
        "preset": {
          "@odata.type": "#Microsoft.Media.BuiltInStandardEncoderPreset",
          "presetName": "AdaptiveStreaming"
        }
      }
    ]
  },
  "systemData": {
    "createdBy": "contoso@microsoft.com",
    "createdByType": "User",
    "createdAt": "2022-10-17T23:14:31.7664818Z",
    "lastModifiedBy": "contoso@microsoft.com",
    "lastModifiedByType": "User",
    "lastModifiedAt": "2022-10-17T23:14:31.7664818Z"
  }
}
```
Status code: 201

```json
{
  "name": "createdTransform",
  "id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/contosoresources/providers/Microsoft.Media/mediaservices/contosomedia/transforms/createdTransform",
  "type": "Microsoft.Media/mediaservices/transforms",
  "properties": {
    "created": "2022-10-17T23:14:31.7664818Z",
    "description": "Example Transform to illustrate create and update.",
    "lastModified": "2022-10-17T23:14:31.7664818Z",
    "outputs": [
      {
        "onError": "StopProcessingJob",
        "relativePriority": "Normal",
        "preset": {
          "@odata.type": "#Microsoft.Media.BuiltInStandardEncoderPreset",
          "presetName": "AdaptiveStreaming"
        }
      }
    ]
  },
  "systemData": {
    "createdBy": "contoso@microsoft.com",
    "createdByType": "User",
    "createdAt": "2022-10-17T23:14:31.7664818Z",
    "lastModifiedBy": "contoso@microsoft.com",
    "lastModifiedByType": "User",
    "lastModifiedAt": "2022-10-17T23:14:31.7664818Z"
  }
}
```
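The sample request above can be issued with nothing but the Python standard library. This is a minimal sketch: the bearer token is a placeholder (obtain a real Azure AD token, for example via the azure-identity package, before calling), and the final send is commented out so the snippet stays side-effect free.

```python
import json
import urllib.request

# Placeholder: substitute a real Azure AD bearer token for ARM.
ACCESS_TOKEN = "<bearer-token>"

url = ("https://management.azure.com/subscriptions/"
       "00000000-0000-0000-0000-000000000000/resourceGroups/contosoresources"
       "/providers/Microsoft.Media/mediaServices/contosomedia"
       "/transforms/createdTransform?api-version=2022-07-01")

# Request body matching the sample request above.
body = {
    "properties": {
        "description": "Example Transform to illustrate create and update.",
        "outputs": [
            {
                "preset": {
                    "@odata.type": "#Microsoft.Media.BuiltInStandardEncoderPreset",
                    "presetName": "AdaptiveStreaming",
                }
            }
        ],
    }
}

req = urllib.request.Request(
    url,
    data=json.dumps(body).encode("utf-8"),
    method="PUT",
    headers={
        "Authorization": f"Bearer {ACCESS_TOKEN}",
        "Content-Type": "application/json",
    },
)
# response = urllib.request.urlopen(req)  # service returns 200 OK or 201 Created
```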
Definitions

Name | Description |
---|---|
AacAudio | Describes Advanced Audio Codec (AAC) audio encoding settings. |
AacAudioProfile | The encoding profile to be used when encoding audio with AAC. |
AnalysisResolution | Specifies the maximum resolution at which your video is analyzed. The default behavior is "SourceResolution," which will keep the input video at its original resolution when analyzed. Using "StandardDefinition" will resize input videos to standard definition while preserving the appropriate aspect ratio. It will only resize if the video is of higher resolution. For example, a 1920x1080 input would be scaled to 640x360 before processing. Switching to "StandardDefinition" will reduce the time it takes to process high resolution video. It may also reduce the cost of using this component (see https://azure.microsoft.com/en-us/pricing/details/media-services/#analytics for details). However, faces that end up being too small in the resized video may not be detected. |
Audio | Defines the common properties for all audio codecs. |
AudioAnalysisMode | Determines the set of audio analysis operations to be performed. If unspecified, the Standard AudioAnalysisMode would be chosen. |
AudioAnalyzerPreset | The Audio Analyzer preset applies a pre-defined set of AI-based analysis operations, including speech transcription. Currently, the preset supports processing of content with a single audio track. |
AudioOverlay | Describes the properties of an audio overlay. |
BlurType | Blur type. |
BuiltInStandardEncoderPreset | Describes a built-in preset for encoding the input video with the Standard Encoder. |
Complexity | Allows you to configure the encoder settings to control the balance between speed and quality. Example: set Complexity as Speed for faster encoding but less compression efficiency. |
CopyAudio | A codec flag, which tells the encoder to copy the input audio bitstream. |
CopyVideo | A codec flag, which tells the encoder to copy the input video bitstream without re-encoding. |
createdByType | The type of identity that created the resource. |
DDAudio | Describes Dolby Digital Audio Codec (AC3) audio encoding settings. The current implementation supports: audio channel counts of 1 (mono), 2 (stereo), and 6 (5.1 side); audio sampling rates of 32 kHz, 44.1 kHz, and 48 kHz; and the audio bitrate values the AC3 specification supports: 32000, 40000, 48000, 56000, 64000, 80000, 96000, 112000, 128000, 160000, 192000, 224000, 256000, 320000, 384000, 448000, 512000, 576000, and 640000 bps. |
Deinterlace | Describes the de-interlacing settings. |
DeinterlaceMode | The deinterlacing mode. Defaults to AutoPixelAdaptive. |
DeinterlaceParity | The field parity for de-interlacing, defaults to Auto. |
EncoderNamedPreset | The built-in preset to be used for encoding videos. |
EntropyMode | The entropy mode to be used for this layer. If not specified, the encoder chooses the mode that is appropriate for the profile and level. |
ErrorAdditionalInfo | The resource management error additional info. |
ErrorDetail | The error detail. |
ErrorResponse | Error response. |
FaceDetectorPreset | Describes all the settings to be used when analyzing a video in order to detect (and optionally redact) all the faces present. |
FaceRedactorMode | This mode provides the ability to choose between the following settings: 1) Analyze - for detection only. This mode generates a metadata JSON file marking appearances of faces throughout the video. Where possible, appearances of the same person are assigned the same ID. 2) Combined - additionally redacts (blurs) detected faces. 3) Redact - enables a 2-pass process, allowing for selective redaction of a subset of detected faces. It takes in the metadata file from a prior Analyze pass, along with the source video, and a user-selected subset of IDs that require redaction. |
Fade | Describes the properties of a Fade effect applied to the input media. |
Filters | Describes all the filtering operations, such as de-interlacing, rotation etc., that are to be applied to the input media before encoding. |
H264Complexity | Tells the encoder how to choose its encoding settings. The default value is Balanced. |
H264Layer | Describes the settings to be used when encoding the input video into a desired output bitrate layer with the H.264 video codec. |
H264RateControlMode | The video rate control mode. |
H264Video | Describes all the properties for encoding a video with the H.264 codec. |
H264VideoProfile | We currently support Baseline, Main, High, High422, High444. Default is Auto. |
H265Complexity | Tells the encoder how to choose its encoding settings. Quality will provide for a higher compression ratio but at a higher cost and longer compute time. Speed will produce a relatively larger file but is faster and more economical. The default value is Balanced. |
H265Layer | Describes the settings to be used when encoding the input video into a desired output bitrate layer with the H.265 video codec. |
H265Video | Describes all the properties for encoding a video with the H.265 codec. |
H265VideoProfile | We currently support Main. Default is Auto. |
Image | Describes the basic properties for generating thumbnails from the input video. |
ImageFormat | Describes the properties for an output image file. |
InsightsType | Defines the type of insights that you want the service to generate. The allowed values are 'AudioInsightsOnly', 'VideoInsightsOnly', and 'AllInsights'. The default is AllInsights. If you set this to AllInsights and the input is audio only, then only audio insights are generated. Similarly, if the input is video only, then only video insights are generated. It is recommended that you not use AudioInsightsOnly if you expect some of your inputs to be video only, or use VideoInsightsOnly if you expect some of your inputs to be audio only. Your Jobs in such conditions would error out. |
InterleaveOutput | Sets the interleave mode of the output to control how audio and video are stored in the container format. Example: set InterleavedOutput as NonInterleavedOutput to produce audio-only and video-only outputs in separate MP4 files. |
JpgFormat | Describes the settings for producing JPEG thumbnails. |
JpgImage | Describes the properties for producing a series of JPEG images from the input video. |
JpgLayer | Describes the settings to produce a JPEG image from the input video. |
Mp4Format | Describes the properties for an output ISO MP4 file. |
MultiBitrateFormat | Describes the properties for producing a collection of GOP aligned multi-bitrate files. The default behavior is to produce one output file for each video layer which is muxed together with all the audios. The exact output files produced can be controlled by specifying the outputFiles collection. |
OnErrorType | A Transform can define more than one output. This property defines what the service should do when one output fails - either continue to produce other outputs, or stop the other outputs. The overall Job state will not reflect failures of outputs that are specified with 'ContinueJob'. The default is 'StopProcessingJob'. |
OutputFile | Represents an output file produced. |
PngFormat | Describes the settings for producing PNG thumbnails. |
PngImage | Describes the properties for producing a series of PNG images from the input video. |
PngLayer | Describes the settings to produce a PNG image from the input video. |
PresetConfigurations | An object of optional configuration settings for the encoder. |
Priority | Sets the relative priority of the TransformOutputs within a Transform. This sets the priority that the service uses for processing TransformOutputs. The default priority is Normal. |
Rectangle | Describes the properties of a rectangular window applied to the input media before processing it. |
Rotation | The rotation, if any, to be applied to the input video before it is encoded. Default is Auto. |
StandardEncoderPreset | Describes all the settings to be used when encoding the input video with the Standard Encoder. |
StretchMode | The resizing mode - how the input video will be resized to fit the desired output resolution(s). Default is AutoSize. |
systemData | Metadata pertaining to creation and last modification of the resource. |
Transform | A Transform encapsulates the rules or instructions for generating desired outputs from input media, such as by transcoding or by extracting insights. After the Transform is created, it can be applied to input media by creating Jobs. |
TransformOutput | Describes the properties of a TransformOutput, which are the rules to be applied while generating the desired output. |
TransportStreamFormat | Describes the properties for generating an MPEG-2 Transport Stream (ISO/IEC 13818-1) output video file(s). |
Video | Describes the basic properties for encoding the input video. |
VideoAnalyzerPreset | A video analyzer preset that extracts insights (rich metadata) from both audio and video, and outputs a JSON format file. |
VideoOverlay | Describes the properties of a video overlay. |
VideoSyncMode | The Video Sync Mode. |
AacAudio

Describes Advanced Audio Codec (AAC) audio encoding settings.

Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft.Media.AacAudio | The discriminator for derived types. |
bitrate | integer | The bitrate, in bits per second, of the output encoded audio. |
channels | integer | The number of channels in the audio. |
label | string | An optional label for the codec. The label can be used to control muxing behavior. |
profile | AacAudioProfile | The encoding profile to be used when encoding audio with AAC. |
samplingRate | integer | The sampling rate to use for encoding in hertz. |

AacAudioProfile

The encoding profile to be used when encoding audio with AAC.

Name | Type | Description |
---|---|---|
AacLc | string | Specifies that the output audio is to be encoded into AAC Low Complexity profile (AAC-LC). |
HeAacV1 | string | Specifies that the output audio is to be encoded into HE-AAC v1 profile. |
HeAacV2 | string | Specifies that the output audio is to be encoded into HE-AAC v2 profile. |

AnalysisResolution

Specifies the maximum resolution at which your video is analyzed. The default behavior is "SourceResolution," which will keep the input video at its original resolution when analyzed. Using "StandardDefinition" will resize input videos to standard definition while preserving the appropriate aspect ratio. It will only resize if the video is of higher resolution. For example, a 1920x1080 input would be scaled to 640x360 before processing. Switching to "StandardDefinition" will reduce the time it takes to process high resolution video. It may also reduce the cost of using this component (see https://azure.microsoft.com/en-us/pricing/details/media-services/#analytics for details). However, faces that end up being too small in the resized video may not be detected.

Name | Type | Description |
---|---|---|
SourceResolution | string | |
StandardDefinition | string | |

Audio

Defines the common properties for all audio codecs.

Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft.Media.Audio | The discriminator for derived types. |
bitrate | integer | The bitrate, in bits per second, of the output encoded audio. |
channels | integer | The number of channels in the audio. |
label | string | An optional label for the codec. The label can be used to control muxing behavior. |
samplingRate | integer | The sampling rate to use for encoding in hertz. |

AudioAnalysisMode

Determines the set of audio analysis operations to be performed. If unspecified, the Standard AudioAnalysisMode would be chosen.

Name | Type | Description |
---|---|---|
Basic | string | This mode performs speech-to-text transcription and generation of a VTT subtitle/caption file. The output of this mode includes an Insights JSON file including only the keywords, transcription, and timing information. Automatic language detection and speaker diarization are not included in this mode. |
Standard | string | Performs all operations included in the Basic mode, additionally performing language detection and speaker diarization. |
AudioAnalyzerPreset

The Audio Analyzer preset applies a pre-defined set of AI-based analysis operations, including speech transcription. Currently, the preset supports processing of content with a single audio track.

Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft.Media.AudioAnalyzerPreset | The discriminator for derived types. |
audioLanguage | string | The language for the audio payload in the input using the BCP-47 format of 'language tag-region' (e.g.: 'en-US'). If you know the language of your content, it is recommended that you specify it. The language must be specified explicitly for AudioAnalysisMode::Basic, since automatic language detection is not included in basic mode. If the language isn't specified or set to null, automatic language detection will choose the first language detected and process with the selected language for the duration of the file. It does not currently support dynamically switching between languages after the first language is detected. The automatic detection works best with audio recordings with clearly discernible speech. If automatic detection fails to find the language, transcription would fall back to 'en-US'. The list of supported languages is available here: https://go.microsoft.com/fwlink/?linkid=2109463 |
experimentalOptions | object | Dictionary containing key value pairs for parameters not exposed in the preset itself. |
mode | AudioAnalysisMode | Determines the set of audio analysis operations to be performed. If unspecified, the Standard AudioAnalysisMode would be chosen. |
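Putting the two definitions above together, a TransformOutput that runs Basic-mode audio analysis might look like the sketch below. The variable name is illustrative; note that `audioLanguage` is set explicitly, since Basic mode does not include automatic language detection.

```python
# Sketch of a TransformOutput using AudioAnalyzerPreset in Basic mode.
# Basic mode skips automatic language detection, so audioLanguage
# must be supplied explicitly; 'en-US' here is an illustrative choice.
basic_audio_output = {
    "preset": {
        "@odata.type": "#Microsoft.Media.AudioAnalyzerPreset",
        "mode": "Basic",
        "audioLanguage": "en-US",
    }
}
```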
AudioOverlay

Describes the properties of an audio overlay.

Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft.Media.AudioOverlay | The discriminator for derived types. |
audioGainLevel | number | The gain level of audio in the overlay. The value should be in the range [0, 1.0]. The default is 1.0. |
end | string | The end position, with reference to the input video, at which the overlay ends. The value should be in ISO 8601 format. For example, PT30S to end the overlay at 30 seconds into the input video. If not specified or the value is greater than the input video duration, the overlay will be applied until the end of the input video if the overlay media duration is greater than the input video duration; otherwise the overlay will last as long as the overlay media duration. |
fadeInDuration | string | The duration over which the overlay fades in onto the input video. The value should be in ISO 8601 duration format. If not specified, the default behavior is to have no fade in (same as PT0S). |
fadeOutDuration | string | The duration over which the overlay fades out of the input video. The value should be in ISO 8601 duration format. If not specified, the default behavior is to have no fade out (same as PT0S). |
inputLabel | string | The label of the job input which is to be used as an overlay. The input must specify exactly one file. You can specify an image file in JPG, PNG, GIF or BMP format, or an audio file (such as a WAV, MP3, WMA or M4A file), or a video file. See https://aka.ms/mesformats for the complete list of supported audio and video file formats. |
start | string | The start position, with reference to the input video, at which the overlay starts. The value should be in ISO 8601 format. For example, PT05S to start the overlay at 5 seconds into the input video. If not specified, the overlay starts from the beginning of the input video. |
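As a sketch, an AudioOverlay entry built from the fields above could look like this. All values are illustrative, and "bgm" is a hypothetical label assumed to match a job input that names exactly one audio file:

```python
# Sketch of an AudioOverlay (all values illustrative, not defaults).
audio_overlay = {
    "@odata.type": "#Microsoft.Media.AudioOverlay",
    "inputLabel": "bgm",        # hypothetical job-input label
    "start": "PT5S",            # overlay begins 5 seconds in
    "end": "PT30S",             # and ends at 30 seconds
    "fadeInDuration": "PT2S",   # 2-second fade in
    "fadeOutDuration": "PT2S",  # 2-second fade out
    "audioGainLevel": 0.5,      # half gain; must be within [0, 1.0]
}
```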
BlurType

Blur type.

Name | Type | Description |
---|---|---|
Black | string | Black: black-out filter. |
Box | string | Box: debug filter, bounding box only. |
High | string | High: confuse blur filter. |
Low | string | Low: box-car blur filter. |
Med | string | Med: Gaussian blur filter. |

BuiltInStandardEncoderPreset

Describes a built-in preset for encoding the input video with the Standard Encoder.

Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft.Media.BuiltInStandardEncoderPreset | The discriminator for derived types. |
configurations | PresetConfigurations | Optional configuration settings for the encoder. Configurations is only supported for the ContentAwareEncoding and H265ContentAwareEncoding built-in presets. |
presetName | EncoderNamedPreset | The built-in preset to be used for encoding videos. |
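A sketch of a BuiltInStandardEncoderPreset that supplies optional configurations follows. The `complexity` and `maxHeight` keys are assumed PresetConfigurations settings (that object's fields are not detailed in this section), so treat them as illustrative rather than authoritative:

```python
# Sketch: ContentAwareEncoding with optional encoder configurations.
# Only ContentAwareEncoding and H265ContentAwareEncoding accept
# "configurations"; the field names inside it are assumptions here.
content_aware_preset = {
    "@odata.type": "#Microsoft.Media.BuiltInStandardEncoderPreset",
    "presetName": "ContentAwareEncoding",
    "configurations": {
        "complexity": "Speed",  # favor encoding speed over compression
        "maxHeight": 720,       # cap the generated bitrate ladder at 720p
    },
}
```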
Complexity

Allows you to configure the encoder settings to control the balance between speed and quality. Example: set Complexity as Speed for faster encoding but less compression efficiency.

Name | Type | Description |
---|---|---|
Balanced | string | Configures the encoder to use settings that achieve a balance between speed and quality. |
Quality | string | Configures the encoder to use settings optimized to produce higher quality output at the expense of slower overall encode time. |
Speed | string | Configures the encoder to use settings optimized for faster encoding. Quality is sacrificed to decrease encoding time. |

CopyAudio

A codec flag, which tells the encoder to copy the input audio bitstream.

Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft.Media.CopyAudio | The discriminator for derived types. |
label | string | An optional label for the codec. The label can be used to control muxing behavior. |

CopyVideo

A codec flag, which tells the encoder to copy the input video bitstream without re-encoding.

Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft.Media.CopyVideo | The discriminator for derived types. |
label | string | An optional label for the codec. The label can be used to control muxing behavior. |

createdByType

The type of identity that created the resource.

Name | Type | Description |
---|---|---|
Application | string | |
Key | string | |
ManagedIdentity | string | |
User | string | |

DDAudio

Describes Dolby Digital Audio Codec (AC3) audio encoding settings. The current implementation supports: audio channel counts of 1 (mono), 2 (stereo), and 6 (5.1 side); audio sampling rates of 32 kHz, 44.1 kHz, and 48 kHz; and the audio bitrate values the AC3 specification supports: 32000, 40000, 48000, 56000, 64000, 80000, 96000, 112000, 128000, 160000, 192000, 224000, 256000, 320000, 384000, 448000, 512000, 576000, and 640000 bps.

Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft.Media.DDAudio | The discriminator for derived types. |
bitrate | integer | The bitrate, in bits per second, of the output encoded audio. |
channels | integer | The number of channels in the audio. |
label | string | An optional label for the codec. The label can be used to control muxing behavior. |
samplingRate | integer | The sampling rate to use for encoding in hertz. |
Deinterlace

Describes the de-interlacing settings.

Name | Type | Description |
---|---|---|
mode | DeinterlaceMode | The deinterlacing mode. Defaults to AutoPixelAdaptive. |
parity | DeinterlaceParity | The field parity for de-interlacing, defaults to Auto. |

DeinterlaceMode

The deinterlacing mode. Defaults to AutoPixelAdaptive.

Name | Type | Description |
---|---|---|
AutoPixelAdaptive | string | Apply automatic pixel adaptive de-interlacing on each frame in the input video. |
Off | string | Disables de-interlacing of the source video. |

DeinterlaceParity

The field parity for de-interlacing, defaults to Auto.

Name | Type | Description |
---|---|---|
Auto | string | Automatically detect the order of fields. |
BottomFieldFirst | string | Apply bottom-field-first processing of the input video. |
TopFieldFirst | string | Apply top-field-first processing of the input video. |

EncoderNamedPreset

The built-in preset to be used for encoding videos.

Name | Type | Description |
---|---|---|
AACGoodQualityAudio | string | Produces a single MP4 file containing only AAC stereo audio encoded at 192 kbps. |
AdaptiveStreaming | string | Produces a set of GOP-aligned MP4 files with H.264 video and stereo AAC audio. Auto-generates a bitrate ladder based on the input resolution, bitrate and frame rate. The auto-generated preset will never exceed the input resolution. For example, if the input is 720p, output will remain 720p at best. |
ContentAwareEncoding | string | Produces a set of GOP-aligned MP4s by using content-aware encoding. Given any input content, the service performs an initial lightweight analysis of the input content, and uses the results to determine the optimal number of layers, appropriate bitrate and resolution settings for delivery by adaptive streaming. This preset is particularly effective for low and medium complexity videos, where the output files will be at lower bitrates but at a quality that still delivers a good experience to viewers. The output will contain MP4 files with video and audio interleaved. |
ContentAwareEncodingExperimental | string | Exposes an experimental preset for content-aware encoding. Given any input content, the service attempts to automatically determine the optimal number of layers, appropriate bitrate and resolution settings for delivery by adaptive streaming. The underlying algorithms will continue to evolve over time. The output will contain MP4 files with video and audio interleaved. |
CopyAllBitrateNonInterleaved | string | Copy all video and audio streams from the input asset as non-interleaved video and audio output files. This preset can be used to clip an existing asset or convert a group of key frame (GOP) aligned MP4 files as an asset that can be streamed. |
DDGoodQualityAudio | string | Produces a single MP4 file containing only DD (Dolby Digital) stereo audio encoded at 192 kbps. |
H264MultipleBitrate1080p | string | Produces a set of 8 GOP-aligned MP4 files, ranging from 6000 kbps to 400 kbps, and stereo AAC audio. Resolution starts at 1080p and goes down to 180p. |
H264MultipleBitrate720p | string | Produces a set of 6 GOP-aligned MP4 files, ranging from 3400 kbps to 400 kbps, and stereo AAC audio. Resolution starts at 720p and goes down to 180p. |
H264MultipleBitrateSD | string | Produces a set of 5 GOP-aligned MP4 files, ranging from 1900 kbps to 400 kbps, and stereo AAC audio. Resolution starts at 480p and goes down to 240p. |
H264SingleBitrate1080p | string | Produces an MP4 file where the video is encoded with H.264 codec at 6750 kbps and a picture height of 1080 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps. |
H264SingleBitrate720p | string | Produces an MP4 file where the video is encoded with H.264 codec at 4500 kbps and a picture height of 720 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps. |
H264SingleBitrateSD | string | Produces an MP4 file where the video is encoded with H.264 codec at 2200 kbps and a picture height of 480 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps. |
H265AdaptiveStreaming | string | Produces a set of GOP-aligned MP4 files with H.265 video and stereo AAC audio. Auto-generates a bitrate ladder based on the input resolution, bitrate and frame rate. The auto-generated preset will never exceed the input resolution. For example, if the input is 720p, output will remain 720p at best. |
H265ContentAwareEncoding | string | Produces a set of GOP-aligned MP4s by using content-aware encoding. Given any input content, the service performs an initial lightweight analysis of the input content, and uses the results to determine the optimal number of layers, appropriate bitrate and resolution settings for delivery by adaptive streaming. This preset is particularly effective for low and medium complexity videos, where the output files will be at lower bitrates but at a quality that still delivers a good experience to viewers. The output will contain MP4 files with video and audio interleaved. |
H265SingleBitrate1080p | string | Produces an MP4 file where the video is encoded with H.265 codec at 3500 kbps and a picture height of 1080 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps. |
H265SingleBitrate4K | string | Produces an MP4 file where the video is encoded with H.265 codec at 9500 kbps and a picture height of 2160 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps. |
H265SingleBitrate720p | string | Produces an MP4 file where the video is encoded with H.265 codec at 1800 kbps and a picture height of 720 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps. |
EntropyMode

The entropy mode to be used for this layer. If not specified, the encoder chooses the mode that is appropriate for the profile and level.

Name | Type | Description |
---|---|---|
Cabac | string | Context Adaptive Binary Arithmetic Coder (CABAC) entropy encoding. |
Cavlc | string | Context Adaptive Variable Length Coder (CAVLC) entropy encoding. |

ErrorAdditionalInfo

The resource management error additional info.

Name | Type | Description |
---|---|---|
info | object | The additional info. |
type | string | The additional info type. |

ErrorDetail

The error detail.

Name | Type | Description |
---|---|---|
additionalInfo | ErrorAdditionalInfo[] | The error additional info. |
code | string | The error code. |
details | ErrorDetail[] | The error details. |
message | string | The error message. |
target | string | The error target. |

ErrorResponse

Error response.

Name | Type | Description |
---|---|---|
error | ErrorDetail | The error object. |

FaceDetectorPreset

Describes all the settings to be used when analyzing a video in order to detect (and optionally redact) all the faces present.

Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft.Media.FaceDetectorPreset | The discriminator for derived types. |
blurType | BlurType | Blur type. |
experimentalOptions | object | Dictionary containing key value pairs for parameters not exposed in the preset itself. |
mode | FaceRedactorMode | This mode provides the ability to choose between the following settings: 1) Analyze - for detection only. This mode generates a metadata JSON file marking appearances of faces throughout the video. Where possible, appearances of the same person are assigned the same ID. 2) Combined - additionally redacts (blurs) detected faces. 3) Redact - enables a 2-pass process, allowing for selective redaction of a subset of detected faces. It takes in the metadata file from a prior Analyze pass, along with the source video, and a user-selected subset of IDs that require redaction. |
resolution | AnalysisResolution | Specifies the maximum resolution at which your video is analyzed. The default behavior is "SourceResolution," which will keep the input video at its original resolution when analyzed. Using "StandardDefinition" will resize input videos to standard definition while preserving the appropriate aspect ratio. It will only resize if the video is of higher resolution. For example, a 1920x1080 input would be scaled to 640x360 before processing. Switching to "StandardDefinition" will reduce the time it takes to process high resolution video. It may also reduce the cost of using this component (see https://azure.microsoft.com/en-us/pricing/details/media-services/#analytics for details). However, faces that end up being too small in the resized video may not be detected. |

FaceRedactorMode

This mode provides the ability to choose between the following settings: 1) Analyze - for detection only. This mode generates a metadata JSON file marking appearances of faces throughout the video. Where possible, appearances of the same person are assigned the same ID. 2) Combined - additionally redacts (blurs) detected faces. 3) Redact - enables a 2-pass process, allowing for selective redaction of a subset of detected faces. It takes in the metadata file from a prior Analyze pass, along with the source video, and a user-selected subset of IDs that require redaction.

Name | Type | Description |
---|---|---|
Analyze | string | Analyze mode detects faces and outputs a metadata file with the results. Allows editing of the metadata file before faces are blurred with Redact mode. |
Combined | string | Combined mode does the Analyze and Redact steps in one pass when editing the analyzed faces is not desired. |
Redact | string | Redact mode consumes the metadata file from Analyze mode and redacts the faces found. |
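For the first pass of the two-pass redaction flow described above, a FaceDetectorPreset could be sketched as follows. The variable name and field values are illustrative choices, not defaults:

```python
# Sketch of a FaceDetectorPreset for the Analyze (first) pass of the
# two-pass redaction flow; all values below are illustrative choices.
analyze_preset = {
    "@odata.type": "#Microsoft.Media.FaceDetectorPreset",
    "mode": "Analyze",                   # detection only; emits metadata JSON
    "resolution": "StandardDefinition",  # faster, but small faces may be missed
    "blurType": "Med",                   # Gaussian blur, applied when redacting
}
```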
Fade
Describes the properties of a Fade effect applied to the input media.
Name | Type | Description |
---|---|---|
duration |
string |
The Duration of the fade effect in the video. The value can be in ISO 8601 format (For example, PT05S to fade In/Out a color during 5 seconds), or a frame count (For example, 10 to fade 10 frames from the start time), or a relative value to stream duration (For example, 10% to fade 10% of stream duration) |
fadeColor |
string |
The Color for the fade In/Out. it can be on the CSS Level1 colors https://developer.mozilla.org/en-US/docs/Web/CSS/color_value/color_keywords or an RGB/hex value: e.g: rgb(255,0,0), 0xFF0000 or #FF0000 |
start |
string |
The position in the input video from where to start fade. The value can be in ISO 8601 format (For example, PT05S to start at 5 seconds), or a frame count (For example, 10 to start at the 10th frame), or a relative value to stream duration (For example, 10% to start at 10% of stream duration). Default is 0 |
Filters
Describes all the filtering operations, such as de-interlacing and rotation, that are to be applied to the input media before encoding.
Name | Type | Description |
---|---|---|
crop |
The parameters for the rectangular window with which to crop the input video. |
|
deinterlace |
The de-interlacing settings. |
|
fadeIn |
Describes the properties of a Fade effect applied to the input media. |
|
fadeOut |
Describes the properties of a Fade effect applied to the input media. |
|
overlays | Overlay[]: |
The properties of overlays to be applied to the input video. These could be audio, image or video overlays. |
rotation |
The rotation, if any, to be applied to the input video, before it is encoded. Default is Auto |
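A Filters object combining several of these operations might look like the fragment below. This is a hedged sketch: the `deinterlace` values are assumed from the Deinterlace schema of this API version, and the fragment would sit inside a StandardEncoderPreset:

```json
"filters": {
  "rotation": "Auto",
  "deinterlace": {
    "mode": "AutoPixelAdaptive",
    "parity": "Auto"
  },
  "fadeIn": {
    "duration": "PT2S",
    "fadeColor": "#000000",
    "start": "0"
  }
}
```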
H264Complexity
Tells the encoder how to choose its encoding settings. The default value is Balanced.
Name | Type | Description |
---|---|---|
Balanced |
string |
Tells the encoder to use settings that achieve a balance between speed and quality. |
Quality |
string |
Tells the encoder to use settings that are optimized to produce higher quality output at the expense of slower overall encode time. |
Speed |
string |
Tells the encoder to use settings that are optimized for faster encoding. Quality is sacrificed to decrease encoding time. |
H264Layer
Describes the settings to be used when encoding the input video into a desired output bitrate layer with the H.264 video codec.
Name | Type | Description |
---|---|---|
adaptiveBFrame |
boolean |
Whether or not adaptive B-frames are to be used when encoding this layer. If not specified, the encoder will turn it on whenever the video profile permits its use. |
bFrames |
integer |
The number of B-frames to be used when encoding this layer. If not specified, the encoder chooses an appropriate number based on the video profile and level. |
bitrate |
integer |
The average bitrate in bits per second at which to encode the input video when generating this layer. This is a required field. |
bufferWindow |
string |
The VBV buffer window length. The value should be in ISO 8601 format. The value should be in the range [0.1-100] seconds. The default is 5 seconds (for example, PT5S). |
crf |
number |
The value of CRF to be used when encoding this layer. This setting takes effect when RateControlMode of video codec is set at CRF mode. The range of CRF value is between 0 and 51, where lower values would result in better quality, at the expense of higher file sizes. Higher values mean more compression, but at some point quality degradation will be noticed. Default value is 23. |
entropyMode |
The entropy mode to be used for this layer. If not specified, the encoder chooses the mode that is appropriate for the profile and level. |
|
frameRate |
string |
The frame rate (in frames per second) at which to encode this layer. The value can be in the form of M/N where M and N are integers (For example, 30000/1001), or in the form of a number (For example, 30, or 29.97). The encoder enforces constraints on allowed frame rates based on the profile and level. If it is not specified, the encoder will use the same frame rate as the input video. |
height |
string |
The height of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example 50% means the output video has half as many pixels in height as the input. |
label |
string |
The alphanumeric label for this layer, which can be used in multiplexing different video and audio layers, or in naming the output file. |
level |
string |
We currently support Level up to 6.2. The value can be Auto, or a number that matches the H.264 profile. If not specified, the default is Auto, which lets the encoder choose the Level that is appropriate for this layer. |
maxBitrate |
integer |
The maximum bitrate (in bits per second), at which the VBV buffer should be assumed to refill. If not specified, defaults to the same value as bitrate. |
profile |
We currently support Baseline, Main, High, High422, High444. Default is Auto. |
|
referenceFrames |
integer |
The number of reference frames to be used when encoding this layer. If not specified, the encoder determines an appropriate number based on the encoder complexity setting. |
slices |
integer |
The number of slices to be used when encoding this layer. If not specified, the default is zero, which means that the encoder will use a single slice for each frame. |
width |
string |
The width of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example 50% means the output video has half as many pixels in width as the input. |
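Drawing these properties together, a single 720p H264Layer could be sketched as follows (bitrates, dimensions, and the label are illustrative assumptions, not values from the sample above):

```json
{
  "bitrate": 3000000,
  "maxBitrate": 3000000,
  "width": "1280",
  "height": "720",
  "profile": "Main",
  "level": "Auto",
  "bufferWindow": "PT5S",
  "frameRate": "30000/1001",
  "label": "HD-3000"
}
```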
H264RateControlMode
The video rate control mode
Name | Type | Description |
---|---|---|
ABR |
string |
Average Bitrate (ABR) mode that hits the target bitrate: Default mode. |
CBR |
string |
Constant Bitrate (CBR) mode that tightens bitrate variations around target bitrate. |
CRF |
string |
Constant Rate Factor (CRF) mode that targets at constant subjective quality. |
H264Video
Describes all the properties for encoding a video with the H.264 codec.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
complexity |
Tells the encoder how to choose its encoding settings. The default value is Balanced. |
|
keyFrameInterval |
string |
The distance between two key frames. The value should be non-zero in the range [0.5, 20] seconds, specified in ISO 8601 format. The default is 2 seconds (PT2S). Note that this setting is ignored if VideoSyncMode.Passthrough is set, where the KeyFrameInterval value will follow the input source setting. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
layers |
The collection of output H.264 layers to be produced by the encoder. |
|
rateControlMode |
The video rate control mode |
|
sceneChangeDetection |
boolean |
Whether or not the encoder should insert key frames at scene changes. If not specified, the default is false. This flag should be set to true only when the encoder is being configured to produce a single output video. |
stretchMode |
The resizing mode - how the input video will be resized to fit the desired output resolution(s). Default is AutoSize |
|
syncMode |
The Video Sync Mode |
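An H264Video codec with two output layers might be sketched as below. The layer values are illustrative assumptions; only the `@odata.type` discriminator pattern follows the shape used in this article's sample request:

```json
{
  "@odata.type": "#Microsoft.Media.H264Video",
  "keyFrameInterval": "PT2S",
  "complexity": "Balanced",
  "sceneChangeDetection": false,
  "layers": [
    { "bitrate": 3000000, "width": "1280", "height": "720", "label": "720p" },
    { "bitrate": 600000, "width": "640", "height": "360", "label": "360p" }
  ]
}
```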
H264VideoProfile
We currently support Baseline, Main, High, High422, High444. Default is Auto.
Name | Type | Description |
---|---|---|
Auto |
string |
Tells the encoder to automatically determine the appropriate H.264 profile. |
Baseline |
string |
Baseline profile |
High |
string |
High profile. |
High422 |
string |
High 4:2:2 profile. |
High444 |
string |
High 4:4:4 predictive profile. |
Main |
string |
Main profile |
H265Complexity
Tells the encoder how to choose its encoding settings. Quality will provide for a higher compression ratio but at a higher cost and longer compute time. Speed will produce a relatively larger file but is faster and more economical. The default value is Balanced.
Name | Type | Description |
---|---|---|
Balanced |
string |
Tells the encoder to use settings that achieve a balance between speed and quality. |
Quality |
string |
Tells the encoder to use settings that are optimized to produce higher quality output at the expense of slower overall encode time. |
Speed |
string |
Tells the encoder to use settings that are optimized for faster encoding. Quality is sacrificed to decrease encoding time. |
H265Layer
Describes the settings to be used when encoding the input video into a desired output bitrate layer with the H.265 video codec.
Name | Type | Description |
---|---|---|
adaptiveBFrame |
boolean |
Specifies whether or not adaptive B-frames are to be used when encoding this layer. If not specified, the encoder will turn it on whenever the video profile permits its use. |
bFrames |
integer |
The number of B-frames to be used when encoding this layer. If not specified, the encoder chooses an appropriate number based on the video profile and level. |
bitrate |
integer |
The average bitrate in bits per second at which to encode the input video when generating this layer. For example: a target bitrate of 3000Kbps or 3Mbps means this value should be 3000000. This is a required field. |
bufferWindow |
string |
The VBV buffer window length. The value should be in ISO 8601 format. The value should be in the range [0.1-100] seconds. The default is 5 seconds (for example, PT5S). |
crf |
number |
The value of CRF to be used when encoding this layer. This setting takes effect when RateControlMode of video codec is set at CRF mode. The range of CRF value is between 0 and 51, where lower values would result in better quality, at the expense of higher file sizes. Higher values mean more compression, but at some point quality degradation will be noticed. Default value is 28. |
frameRate |
string |
The frame rate (in frames per second) at which to encode this layer. The value can be in the form of M/N where M and N are integers (For example, 30000/1001), or in the form of a number (For example, 30, or 29.97). The encoder enforces constraints on allowed frame rates based on the profile and level. If it is not specified, the encoder will use the same frame rate as the input video. |
height |
string |
The height of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example 50% means the output video has half as many pixels in height as the input. |
label |
string |
The alphanumeric label for this layer, which can be used in multiplexing different video and audio layers, or in naming the output file. |
level |
string |
We currently support Level up to 6.2. The value can be Auto, or a number that matches the H.265 profile. If not specified, the default is Auto, which lets the encoder choose the Level that is appropriate for this layer. |
maxBitrate |
integer |
The maximum bitrate (in bits per second), at which the VBV buffer should be assumed to refill. If not specified, defaults to the same value as bitrate. |
profile |
We currently support Main. Default is Auto. |
|
referenceFrames |
integer |
The number of reference frames to be used when encoding this layer. If not specified, the encoder determines an appropriate number based on the encoder complexity setting. |
slices |
integer |
The number of slices to be used when encoding this layer. If not specified, the default is zero, which means that the encoder will use a single slice for each frame. |
width |
string |
The width of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example 50% means the output video has half as many pixels in width as the input. |
H265Video
Describes all the properties for encoding a video with the H.265 codec.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
complexity |
Tells the encoder how to choose its encoding settings. Quality will provide for a higher compression ratio but at a higher cost and longer compute time. Speed will produce a relatively larger file but is faster and more economical. The default value is Balanced. |
|
keyFrameInterval |
string |
The distance between two key frames. The value should be non-zero in the range [0.5, 20] seconds, specified in ISO 8601 format. The default is 2 seconds (PT2S). Note that this setting is ignored if VideoSyncMode.Passthrough is set, where the KeyFrameInterval value will follow the input source setting. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
layers |
The collection of output H.265 layers to be produced by the encoder. |
|
sceneChangeDetection |
boolean |
Specifies whether or not the encoder should insert key frames at scene changes. If not specified, the default is false. This flag should be set to true only when the encoder is being configured to produce a single output video. |
stretchMode |
The resizing mode - how the input video will be resized to fit the desired output resolution(s). Default is AutoSize |
|
syncMode |
The Video Sync Mode |
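Analogously to H264Video, an H265Video codec with a single layer could be sketched like this (the layer values and label are illustrative assumptions):

```json
{
  "@odata.type": "#Microsoft.Media.H265Video",
  "keyFrameInterval": "PT2S",
  "complexity": "Speed",
  "layers": [
    { "bitrate": 1800000, "width": "1280", "height": "720", "label": "720p-hevc" }
  ]
}
```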
H265VideoProfile
We currently support Main. Default is Auto.
Name | Type | Description |
---|---|---|
Auto |
string |
Tells the encoder to automatically determine the appropriate H.265 profile. |
Main |
string |
Main profile (https://x265.readthedocs.io/en/default/cli.html?highlight=profile#profile-level-tier) |
Main10 |
string |
Main 10 profile (https://en.wikipedia.org/wiki/High_Efficiency_Video_Coding#Main_10) |
Image
Describes the basic properties for generating thumbnails from the input video
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
keyFrameInterval |
string |
The distance between two key frames. The value should be non-zero in the range [0.5, 20] seconds, specified in ISO 8601 format. The default is 2 seconds (PT2S). Note that this setting is ignored if VideoSyncMode.Passthrough is set, where the KeyFrameInterval value will follow the input source setting. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
range |
string |
The position relative to transform preset start time in the input video at which to stop generating thumbnails. The value can be in ISO 8601 format (For example, PT5M30S to stop at 5 minutes and 30 seconds from start time), or a frame count (For example, 300 to stop at the 300th frame from the frame at start time. If this value is 1, it means only producing one thumbnail at start time), or a relative value to the stream duration (For example, 50% to stop at half of stream duration from start time). The default value is 100%, which means to stop at the end of the stream. |
start |
string |
The position in the input video from where to start generating thumbnails. The value can be in ISO 8601 format (For example, PT05S to start at 5 seconds), or a frame count (For example, 10 to start at the 10th frame), or a relative value to stream duration (For example, 10% to start at 10% of stream duration). Also supports a macro {Best}, which tells the encoder to select the best thumbnail from the first few seconds of the video and will only produce one thumbnail, no matter what other settings are for Step and Range. The default value is macro {Best}. |
step |
string |
The intervals at which thumbnails are generated. The value can be in ISO 8601 format (For example, PT05S for one image every 5 seconds), or a frame count (For example, 30 for one image every 30 frames), or a relative value to stream duration (For example, 10% for one image every 10% of stream duration). Note: The Step value will affect the first generated thumbnail, which may not be exactly the one specified at the transform preset start time. This is due to the encoder, which tries to select the best thumbnail between the start time and the Step position from start time as the first output. As the default value is 10%, if the stream has a long duration, the first generated thumbnail might be far away from the one specified at start time. Select a reasonable value for Step if the first thumbnail is expected to be close to the start time, or set the Range value to 1 if only one thumbnail is needed at start time. |
stretchMode |
The resizing mode - how the input video will be resized to fit the desired output resolution(s). Default is AutoSize |
|
syncMode |
The Video Sync Mode |
ImageFormat
Describes the properties for an output image file.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
filenamePattern |
string |
The file naming pattern used for the creation of output files. The following macros are supported in the file name: {Basename} - An expansion macro that will use the name of the input video file. If the base name (the file suffix is not included) of the input video file is less than 32 characters long, the base name of the input video file will be used. If the length of the base name of the input video file exceeds 32 characters, the base name is truncated to the first 32 characters in total length. {Extension} - The appropriate extension for this format. {Label} - The label assigned to the codec/layer. {Index} - A unique index for thumbnails. Only applicable to thumbnails. {AudioStream} - The string "Audio" plus the audio stream number (starting from 1). {Bitrate} - The audio/video bitrate in kbps. Not applicable to thumbnails. {Codec} - The type of the audio/video codec. {Resolution} - The video resolution. Any unsubstituted macros will be collapsed and removed from the filename. |
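For example, a concrete image format (here PngFormat, a derived type of ImageFormat) might combine macros like this; the exact pattern string is an illustrative assumption:

```json
{
  "@odata.type": "#Microsoft.Media.PngFormat",
  "filenamePattern": "Thumbnail-{Basename}-{Label}-{Index}{Extension}"
}
```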
InsightsType
Defines the type of insights that you want the service to generate. The allowed values are 'AudioInsightsOnly', 'VideoInsightsOnly', and 'AllInsights'. The default is AllInsights. If you set this to AllInsights and the input is audio only, then only audio insights are generated. Similarly, if the input is video only, then only video insights are generated. It is recommended that you not use AudioInsightsOnly if you expect some of your inputs to be video only, or VideoInsightsOnly if you expect some of your inputs to be audio only. Your Jobs in such conditions would error out.
Name | Type | Description |
---|---|---|
AllInsights |
string |
Generate both audio and video insights. Fails if either audio or video Insights fail. |
AudioInsightsOnly |
string |
Generate audio only insights. Ignore video even if present. Fails if no audio is present. |
VideoInsightsOnly |
string |
Generate video only insights. Ignore audio if present. Fails if no video is present. |
InterleaveOutput
Sets the interleave mode of the output to control how audio and video are stored in the container format. Example: set InterleaveOutput as NonInterleavedOutput to produce audio-only and video-only outputs in separate MP4 files.
Name | Type | Description |
---|---|---|
InterleavedOutput |
string |
The output includes both audio and video. |
NonInterleavedOutput |
string |
The output is video-only or audio-only. |
JpgFormat
Describes the settings for producing JPEG thumbnails.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
filenamePattern |
string |
The file naming pattern used for the creation of output files. The following macros are supported in the file name: {Basename} - An expansion macro that will use the name of the input video file. If the base name (the file suffix is not included) of the input video file is less than 32 characters long, the base name of the input video file will be used. If the length of the base name of the input video file exceeds 32 characters, the base name is truncated to the first 32 characters in total length. {Extension} - The appropriate extension for this format. {Label} - The label assigned to the codec/layer. {Index} - A unique index for thumbnails. Only applicable to thumbnails. {AudioStream} - The string "Audio" plus the audio stream number (starting from 1). {Bitrate} - The audio/video bitrate in kbps. Not applicable to thumbnails. {Codec} - The type of the audio/video codec. {Resolution} - The video resolution. Any unsubstituted macros will be collapsed and removed from the filename. |
JpgImage
Describes the properties for producing a series of JPEG images from the input video.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
keyFrameInterval |
string |
The distance between two key frames. The value should be non-zero in the range [0.5, 20] seconds, specified in ISO 8601 format. The default is 2 seconds (PT2S). Note that this setting is ignored if VideoSyncMode.Passthrough is set, where the KeyFrameInterval value will follow the input source setting. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
layers |
Jpg |
A collection of output JPEG image layers to be produced by the encoder. |
range |
string |
The position relative to transform preset start time in the input video at which to stop generating thumbnails. The value can be in ISO 8601 format (For example, PT5M30S to stop at 5 minutes and 30 seconds from start time), or a frame count (For example, 300 to stop at the 300th frame from the frame at start time. If this value is 1, it means only producing one thumbnail at start time), or a relative value to the stream duration (For example, 50% to stop at half of stream duration from start time). The default value is 100%, which means to stop at the end of the stream. |
spriteColumn |
integer |
Sets the number of columns used in the thumbnail sprite image. The number of rows is automatically calculated and a VTT file is generated with the coordinate mappings for each thumbnail in the sprite. Note: this value should be a positive integer and a proper value is recommended so that the output image resolution will not go beyond the JPEG maximum pixel resolution limit of 65535x65535. |
start |
string |
The position in the input video from where to start generating thumbnails. The value can be in ISO 8601 format (For example, PT05S to start at 5 seconds), or a frame count (For example, 10 to start at the 10th frame), or a relative value to stream duration (For example, 10% to start at 10% of stream duration). Also supports a macro {Best}, which tells the encoder to select the best thumbnail from the first few seconds of the video and will only produce one thumbnail, no matter what other settings are for Step and Range. The default value is macro {Best}. |
step |
string |
The intervals at which thumbnails are generated. The value can be in ISO 8601 format (For example, PT05S for one image every 5 seconds), or a frame count (For example, 30 for one image every 30 frames), or a relative value to stream duration (For example, 10% for one image every 10% of stream duration). Note: The Step value will affect the first generated thumbnail, which may not be exactly the one specified at the transform preset start time. This is due to the encoder, which tries to select the best thumbnail between the start time and the Step position from start time as the first output. As the default value is 10%, if the stream has a long duration, the first generated thumbnail might be far away from the one specified at start time. Select a reasonable value for Step if the first thumbnail is expected to be close to the start time, or set the Range value to 1 if only one thumbnail is needed at start time. |
stretchMode |
The resizing mode - how the input video will be resized to fit the desired output resolution(s). Default is AutoSize |
|
syncMode |
The Video Sync Mode |
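Combining the start/step/range properties with a layer, a JpgImage codec that produces one 640x360 thumbnail every 10 seconds for the first minute might be sketched as follows (values are illustrative assumptions):

```json
{
  "@odata.type": "#Microsoft.Media.JpgImage",
  "start": "PT5S",
  "step": "PT10S",
  "range": "PT1M",
  "layers": [
    { "width": "640", "height": "360", "quality": 70, "label": "thumb" }
  ]
}
```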
JpgLayer
Describes the settings to produce a JPEG image from the input video.
Name | Type | Description |
---|---|---|
height |
string |
The height of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example 50% means the output video has half as many pixels in height as the input. |
label |
string |
The alphanumeric label for this layer, which can be used in multiplexing different video and audio layers, or in naming the output file. |
quality |
integer |
The compression quality of the JPEG output. Range is from 0-100 and the default is 70. |
width |
string |
The width of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example 50% means the output video has half as many pixels in width as the input. |
Mp4Format
Describes the properties for an output ISO MP4 file.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
filenamePattern |
string |
The file naming pattern used for the creation of output files. The following macros are supported in the file name: {Basename} - An expansion macro that will use the name of the input video file. If the base name (the file suffix is not included) of the input video file is less than 32 characters long, the base name of the input video file will be used. If the length of the base name of the input video file exceeds 32 characters, the base name is truncated to the first 32 characters in total length. {Extension} - The appropriate extension for this format. {Label} - The label assigned to the codec/layer. {Index} - A unique index for thumbnails. Only applicable to thumbnails. {AudioStream} - The string "Audio" plus the audio stream number (starting from 1). {Bitrate} - The audio/video bitrate in kbps. Not applicable to thumbnails. {Codec} - The type of the audio/video codec. {Resolution} - The video resolution. Any unsubstituted macros will be collapsed and removed from the filename. |
outputFiles |
The list of output files to produce. Each entry in the list is a set of audio and video layer labels to be muxed together. |
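An Mp4Format that muxes specific layers could be sketched like this; the `720p`, `360p`, and `audio` labels are illustrative assumptions that would have to match the labels defined on the corresponding codec layers:

```json
{
  "@odata.type": "#Microsoft.Media.Mp4Format",
  "filenamePattern": "{Basename}-{Label}-{Bitrate}{Extension}",
  "outputFiles": [
    { "labels": [ "720p", "audio" ] },
    { "labels": [ "360p", "audio" ] }
  ]
}
```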
MultiBitrateFormat
Describes the properties for producing a collection of GOP aligned multi-bitrate files. The default behavior is to produce one output file for each video layer which is muxed together with all the audios. The exact output files produced can be controlled by specifying the outputFiles collection.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
filenamePattern |
string |
The file naming pattern used for the creation of output files. The following macros are supported in the file name: {Basename} - An expansion macro that will use the name of the input video file. If the base name (the file suffix is not included) of the input video file is less than 32 characters long, the base name of the input video file will be used. If the length of the base name of the input video file exceeds 32 characters, the base name is truncated to the first 32 characters in total length. {Extension} - The appropriate extension for this format. {Label} - The label assigned to the codec/layer. {Index} - A unique index for thumbnails. Only applicable to thumbnails. {AudioStream} - The string "Audio" plus the audio stream number (starting from 1). {Bitrate} - The audio/video bitrate in kbps. Not applicable to thumbnails. {Codec} - The type of the audio/video codec. {Resolution} - The video resolution. Any unsubstituted macros will be collapsed and removed from the filename. |
outputFiles |
The list of output files to produce. Each entry in the list is a set of audio and video layer labels to be muxed together. |
OnErrorType
A Transform can define more than one output. This property defines what the service should do when one output fails: either continue to produce other outputs, or stop the other outputs. The overall Job state will not reflect failures of outputs that are specified with 'ContinueJob'. The default is 'StopProcessingJob'.
Name | Type | Description |
---|---|---|
ContinueJob |
string |
Tells the service that if this TransformOutput fails, then allow any other TransformOutput to continue. |
StopProcessingJob |
string |
Tells the service that if this TransformOutput fails, then any other incomplete TransformOutputs can be stopped. |
OutputFile
Represents an output file produced.
Name | Type | Description |
---|---|---|
labels |
string[] |
The list of labels that describe how the encoder should multiplex video and audio into an output file. For example, if the encoder is producing two video layers with labels v1 and v2, and one audio layer with label a1, then an array like '[v1, a1]' tells the encoder to produce an output file with the video track represented by v1 and the audio track represented by a1. |
PngFormat
Describes the settings for producing PNG thumbnails.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
filenamePattern |
string |
The file naming pattern used for the creation of output files. The following macros are supported in the file name: {Basename} - An expansion macro that will use the name of the input video file. If the base name (the file suffix is not included) of the input video file is less than 32 characters long, the base name of the input video file will be used. If the length of the base name of the input video file exceeds 32 characters, the base name is truncated to the first 32 characters in total length. {Extension} - The appropriate extension for this format. {Label} - The label assigned to the codec/layer. {Index} - A unique index for thumbnails. Only applicable to thumbnails. {AudioStream} - The string "Audio" plus the audio stream number (starting from 1). {Bitrate} - The audio/video bitrate in kbps. Not applicable to thumbnails. {Codec} - The type of the audio/video codec. {Resolution} - The video resolution. Any unsubstituted macros will be collapsed and removed from the filename. |
PngImage
Describes the properties for producing a series of PNG images from the input video.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
keyFrameInterval |
string |
The distance between two key frames. The value should be non-zero in the range [0.5, 20] seconds, specified in ISO 8601 format. The default is 2 seconds (PT2S). Note that this setting is ignored if VideoSyncMode.Passthrough is set, where the KeyFrameInterval value will follow the input source setting. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
layers |
Png |
A collection of output PNG image layers to be produced by the encoder. |
range |
string |
The position relative to transform preset start time in the input video at which to stop generating thumbnails. The value can be in ISO 8601 format (For example, PT5M30S to stop at 5 minutes and 30 seconds from start time), or a frame count (For example, 300 to stop at the 300th frame from the frame at start time. If this value is 1, it means only producing one thumbnail at start time), or a relative value to the stream duration (For example, 50% to stop at half of stream duration from start time). The default value is 100%, which means to stop at the end of the stream. |
start |
string |
The position in the input video from where to start generating thumbnails. The value can be in ISO 8601 format (For example, PT05S to start at 5 seconds), or a frame count (For example, 10 to start at the 10th frame), or a relative value to stream duration (For example, 10% to start at 10% of stream duration). Also supports a macro {Best}, which tells the encoder to select the best thumbnail from the first few seconds of the video and will only produce one thumbnail, no matter what other settings are for Step and Range. The default value is macro {Best}. |
step |
string |
The intervals at which thumbnails are generated. The value can be in ISO 8601 format (For example, PT05S for one image every 5 seconds), or a frame count (For example, 30 for one image every 30 frames), or a relative value to stream duration (For example, 10% for one image every 10% of stream duration). Note: The Step value will affect the first generated thumbnail, which may not be exactly the one specified at the transform preset start time. This is due to the encoder, which tries to select the best thumbnail between the start time and the Step position from start time as the first output. As the default value is 10%, if the stream has a long duration, the first generated thumbnail might be far away from the one specified at start time. Select a reasonable value for Step if the first thumbnail is expected to be close to the start time, or set the Range value to 1 if only one thumbnail is needed at start time. |
stretchMode |
The resizing mode - how the input video will be resized to fit the desired output resolution(s). Default is AutoSize |
|
syncMode |
The Video Sync Mode |
PngLayer
Describes the settings to produce a PNG image from the input video.
Name | Type | Description |
---|---|---|
height |
string |
The height of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example 50% means the output video has half as many pixels in height as the input. |
label |
string |
The alphanumeric label for this layer, which can be used in multiplexing different video and audio layers, or in naming the output file. |
width |
string |
The width of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example 50% means the output video has half as many pixels in width as the input. |
PresetConfigurations
An object of optional configuration settings for encoder.
Name | Type | Description |
---|---|---|
complexity |
Allows you to configure the encoder settings to control the balance between speed and quality. Example: set Complexity as Speed for faster encoding but less compression efficiency. |
|
interleaveOutput |
Sets the interleave mode of the output to control how audio and video are stored in the container format. Example: set InterleavedOutput as NonInterleavedOutput to produce audio-only and video-only outputs in separate MP4 files. |
|
keyFrameIntervalInSeconds |
number |
The key frame interval in seconds. Example: set KeyFrameIntervalInSeconds as 2 to reduce the playback buffering for some players. |
maxBitrateBps |
integer |
The maximum bitrate in bits per second (threshold for the top video layer). Example: set MaxBitrateBps as 6000000 to avoid producing very high bitrate outputs for contents with high complexity. |
maxHeight |
integer |
The maximum height of output video layers. Example: set MaxHeight as 720 to produce output layers up to 720P even if the input is 4K. |
maxLayers |
integer |
The maximum number of output video layers. Example: set MaxLayers as 4 to make sure at most 4 output layers are produced to control the overall cost of the encoding job. |
minBitrateBps |
integer |
The minimum bitrate in bits per second (threshold for the bottom video layer). Example: set MinBitrateBps as 200000 to have a bottom layer that covers users with low network bandwidth. |
minHeight |
integer |
The minimum height of output video layers. Example: set MinHeight as 360 to avoid output layers of smaller resolutions like 180P. |
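To illustrate, these settings can be attached to a built-in encoder preset in a Transform output. This is a sketch, not a definitive request body: the `#Microsoft.Media.BuiltInStandardEncoderPreset` discriminator is taken from the sample request at the top of this article, while the `configurations` property name and the `ContentAwareEncoding` preset name are assumptions to be verified against your API version.

```json
{
  "preset": {
    "@odata.type": "#Microsoft.Media.BuiltInStandardEncoderPreset",
    "presetName": "ContentAwareEncoding",
    "configurations": {
      "complexity": "Speed",
      "keyFrameIntervalInSeconds": 2,
      "maxBitrateBps": 6000000,
      "minBitrateBps": 200000,
      "maxHeight": 720,
      "minHeight": 360,
      "maxLayers": 4
    }
  }
}
```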
Priority
Sets the relative priority of the TransformOutputs within a Transform. This sets the priority that the service uses for processing TransformOutputs. The default priority is Normal.
Name | Type | Description |
---|---|---|
High |
string |
Used for TransformOutputs that should take precedence over others. |
Low |
string |
Used for TransformOutputs that can be generated after Normal and High priority TransformOutputs. |
Normal |
string |
Used for TransformOutputs that can be generated at Normal priority. |
Rectangle
Describes the properties of a rectangular window applied to the input media before processing it.
Name | Type | Description |
---|---|---|
height |
string |
The height of the rectangular region in pixels. This can be an absolute pixel value (for example, 100) or relative to the size of the video (for example, 50%). |
left |
string |
The number of pixels from the left margin. This can be an absolute pixel value (for example, 100) or relative to the size of the video (for example, 50%). |
top |
string |
The number of pixels from the top margin. This can be an absolute pixel value (for example, 100) or relative to the size of the video (for example, 50%). |
width |
string |
The width of the rectangular region in pixels. This can be an absolute pixel value (for example, 100) or relative to the size of the video (for example, 50%). |
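For example, a crop window covering the central region of the input could be expressed as follows. The `cropRectangle` property name comes from the VideoOverlay definition in this article; the values are illustrative and mix the relative and absolute forms described above.

```json
{
  "cropRectangle": {
    "left": "25%",
    "top": "25%",
    "width": "50%",
    "height": "50%"
  }
}
```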
Rotation
The rotation, if any, to be applied to the input video before it is encoded. The default is Auto.
Name | Type | Description |
---|---|---|
Auto |
string |
Automatically detect and rotate as needed. |
None |
string |
Do not rotate the video. If the output format supports it, any metadata about rotation is kept intact. |
Rotate0 |
string |
Do not rotate the video but remove any metadata about the rotation. |
Rotate180 |
string |
Rotate 180 degrees clockwise. |
Rotate270 |
string |
Rotate 270 degrees clockwise. |
Rotate90 |
string |
Rotate 90 degrees clockwise. |
StandardEncoderPreset
Describes all the settings to be used when encoding the input video with the Standard Encoder.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
codecs | Codec[]: |
The list of codecs to be used when encoding the input video. |
experimentalOptions |
object |
Dictionary containing key value pairs for parameters not exposed in the preset itself. |
filters |
One or more filtering operations that are applied to the input media before encoding. |
|
formats | Format[]: |
The list of outputs to be produced by the encoder. |
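As a sketch, a minimal StandardEncoderPreset in a Transform request body might look like the following. The full `#Microsoft.Media.StandardEncoderPreset` value and the H264Video and Mp4Format codec/format discriminators are assumptions (the table above truncates discriminator values); check them against the Codec and Format definitions for your API version.

```json
{
  "preset": {
    "@odata.type": "#Microsoft.Media.StandardEncoderPreset",
    "codecs": [
      {
        "@odata.type": "#Microsoft.Media.H264Video",
        "keyFrameInterval": "PT2S",
        "stretchMode": "AutoSize"
      }
    ],
    "formats": [
      {
        "@odata.type": "#Microsoft.Media.Mp4Format",
        "filenamePattern": "{Basename}_{Bitrate}{Extension}"
      }
    ]
  }
}
```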
StretchMode
The resizing mode - how the input video will be resized to fit the desired output resolution(s). The default is AutoSize.
Name | Type | Description |
---|---|---|
AutoFit |
string |
Pad the output (with either letterbox or pillar box) to honor the output resolution, while ensuring that the active video region in the output has the same aspect ratio as the input. For example, if the input is 1920x1080 and the encoding preset asks for 1280x1280, then the output will be at 1280x1280, which contains an inner rectangle of 1280x720 at an aspect ratio of 16:9, with letterbox regions 280 pixels tall at the top and bottom. |
AutoSize |
string |
Override the output resolution, and change it to match the display aspect ratio of the input, without padding. For example, if the input is 1920x1080 and the encoding preset asks for 1280x1280, then the value in the preset is overridden, and the output will be at 1280x720, which maintains the input aspect ratio of 16:9. |
None |
string |
Strictly respect the output resolution without considering the pixel aspect ratio or display aspect ratio of the input video. |
systemData
Metadata pertaining to creation and last modification of the resource.
Name | Type | Description |
---|---|---|
createdAt |
string |
The timestamp of resource creation (UTC). |
createdBy |
string |
The identity that created the resource. |
createdByType |
The type of identity that created the resource. |
|
lastModifiedAt |
string |
The timestamp of resource last modification (UTC). |
lastModifiedBy |
string |
The identity that last modified the resource. |
lastModifiedByType |
The type of identity that last modified the resource. |
Transform
A Transform encapsulates the rules or instructions for generating desired outputs from input media, such as by transcoding or by extracting insights. After the Transform is created, it can be applied to input media by creating Jobs.
Name | Type | Description |
---|---|---|
id |
string |
Fully qualified resource ID for the resource. Ex - /subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/{resourceProviderNamespace}/{resourceType}/{resourceName} |
name |
string |
The name of the resource. |
properties.created |
string |
The UTC date and time when the Transform was created, in 'YYYY-MM-DDThh:mm:ssZ' format. |
properties.description |
string |
An optional verbose description of the Transform. |
properties.lastModified |
string |
The UTC date and time when the Transform was last updated, in 'YYYY-MM-DDThh:mm:ssZ' format. |
properties.outputs |
An array of one or more TransformOutputs that the Transform should generate. |
|
systemData |
The system metadata relating to this resource. |
|
type |
string |
The type of the resource. E.g. "Microsoft.Compute/virtualMachines" or "Microsoft.Storage/storageAccounts" |
TransformOutput
Describes the properties of a TransformOutput, which are the rules to be applied while generating the desired output.
Name | Type | Description |
---|---|---|
onError |
A Transform can define more than one output. This property defines what the service should do when one output fails: either continue to produce other outputs, or stop the other outputs. The overall Job state will not reflect failures of outputs that are specified with 'ContinueJob'. The default is 'StopProcessingJob'. |
|
preset | Preset: |
Preset that describes the operations that will be used to modify, transcode, or extract insights from the source file to generate the output. |
relativePriority |
Sets the relative priority of the TransformOutputs within a Transform. This sets the priority that the service uses for processing TransformOutputs. The default priority is Normal. |
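Putting these properties together, a Transform output that continues the Job on failure and runs at high priority could be written as follows. The `onError` and `relativePriority` values come from the tables above, and the preset is copied from the sample request at the top of this article.

```json
{
  "properties": {
    "outputs": [
      {
        "onError": "ContinueJob",
        "relativePriority": "High",
        "preset": {
          "@odata.type": "#Microsoft.Media.BuiltInStandardEncoderPreset",
          "presetName": "AdaptiveStreaming"
        }
      }
    ]
  }
}
```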
TransportStreamFormat
Describes the properties for generating an MPEG-2 Transport Stream (ISO/IEC 13818-1) output video file(s).
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
filenamePattern |
string |
The file naming pattern used for the creation of output files. The following macros are supported in the file name: {Basename} - an expansion macro that uses the name of the input video file. If the base name of the input video file (the file suffix is not included) is less than 32 characters long, it is used as-is; if it exceeds 32 characters, it is truncated to the first 32 characters. {Extension} - the appropriate extension for this format. {Label} - the label assigned to the codec/layer. {Index} - a unique index for thumbnails; only applicable to thumbnails. {AudioStream} - the string "Audio" plus the audio stream number (starting from 1). {Bitrate} - the audio/video bitrate in kbps; not applicable to thumbnails. {Codec} - the type of the audio/video codec. {Resolution} - the video resolution. Any unsubstituted macros will be collapsed and removed from the filename. |
outputFiles |
The list of output files to produce. Each entry in the list is a set of audio and video layer labels to be muxed together. |
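A sketch of this format in use follows. The full `#Microsoft.Media.TransportStreamFormat` discriminator and the `labels` shape of the `outputFiles` entries are assumptions, since the table above truncates the discriminator and does not expand the output-file type; the layer labels themselves are hypothetical.

```json
{
  "@odata.type": "#Microsoft.Media.TransportStreamFormat",
  "filenamePattern": "{Basename}_{Label}_{Bitrate}{Extension}",
  "outputFiles": [
    {
      "labels": [ "video", "audio" ]
    }
  ]
}
```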
Video
Describes the basic properties for encoding the input video.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
keyFrameInterval |
string |
The distance between two key frames. The value should be non-zero in the range [0.5, 20] seconds, specified in ISO 8601 format. The default is 2 seconds (PT2S). Note that this setting is ignored if VideoSyncMode.Passthrough is set, where the KeyFrameInterval value will follow the input source setting. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
stretchMode |
The resizing mode - how the input video will be resized to fit the desired output resolution(s). The default is AutoSize. |
|
syncMode |
The Video Sync Mode |
VideoAnalyzerPreset
A video analyzer preset that extracts insights (rich metadata) from both audio and video, and outputs a JSON format file.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
audioLanguage |
string |
The language for the audio payload in the input using the BCP-47 format of 'language tag-region' (e.g: 'en-US'). If you know the language of your content, it is recommended that you specify it. The language must be specified explicitly for AudioAnalysisMode::Basic, since automatic language detection is not included in basic mode. If the language isn't specified or set to null, automatic language detection will choose the first language detected and process with the selected language for the duration of the file. It does not currently support dynamically switching between languages after the first language is detected. The automatic detection works best with audio recordings with clearly discernible speech. If automatic detection fails to find the language, transcription falls back to 'en-US'. The list of supported languages is available here: https://go.microsoft.com/fwlink/?linkid=2109463 |
experimentalOptions |
object |
Dictionary containing key value pairs for parameters not exposed in the preset itself. |
insightsToExtract |
Defines the type of insights that you want the service to generate. The allowed values are 'AudioInsightsOnly', 'VideoInsightsOnly', and 'AllInsights'. The default is AllInsights. If you set this to AllInsights and the input is audio only, then only audio insights are generated; similarly, if the input is video only, then only video insights are generated. It is recommended that you not use AudioInsightsOnly if you expect some of your inputs to be video only, or VideoInsightsOnly if you expect some of your inputs to be audio only; Jobs in such conditions would error out. |
|
mode |
Determines the set of audio analysis operations to be performed. If unspecified, the Standard AudioAnalysisMode would be chosen. |
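For instance, a Transform output that extracts all insights from English-language content might use a preset like the following. The property values come from the table above; the full discriminator value is an assumption, since the table truncates it.

```json
{
  "preset": {
    "@odata.type": "#Microsoft.Media.VideoAnalyzerPreset",
    "audioLanguage": "en-US",
    "insightsToExtract": "AllInsights",
    "mode": "Standard"
  }
}
```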
VideoOverlay
Describes the properties of a video overlay.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
audioGainLevel |
number |
The gain level of audio in the overlay. The value should be in the range [0, 1.0]. The default is 1.0. |
cropRectangle |
An optional rectangular window used to crop the overlay image or video. |
|
end |
string |
The end position, with reference to the input video, at which the overlay ends. The value should be in ISO 8601 format. For example, PT30S to end the overlay at 30 seconds into the input video. If not specified, or if the value is greater than the input video duration, the overlay is applied until the end of the input video when the overlay media duration is greater than the input video duration; otherwise, the overlay lasts as long as the overlay media duration. |
fadeInDuration |
string |
The duration over which the overlay fades in onto the input video. The value should be in ISO 8601 duration format. If not specified, the default behavior is to have no fade in (same as PT0S). |
fadeOutDuration |
string |
The duration over which the overlay fades out of the input video. The value should be in ISO 8601 duration format. If not specified, the default behavior is to have no fade out (same as PT0S). |
inputLabel |
string |
The label of the job input which is to be used as an overlay. The Input must specify exactly one file. You can specify an image file in JPG, PNG, GIF or BMP format, or an audio file (such as a WAV, MP3, WMA or M4A file), or a video file. See https://aka.ms/mesformats for the complete list of supported audio and video file formats. |
opacity |
number |
The opacity of the overlay. This is a value in the range [0 - 1.0]. The default is 1.0, which means the overlay is opaque. |
position |
The location in the input video where the overlay is applied. |
|
start |
string |
The start position, with reference to the input video, at which the overlay starts. The value should be in ISO 8601 format. For example, PT05S to start the overlay at 5 seconds into the input video. If not specified, the overlay starts from the beginning of the input video. |
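A sketch of a video overlay combining these properties follows. The full discriminator value is an assumption (the table truncates it), the `logo` input label is hypothetical, and `position` is assumed to take the Rectangle shape described earlier in this article.

```json
{
  "@odata.type": "#Microsoft.Media.VideoOverlay",
  "inputLabel": "logo",
  "start": "PT5S",
  "end": "PT30S",
  "fadeInDuration": "PT1S",
  "fadeOutDuration": "PT1S",
  "opacity": 0.8,
  "audioGainLevel": 0.5,
  "position": {
    "left": "70%",
    "top": "5%",
    "width": "25%",
    "height": "25%"
  }
}
```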
VideoSyncMode
The Video Sync Mode
Name | Type | Description |
---|---|---|
Auto |
string |
This is the default method. Chooses between Cfr and Vfr depending on muxer capabilities. For output format MP4, the default mode is Cfr. |
Cfr |
string |
Input frames will be repeated and/or dropped as needed to achieve exactly the requested constant frame rate. Recommended when the output frame rate is explicitly set at a specified value. |
Passthrough |
string |
The presentation timestamps on frames are passed through from the input file to the output file writer. Recommended when the input source has a variable frame rate, and you are attempting to produce multiple layers for adaptive streaming in the output that have aligned GOP boundaries. Note: if two or more frames in the input have duplicate timestamps, then the output will also have the same behavior. |
Vfr |
string |
Similar to the Passthrough mode, but if the input has frames that have duplicate timestamps, then only one frame is passed through to the output and the others are dropped. Recommended when the number of output frames is expected to be equal to the number of input frames, for example when the output is used to calculate a quality metric like PSNR against the input. |