Microsoft.MachineLearningServices workspaces/onlineEndpoints/deployments 2021-03-01-preview

Bicep resource definition

The workspaces/onlineEndpoints/deployments resource type can be deployed with operations that target:

  • Resource groups

For a list of changed properties in each API version, see change log.

Resource format

To create a Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments resource, add the following Bicep to your template.

resource symbolicname 'Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments@2021-03-01-preview' = {
  name: 'string'
  location: 'string'
  tags: {
    tagName1: 'tagValue1'
    tagName2: 'tagValue2'
  }
  kind: 'string'
  parent: resourceSymbolicName
  identity: {
    type: 'string'
    userAssignedIdentities: {
      {customized property}: {
        clientId: 'string'
        principalId: 'string'
      }
    }
  }
  properties: {
    appInsightsEnabled: bool
    codeConfiguration: {
      codeId: 'string'
      scoringScript: 'string'
    }
    description: 'string'
    environmentId: 'string'
    environmentVariables: {
      {customized property}: 'string'
    }
    livenessProbe: {
      failureThreshold: int
      initialDelay: 'string'
      period: 'string'
      successThreshold: int
      timeout: 'string'
    }
    model: {
      referenceType: 'string'
      // For remaining properties, see AssetReferenceBase objects
    }
    properties: {
      {customized property}: 'string'
    }
    requestSettings: {
      maxConcurrentRequestsPerInstance: int
      maxQueueWait: 'string'
      requestTimeout: 'string'
    }
    scaleSettings: {
      maxInstances: int
      minInstances: int
      scaleType: 'string'
      // For remaining properties, see OnlineScaleSettings objects
    }
    endpointComputeType: 'string'
    // For remaining properties, see OnlineDeployment objects
  }
}
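
As an illustration of the format above, the following is a minimal sketch of a Managed deployment, not a definitive template. The parameter names, the deployment name 'blue', and the instance type are assumptions made for the example; the workspace, endpoint, model, and environment are assumed to exist already.

```bicep
// Hypothetical parameters; supply values from your own workspace.
param workspaceName string
param endpointName string
param location string = resourceGroup().location

@description('ARM resource ID of an existing model asset (assumed).')
param modelId string

@description('ARM resource ID of an existing environment specification (assumed).')
param environmentId string

// Reference the existing parent chain so the child can use the parent property.
resource workspace 'Microsoft.MachineLearningServices/workspaces@2021-03-01-preview' existing = {
  name: workspaceName
}

resource endpoint 'Microsoft.MachineLearningServices/workspaces/onlineEndpoints@2021-03-01-preview' existing = {
  parent: workspace
  name: endpointName
}

resource deployment 'Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments@2021-03-01-preview' = {
  parent: endpoint
  name: 'blue' // hypothetical deployment name
  location: location
  properties: {
    endpointComputeType: 'Managed'
    instanceType: 'Standard_F2s_v2' // assumed SKU; check what your region supports
    environmentId: environmentId
    model: {
      referenceType: 'Id'
      assetId: modelId
    }
    scaleSettings: {
      scaleType: 'Manual'
      instanceCount: 1
    }
  }
}
```

Because the deployment is declared outside its parent, the sketch sets parent rather than encoding the workspace and endpoint names into name.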

OnlineDeployment objects

Set the endpointComputeType property to specify the type of object.

For K8S, use:

  endpointComputeType: 'K8S'
  containerResourceRequirements: {
    cpu: int
    cpuLimit: int
    fpga: int
    gpu: int
    memoryInGB: int
    memoryInGBLimit: int
  }

For Managed, use:

  endpointComputeType: 'Managed'
  instanceType: 'string'
  readinessProbe: {
    failureThreshold: int
    initialDelay: 'string'
    period: 'string'
    successThreshold: int
    timeout: 'string'
  }
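
The two variants above could be selected at deployment time. The sketch below is one possible approach under assumed values: the parameter name, the resource sizes, and the instance type are illustrative, not recommendations.

```bicep
// Hypothetical switch between the two compute types.
@allowed([
  'K8S'
  'Managed'
])
param computeType string = 'Managed'

// K8S-specific properties (illustrative sizes).
var k8sProperties = {
  endpointComputeType: 'K8S'
  containerResourceRequirements: {
    cpu: 1
    cpuLimit: 2
    memoryInGB: 2
    memoryInGBLimit: 4
  }
}

// Managed-specific properties (assumed SKU).
var managedProperties = {
  endpointComputeType: 'Managed'
  instanceType: 'Standard_F2s_v2'
}

// Pick the variant that matches the discriminator value.
var computeProperties = computeType == 'K8S' ? k8sProperties : managedProperties

output computeProperties object = computeProperties
```

The selected object can then be merged with the shared properties, for example with the union() function, before being assigned to the resource's properties.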

AssetReferenceBase objects

Set the referenceType property to specify the type of object.

For DataPath, use:

  referenceType: 'DataPath'
  datastoreId: 'string'
  path: 'string'

For Id, use:

  referenceType: 'Id'
  assetId: 'string'

For OutputPath, use:

  referenceType: 'OutputPath'
  jobId: 'string'
  path: 'string'
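
For comparison, the three reference types might be written as follows. This is a sketch only: the parameter names and the paths are hypothetical placeholders.

```bicep
// Hypothetical ARM resource IDs passed in by the caller.
param modelAssetId string
param datastoreId string
param jobId string

// Reference a registered asset directly by its ARM resource ID.
var modelById = {
  referenceType: 'Id'
  assetId: modelAssetId
}

// Reference a file or directory inside a datastore.
var modelByDataPath = {
  referenceType: 'DataPath'
  datastoreId: datastoreId
  path: 'models/model.pkl' // hypothetical path
}

// Reference a file or directory in a job's output.
var modelByOutputPath = {
  referenceType: 'OutputPath'
  jobId: jobId
  path: 'outputs/model' // hypothetical path
}

output modelReferences array = [
  modelById
  modelByDataPath
  modelByOutputPath
]
```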

OnlineScaleSettings objects

Set the scaleType property to specify the type of object.

For Auto, use:

  scaleType: 'Auto'
  pollingInterval: 'string'
  targetUtilizationPercentage: int

For Manual, use:

  scaleType: 'Manual'
  instanceCount: int
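
A sketch of both scaling variants with illustrative numbers follows. The instance counts, the polling interval, and the utilization target are assumptions chosen for the example.

```bicep
// Autoscaling between 1 and 5 instances, polled every 30 seconds.
var autoScaleSettings = {
  scaleType: 'Auto'
  minInstances: 1
  maxInstances: 5
  pollingInterval: 'PT30S' // ISO 8601 duration
  targetUtilizationPercentage: 70
}

// A fixed number of instances.
var manualScaleSettings = {
  scaleType: 'Manual'
  instanceCount: 2
}

output scaleSettingsExamples array = [
  autoScaleSettings
  manualScaleSettings
]
```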

Property values

workspaces/onlineEndpoints/deployments

Name Description Value
name The resource name. See how to set names and types for child resources in Bicep. string (required)
location The geo-location where the resource lives string (required)
tags Resource tags. Dictionary of tag names and values. See Tags in templates
kind Metadata used by portal/tooling/etc to render different UX experiences for resources of the same type. string
parent In Bicep, you can specify the parent resource for a child resource. You only need to add this property when the child resource is declared outside of the parent resource. For more information, see Child resource outside parent resource. Symbolic name for resource of type: onlineEndpoints
identity Service identity associated with a resource. ResourceIdentity
properties [Required] Additional attributes of the entity. OnlineDeployment (required)

ResourceIdentity

Name Description Value
type Defines values for a ResourceIdentity's type. 'None' | 'SystemAssigned' | 'SystemAssigned,UserAssigned' | 'UserAssigned'
userAssignedIdentities Dictionary of the user assigned identities, key is ARM resource ID of the UAI. ResourceIdentityUserAssignedIdentities

ResourceIdentityUserAssignedIdentities

Name Description Value
{customized property} UserAssignedIdentityMeta

UserAssignedIdentityMeta

Name Description Value
clientId Aka application ID, a unique identifier generated by Azure AD that is tied to an application and service principal during its initial provisioning. string
principalId The object ID of the service principal object for your managed identity that is used to grant role-based access to an Azure resource. string
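
Since the dictionary key is the ARM resource ID of the user-assigned identity, an identity block might be built as in the sketch below. The parameter name is hypothetical, and leaving the value object empty is an assumption that clientId and principalId do not need to be supplied by the template author.

```bicep
// Hypothetical name of an existing user-assigned managed identity.
param identityName string

resource uai 'Microsoft.ManagedIdentity/userAssignedIdentities@2018-11-30' existing = {
  name: identityName
}

// The key is the identity's ARM resource ID; the value is left empty (assumption).
var deploymentIdentity = {
  type: 'UserAssigned'
  userAssignedIdentities: {
    '${uai.id}': {}
  }
}

output deploymentIdentity object = deploymentIdentity
```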

OnlineDeployment

Name Description Value
appInsightsEnabled If true, enables Application Insights logging. bool
codeConfiguration Code configuration for the endpoint deployment. CodeConfiguration
description Description of the endpoint deployment. string
environmentId ARM resource ID of the environment specification for the endpoint deployment. string
environmentVariables Environment variables configuration for the deployment. OnlineDeploymentEnvironmentVariables
livenessProbe Deployment container liveness/readiness probe configuration. ProbeSettings
model Reference to the model asset for the endpoint deployment. AssetReferenceBase
properties Property dictionary. Properties can be added, but not removed or altered. OnlineDeploymentProperties
requestSettings Online deployment scoring requests configuration. OnlineRequestSettings
scaleSettings Online deployment scaling configuration. OnlineScaleSettings
endpointComputeType Set the object type. K8S | Managed (required)

CodeConfiguration

Name Description Value
codeId ARM resource ID of the code asset. string
scoringScript [Required] The script to execute on startup, e.g. "score.py". string (required). Constraints: Min length = 1, Pattern = [a-zA-Z0-9_]

OnlineDeploymentEnvironmentVariables

Name Description Value
{customized property} string

ProbeSettings

Name Description Value
failureThreshold The number of failures to allow before returning an unhealthy status. int
initialDelay The delay before the first probe in ISO 8601 format. string
period The length of time between probes in ISO 8601 format. string
successThreshold The number of successful probes before returning a healthy status. int
timeout The probe timeout in ISO 8601 format. string
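
The duration fields take ISO 8601 duration strings such as 'PT10S' (10 seconds). The values below are illustrative assumptions, not recommended defaults.

```bicep
// Probe every 10 seconds after an initial 10-second delay.
var livenessProbe = {
  initialDelay: 'PT10S'
  period: 'PT10S'
  timeout: 'PT2S' // each probe times out after 2 seconds
  failureThreshold: 3 // consecutive failures before reporting unhealthy
  successThreshold: 1 // successes needed before reporting healthy
}

output livenessProbe object = livenessProbe
```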

AssetReferenceBase

Name Description Value
referenceType Set the object type. DataPath | Id | OutputPath (required)

DataPathAssetReference

Name Description Value
referenceType [Required] Specifies the type of asset reference. 'DataPath' (required)
datastoreId ARM resource ID of the datastore where the asset is located. string
path The path of the file/directory in the datastore. string

IdAssetReference

Name Description Value
referenceType [Required] Specifies the type of asset reference. 'Id' (required)
assetId [Required] ARM resource ID of the asset. string (required). Constraints: Pattern = [a-zA-Z0-9_]

OutputPathAssetReference

Name Description Value
referenceType [Required] Specifies the type of asset reference. 'OutputPath' (required)
jobId ARM resource ID of the job. string
path The path of the file/directory in the job output. string

OnlineDeploymentProperties

Name Description Value
{customized property} string

OnlineRequestSettings

Name Description Value
maxConcurrentRequestsPerInstance The number of requests allowed to queue at once for this deployment. int
maxQueueWait The maximum queue wait time in ISO 8601 format. Supports millisecond precision. string
requestTimeout The request timeout in ISO 8601 format. Supports millisecond precision. string
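
Millisecond precision can be expressed with fractional seconds in the ISO 8601 duration. The following sketch uses assumed values for illustration.

```bicep
var requestSettings = {
  maxConcurrentRequestsPerInstance: 1
  maxQueueWait: 'PT0.5S' // 500 milliseconds
  requestTimeout: 'PT5S' // 5 seconds
}

output requestSettings object = requestSettings
```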

OnlineScaleSettings

Name Description Value
maxInstances Maximum number of instances for this deployment. int
minInstances Minimum number of instances for this deployment. int
scaleType Set the object type. Auto | Manual (required)

AutoScaleSettings

Name Description Value
scaleType [Required] Type of deployment scaling algorithm 'Auto' (required)
pollingInterval The polling interval in ISO 8601 format. Only supports duration with precision as low as Seconds. string
targetUtilizationPercentage Target CPU usage for the autoscaler. int

ManualScaleSettings

Name Description Value
scaleType [Required] Type of deployment scaling algorithm 'Manual' (required)
instanceCount Fixed number of instances for this deployment. int

K8SOnlineDeployment

Name Description Value
endpointComputeType [Required] The compute type of the endpoint. 'K8S' (required)
containerResourceRequirements Resource requirements for each container instance within an online deployment. ContainerResourceRequirements

ContainerResourceRequirements

Name Description Value
cpu The minimum number of CPU cores to be used by the container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/ int
cpuLimit The maximum number of CPU cores allowed to be used by the container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/ int
fpga The number of FPGA PCIE devices exposed to the container. Must be a multiple of 2. int
gpu The number of GPU cores in the container. int
memoryInGB The minimum amount of memory (in GB) to be used by the container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/ int
memoryInGBLimit The maximum amount of memory (in GB) allowed to be used by the container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/ int

ManagedOnlineDeployment

Name Description Value
endpointComputeType [Required] The compute type of the endpoint. 'Managed' (required)
instanceType Compute instance type. string
readinessProbe Deployment container liveness/readiness probe configuration. ProbeSettings

ARM template resource definition

The workspaces/onlineEndpoints/deployments resource type can be deployed with operations that target:

  • Resource groups

For a list of changed properties in each API version, see change log.

Resource format

To create a Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments resource, add the following JSON to your template.

{
  "type": "Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments",
  "apiVersion": "2021-03-01-preview",
  "name": "string",
  "location": "string",
  "tags": {
    "tagName1": "tagValue1",
    "tagName2": "tagValue2"
  },
  "kind": "string",
  "identity": {
    "type": "string",
    "userAssignedIdentities": {
      "{customized property}": {
        "clientId": "string",
        "principalId": "string"
      }
    }
  },
  "properties": {
    "appInsightsEnabled": "bool",
    "codeConfiguration": {
      "codeId": "string",
      "scoringScript": "string"
    },
    "description": "string",
    "environmentId": "string",
    "environmentVariables": {
      "{customized property}": "string"
    },
    "livenessProbe": {
      "failureThreshold": "int",
      "initialDelay": "string",
      "period": "string",
      "successThreshold": "int",
      "timeout": "string"
    },
    "model": {
      "referenceType": "string"
      // For remaining properties, see AssetReferenceBase objects
    },
    "properties": {
      "{customized property}": "string"
    },
    "requestSettings": {
      "maxConcurrentRequestsPerInstance": "int",
      "maxQueueWait": "string",
      "requestTimeout": "string"
    },
    "scaleSettings": {
      "maxInstances": "int",
      "minInstances": "int",
      "scaleType": "string"
      // For remaining properties, see OnlineScaleSettings objects
    },
    "endpointComputeType": "string"
    // For remaining properties, see OnlineDeployment objects
  }
}

OnlineDeployment objects

Set the endpointComputeType property to specify the type of object.

For K8S, use:

  "endpointComputeType": "K8S",
  "containerResourceRequirements": {
    "cpu": "int",
    "cpuLimit": "int",
    "fpga": "int",
    "gpu": "int",
    "memoryInGB": "int",
    "memoryInGBLimit": "int"
  }

For Managed, use:

  "endpointComputeType": "Managed",
  "instanceType": "string",
  "readinessProbe": {
    "failureThreshold": "int",
    "initialDelay": "string",
    "period": "string",
    "successThreshold": "int",
    "timeout": "string"
  }

AssetReferenceBase objects

Set the referenceType property to specify the type of object.

For DataPath, use:

  "referenceType": "DataPath",
  "datastoreId": "string",
  "path": "string"

For Id, use:

  "referenceType": "Id",
  "assetId": "string"

For OutputPath, use:

  "referenceType": "OutputPath",
  "jobId": "string",
  "path": "string"

OnlineScaleSettings objects

Set the scaleType property to specify the type of object.

For Auto, use:

  "scaleType": "Auto",
  "pollingInterval": "string",
  "targetUtilizationPercentage": "int"

For Manual, use:

  "scaleType": "Manual",
  "instanceCount": "int"

Property values

workspaces/onlineEndpoints/deployments

Name Description Value
type The resource type 'Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments'
apiVersion The resource api version '2021-03-01-preview'
name The resource name. See how to set names and types for child resources in JSON ARM templates. string (required)
location The geo-location where the resource lives string (required)
tags Resource tags. Dictionary of tag names and values. See Tags in templates
kind Metadata used by portal/tooling/etc to render different UX experiences for resources of the same type. string
identity Service identity associated with a resource. ResourceIdentity
properties [Required] Additional attributes of the entity. OnlineDeployment (required)

ResourceIdentity

Name Description Value
type Defines values for a ResourceIdentity's type. 'None' | 'SystemAssigned' | 'SystemAssigned,UserAssigned' | 'UserAssigned'
userAssignedIdentities Dictionary of the user assigned identities, key is ARM resource ID of the UAI. ResourceIdentityUserAssignedIdentities

ResourceIdentityUserAssignedIdentities

Name Description Value
{customized property} UserAssignedIdentityMeta

UserAssignedIdentityMeta

Name Description Value
clientId Aka application ID, a unique identifier generated by Azure AD that is tied to an application and service principal during its initial provisioning. string
principalId The object ID of the service principal object for your managed identity that is used to grant role-based access to an Azure resource. string

OnlineDeployment

Name Description Value
appInsightsEnabled If true, enables Application Insights logging. bool
codeConfiguration Code configuration for the endpoint deployment. CodeConfiguration
description Description of the endpoint deployment. string
environmentId ARM resource ID of the environment specification for the endpoint deployment. string
environmentVariables Environment variables configuration for the deployment. OnlineDeploymentEnvironmentVariables
livenessProbe Deployment container liveness/readiness probe configuration. ProbeSettings
model Reference to the model asset for the endpoint deployment. AssetReferenceBase
properties Property dictionary. Properties can be added, but not removed or altered. OnlineDeploymentProperties
requestSettings Online deployment scoring requests configuration. OnlineRequestSettings
scaleSettings Online deployment scaling configuration. OnlineScaleSettings
endpointComputeType Set the object type. K8S | Managed (required)

CodeConfiguration

Name Description Value
codeId ARM resource ID of the code asset. string
scoringScript [Required] The script to execute on startup, e.g. "score.py". string (required). Constraints: Min length = 1, Pattern = [a-zA-Z0-9_]

OnlineDeploymentEnvironmentVariables

Name Description Value
{customized property} string

ProbeSettings

Name Description Value
failureThreshold The number of failures to allow before returning an unhealthy status. int
initialDelay The delay before the first probe in ISO 8601 format. string
period The length of time between probes in ISO 8601 format. string
successThreshold The number of successful probes before returning a healthy status. int
timeout The probe timeout in ISO 8601 format. string

AssetReferenceBase

Name Description Value
referenceType Set the object type. DataPath | Id | OutputPath (required)

DataPathAssetReference

Name Description Value
referenceType [Required] Specifies the type of asset reference. 'DataPath' (required)
datastoreId ARM resource ID of the datastore where the asset is located. string
path The path of the file/directory in the datastore. string

IdAssetReference

Name Description Value
referenceType [Required] Specifies the type of asset reference. 'Id' (required)
assetId [Required] ARM resource ID of the asset. string (required). Constraints: Pattern = [a-zA-Z0-9_]

OutputPathAssetReference

Name Description Value
referenceType [Required] Specifies the type of asset reference. 'OutputPath' (required)
jobId ARM resource ID of the job. string
path The path of the file/directory in the job output. string

OnlineDeploymentProperties

Name Description Value
{customized property} string

OnlineRequestSettings

Name Description Value
maxConcurrentRequestsPerInstance The number of requests allowed to queue at once for this deployment. int
maxQueueWait The maximum queue wait time in ISO 8601 format. Supports millisecond precision. string
requestTimeout The request timeout in ISO 8601 format. Supports millisecond precision. string

OnlineScaleSettings

Name Description Value
maxInstances Maximum number of instances for this deployment. int
minInstances Minimum number of instances for this deployment. int
scaleType Set the object type. Auto | Manual (required)

AutoScaleSettings

Name Description Value
scaleType [Required] Type of deployment scaling algorithm 'Auto' (required)
pollingInterval The polling interval in ISO 8601 format. Only supports duration with precision as low as Seconds. string
targetUtilizationPercentage Target CPU usage for the autoscaler. int

ManualScaleSettings

Name Description Value
scaleType [Required] Type of deployment scaling algorithm 'Manual' (required)
instanceCount Fixed number of instances for this deployment. int

K8SOnlineDeployment

Name Description Value
endpointComputeType [Required] The compute type of the endpoint. 'K8S' (required)
containerResourceRequirements Resource requirements for each container instance within an online deployment. ContainerResourceRequirements

ContainerResourceRequirements

Name Description Value
cpu The minimum number of CPU cores to be used by the container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/ int
cpuLimit The maximum number of CPU cores allowed to be used by the container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/ int
fpga The number of FPGA PCIE devices exposed to the container. Must be a multiple of 2. int
gpu The number of GPU cores in the container. int
memoryInGB The minimum amount of memory (in GB) to be used by the container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/ int
memoryInGBLimit The maximum amount of memory (in GB) allowed to be used by the container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/ int

ManagedOnlineDeployment

Name Description Value
endpointComputeType [Required] The compute type of the endpoint. 'Managed' (required)
instanceType Compute instance type. string
readinessProbe Deployment container liveness/readiness probe configuration. ProbeSettings

Terraform (AzAPI provider) resource definition

The workspaces/onlineEndpoints/deployments resource type can be deployed with operations that target:

  • Resource groups

For a list of changed properties in each API version, see change log.

Resource format

To create a Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments resource, add the following Terraform to your template.

resource "azapi_resource" "symbolicname" {
  type = "Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments@2021-03-01-preview"
  name = "string"
  location = "string"
  parent_id = "string"
  tags = {
    tagName1 = "tagValue1"
    tagName2 = "tagValue2"
  }
  identity {
    type = "string"
    identity_ids = []
  }
  body = jsonencode({
    properties = {
      appInsightsEnabled = bool
      codeConfiguration = {
        codeId = "string"
        scoringScript = "string"
      }
      description = "string"
      environmentId = "string"
      environmentVariables = {
        {customized property} = "string"
      }
      livenessProbe = {
        failureThreshold = int
        initialDelay = "string"
        period = "string"
        successThreshold = int
        timeout = "string"
      }
      model = {
        referenceType = "string"
        // For remaining properties, see AssetReferenceBase objects
      }
      properties = {
        {customized property} = "string"
      }
      requestSettings = {
        maxConcurrentRequestsPerInstance = int
        maxQueueWait = "string"
        requestTimeout = "string"
      }
      scaleSettings = {
        maxInstances = int
        minInstances = int
        scaleType = "string"
        // For remaining properties, see OnlineScaleSettings objects
      }
      endpointComputeType = "string"
      // For remaining properties, see OnlineDeployment objects
    }
    kind = "string"
  })
}

OnlineDeployment objects

Set the endpointComputeType property to specify the type of object.

For K8S, use:

  endpointComputeType = "K8S"
  containerResourceRequirements = {
    cpu = int
    cpuLimit = int
    fpga = int
    gpu = int
    memoryInGB = int
    memoryInGBLimit = int
  }

For Managed, use:

  endpointComputeType = "Managed"
  instanceType = "string"
  readinessProbe = {
    failureThreshold = int
    initialDelay = "string"
    period = "string"
    successThreshold = int
    timeout = "string"
  }

AssetReferenceBase objects

Set the referenceType property to specify the type of object.

For DataPath, use:

  referenceType = "DataPath"
  datastoreId = "string"
  path = "string"

For Id, use:

  referenceType = "Id"
  assetId = "string"

For OutputPath, use:

  referenceType = "OutputPath"
  jobId = "string"
  path = "string"

OnlineScaleSettings objects

Set the scaleType property to specify the type of object.

For Auto, use:

  scaleType = "Auto"
  pollingInterval = "string"
  targetUtilizationPercentage = int

For Manual, use:

  scaleType = "Manual"
  instanceCount = int

Property values

workspaces/onlineEndpoints/deployments

Name Description Value
type The resource type "Microsoft.MachineLearningServices/workspaces/onlineEndpoints/deployments@2021-03-01-preview"
name The resource name string (required)
location The geo-location where the resource lives string (required)
parent_id The ID of the resource that is the parent for this resource. ID for resource of type: onlineEndpoints
tags Resource tags. Dictionary of tag names and values.
kind Metadata used by portal/tooling/etc to render different UX experiences for resources of the same type. string
identity Service identity associated with a resource. ResourceIdentity
properties [Required] Additional attributes of the entity. OnlineDeployment (required)

ResourceIdentity

Name Description Value
type Defines values for a ResourceIdentity's type. "SystemAssigned" | "SystemAssigned,UserAssigned" | "UserAssigned"
identity_ids Dictionary of the user assigned identities, key is ARM resource ID of the UAI. Array of user identity IDs.

ResourceIdentityUserAssignedIdentities

Name Description Value
{customized property} UserAssignedIdentityMeta

UserAssignedIdentityMeta

Name Description Value
clientId Aka application ID, a unique identifier generated by Azure AD that is tied to an application and service principal during its initial provisioning. string
principalId The object ID of the service principal object for your managed identity that is used to grant role-based access to an Azure resource. string

OnlineDeployment

Name Description Value
appInsightsEnabled If true, enables Application Insights logging. bool
codeConfiguration Code configuration for the endpoint deployment. CodeConfiguration
description Description of the endpoint deployment. string
environmentId ARM resource ID of the environment specification for the endpoint deployment. string
environmentVariables Environment variables configuration for the deployment. OnlineDeploymentEnvironmentVariables
livenessProbe Deployment container liveness/readiness probe configuration. ProbeSettings
model Reference to the model asset for the endpoint deployment. AssetReferenceBase
properties Property dictionary. Properties can be added, but not removed or altered. OnlineDeploymentProperties
requestSettings Online deployment scoring requests configuration. OnlineRequestSettings
scaleSettings Online deployment scaling configuration. OnlineScaleSettings
endpointComputeType Set the object type. K8S | Managed (required)

CodeConfiguration

Name Description Value
codeId ARM resource ID of the code asset. string
scoringScript [Required] The script to execute on startup, e.g. "score.py". string (required). Constraints: Min length = 1, Pattern = [a-zA-Z0-9_]

OnlineDeploymentEnvironmentVariables

Name Description Value
{customized property} string

ProbeSettings

Name Description Value
failureThreshold The number of failures to allow before returning an unhealthy status. int
initialDelay The delay before the first probe in ISO 8601 format. string
period The length of time between probes in ISO 8601 format. string
successThreshold The number of successful probes before returning a healthy status. int
timeout The probe timeout in ISO 8601 format. string

AssetReferenceBase

Name Description Value
referenceType Set the object type. DataPath | Id | OutputPath (required)

DataPathAssetReference

Name Description Value
referenceType [Required] Specifies the type of asset reference. "DataPath" (required)
datastoreId ARM resource ID of the datastore where the asset is located. string
path The path of the file/directory in the datastore. string

IdAssetReference

Name Description Value
referenceType [Required] Specifies the type of asset reference. "Id" (required)
assetId [Required] ARM resource ID of the asset. string (required). Constraints: Pattern = [a-zA-Z0-9_]

OutputPathAssetReference

Name Description Value
referenceType [Required] Specifies the type of asset reference. "OutputPath" (required)
jobId ARM resource ID of the job. string
path The path of the file/directory in the job output. string

OnlineDeploymentProperties

Name Description Value
{customized property} string

OnlineRequestSettings

Name Description Value
maxConcurrentRequestsPerInstance The number of requests allowed to queue at once for this deployment. int
maxQueueWait The maximum queue wait time in ISO 8601 format. Supports millisecond precision. string
requestTimeout The request timeout in ISO 8601 format. Supports millisecond precision. string

OnlineScaleSettings

Name Description Value
maxInstances Maximum number of instances for this deployment. int
minInstances Minimum number of instances for this deployment. int
scaleType Set the object type. Auto | Manual (required)

AutoScaleSettings

Name Description Value
scaleType [Required] Type of deployment scaling algorithm "Auto" (required)
pollingInterval The polling interval in ISO 8601 format. Only supports duration with precision as low as Seconds. string
targetUtilizationPercentage Target CPU usage for the autoscaler. int

ManualScaleSettings

Name Description Value
scaleType [Required] Type of deployment scaling algorithm "Manual" (required)
instanceCount Fixed number of instances for this deployment. int

K8SOnlineDeployment

Name Description Value
endpointComputeType [Required] The compute type of the endpoint. "K8S" (required)
containerResourceRequirements Resource requirements for each container instance within an online deployment. ContainerResourceRequirements

ContainerResourceRequirements

Name Description Value
cpu The minimum number of CPU cores to be used by the container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/ int
cpuLimit The maximum number of CPU cores allowed to be used by the container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/ int
fpga The number of FPGA PCIE devices exposed to the container. Must be a multiple of 2. int
gpu The number of GPU cores in the container. int
memoryInGB The minimum amount of memory (in GB) to be used by the container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/ int
memoryInGBLimit The maximum amount of memory (in GB) allowed to be used by the container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/ int

ManagedOnlineDeployment

Name Description Value
endpointComputeType [Required] The compute type of the endpoint. "Managed" (required)
instanceType Compute instance type. string
readinessProbe Deployment container liveness/readiness probe configuration. ProbeSettings