DatabricksCluster Class

Defines Databricks cluster information for use in a DatabricksSection.

Inheritance
azureml._base_sdk_common.abstract_run_config_element._AbstractRunConfigElement
DatabricksCluster

Constructor

DatabricksCluster(existing_cluster_id=None, spark_version=None, node_type=None, instance_pool_id=None, num_workers=None, min_workers=None, max_workers=None, spark_env_variables=None, spark_conf=None, init_scripts=None, cluster_log_dbfs_path=None, permit_cluster_restart=None)

Parameters

Name Description
existing_cluster_id
str

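The cluster ID of an existing interactive cluster in the Databricks workspace. If this parameter is specified, none of the cluster-creation parameters should be specified; permit_cluster_restart is the exception, because it applies only to an existing cluster.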

default value: None
spark_version
str

The version of Spark for the Databricks run cluster. Example: "10.4.x-scala2.12".

default value: None
node_type
str

The Azure VM node type for the Databricks run cluster. Example: "Standard_D3_v2".

default value: None
instance_pool_id
str

The ID of the instance pool to which the cluster should be attached.

default value: None
num_workers
int

The number of workers for a Databricks run cluster. If this parameter is specified, the min_workers and max_workers parameters should not be specified.

default value: None
min_workers
int

The minimum number of workers for an autoscaled Databricks cluster.

default value: None
max_workers
int

The maximum number of workers for an autoscaled Databricks run cluster.

default value: None
spark_env_variables
dict({str:str})

The Spark environment variables for the Databricks run cluster.

default value: None
spark_conf
dict({str:str})

The Spark configuration for the Databricks run cluster.

default value: None
init_scripts

Deprecated. Databricks announced that init scripts stored in DBFS will stop working after Dec 1, 2023. To mitigate the issue: 1) use global init scripts in Databricks, following https://learn.microsoft.com/azure/databricks/init-scripts/global; 2) comment out the init_scripts line in your AzureML Databricks step.

default value: None
cluster_log_dbfs_path
str

The DBFS path where cluster logs are delivered.

default value: None
permit_cluster_restart

If existing_cluster_id is specified, this parameter indicates whether the cluster can be restarted on behalf of the user.

default value: None
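
The combination rules stated in the parameter descriptions above can be sketched in plain Python. This is only an illustration of the documented constraints, not the SDK's own validation logic: find_conflicts is a hypothetical helper, and treating permit_cluster_restart as compatible with existing_cluster_id is an assumption drawn from that parameter's description.

```python
def find_conflicts(**kwargs):
    """Return a list of problems with a set of DatabricksCluster-style arguments.

    Hypothetical helper: it mirrors the rules stated in the parameter
    descriptions and does not call or reproduce the SDK's validate().
    """
    # Only arguments that were actually supplied (not None) count as "specified".
    given = {name for name, value in kwargs.items() if value is not None}
    problems = []

    # Assumption: permit_cluster_restart is allowed alongside existing_cluster_id.
    creation_params = given - {"existing_cluster_id", "permit_cluster_restart"}
    if "existing_cluster_id" in given and creation_params:
        problems.append(
            "existing_cluster_id cannot be combined with: "
            + ", ".join(sorted(creation_params))
        )

    # A fixed worker count and autoscaling bounds are mutually exclusive.
    if "num_workers" in given and given & {"min_workers", "max_workers"}:
        problems.append("num_workers cannot be combined with min_workers/max_workers")

    return problems


# An autoscaled new cluster: no conflicts are reported.
print(find_conflicts(spark_version="10.4.x-scala2.12",
                     node_type="Standard_D3_v2",
                     min_workers=1, max_workers=4))

# Mixing a fixed size with autoscaling bounds: one conflict is reported.
print(find_conflicts(num_workers=2, min_workers=1))
```

A check like this can be useful for catching an inconsistent configuration early, but the validate method remains the authoritative check.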

Methods

validate

Validate the specified Databricks cluster details.

This method checks the types of the provided parameters and whether a correct combination of parameters is provided. For example, you must specify either existing_cluster_id or the rest of the cluster parameters. For more information, see the constructor parameter definitions.

validate()

Exceptions

Type Description
azureml.exceptions.UserErrorException
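
A minimal sketch of calling validate follows, assuming the azureml-core package is installed and that the class is importable from azureml.core.databricks (an assumed import path). The cluster ID is a placeholder, the other values are illustrative, and try_validate is a hypothetical wrapper. If the SDK is not available, the sketch skips the live calls.

```python
# Illustrative values only; replace them with settings valid for your workspace.
new_cluster_kwargs = {
    "spark_version": "10.4.x-scala2.12",
    "node_type": "Standard_D3_v2",
    "num_workers": 2,
}

# Deliberately inconsistent: an existing cluster plus a creation parameter.
conflicting_kwargs = {
    "existing_cluster_id": "<existing-cluster-id>",  # placeholder, not a real ID
    "num_workers": 2,
}

try:
    # Assumed import paths; both require the azureml-core package.
    from azureml.core.databricks import DatabricksCluster
    from azureml.exceptions import UserErrorException
except ImportError:
    DatabricksCluster = None
    UserErrorException = None


def try_validate(kwargs):
    """Return True if validation passes, False if it is rejected,
    or None when the SDK is not installed (hypothetical wrapper)."""
    if DatabricksCluster is None:
        return None
    try:
        DatabricksCluster(**kwargs).validate()
    except UserErrorException as err:
        print("Rejected:", err)
        return False
    return True


for label, kwargs in [("new cluster", new_cluster_kwargs),
                      ("conflicting", conflicting_kwargs)]:
    print(label, "->", try_validate(kwargs))
```

Based on the description above, the second configuration would be expected to raise UserErrorException, since it combines existing_cluster_id with a cluster-creation parameter.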