DatabricksCluster Class

Defines Databricks cluster information for use in a DatabricksSection.

Initialize.

Inheritance: azureml._base_sdk_common.abstract_run_config_element._AbstractRunConfigElement -> DatabricksCluster

Constructor

DatabricksCluster(existing_cluster_id=None, spark_version=None, node_type=None, instance_pool_id=None, num_workers=None, min_workers=None, max_workers=None, spark_env_variables=None, spark_conf=None, init_scripts=None, cluster_log_dbfs_path=None, permit_cluster_restart=None)

Parameters

Name	Description
existing_cluster_id	str A cluster ID of an existing interactive cluster on the Databricks workspace. If this parameter is specified, none of the other parameters should be specified. Default value: None
spark_version	str The version of Spark for the Databricks run cluster. Example: "10.4.x-scala2.12". Default value: None
node_type	str The Azure VM node types for the Databricks run cluster. Example: "Standard_D3_v2". Default value: None
instance_pool_id	str The ID of the instance pool to which the cluster is attached. Default value: None
num_workers	int The number of workers for a Databricks run cluster. If this parameter is specified, the `min_workers` and `max_workers` parameters should not be specified. Default value: None
min_workers	int The minimum number of workers for an autoscaled Databricks cluster. Default value: None
max_workers	int The maximum number of workers for an autoscaled Databricks run cluster. Default value: None
spark_env_variables	dict(<xref:{str:str}>) The Spark environment variables for the Databricks run cluster. Default value: None
spark_conf	dict(<xref:{str:str}>) The Spark configuration for the Databricks run cluster. Default value: None
init_scripts	list[str] Deprecated. Databricks announced that init scripts stored in DBFS stop working after Dec 1, 2023. To mitigate the issue: 1) use global init scripts in Databricks, following https://learn.microsoft.com/azure/databricks/init-scripts/global; 2) comment out the init_scripts line in your AzureML Databricks step. Default value: None
cluster_log_dbfs_path	str The DBFS path to which cluster logs are delivered. Default value: None
permit_cluster_restart	bool If existing_cluster_id is specified, indicates whether the cluster can be restarted on behalf of the user. Default value: None
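
The parameters above describe two ways to define a cluster: point at an existing interactive cluster, or describe a new run cluster. The following is a minimal sketch of both argument sets, not an official sample. The cluster ID, the Spark configuration entry, and the environment variable are illustrative placeholders, and the import path `azureml.core.databricks` is an assumption that this page does not state. The helper `build_cluster` is hypothetical and only works where the azureml-core package is installed.

```python
# Argument set 1: reuse an existing interactive cluster.
# Per the parameter table, no new-cluster settings are given alongside the ID.
existing_cluster_kwargs = {
    "existing_cluster_id": "0000-000000-example0",  # hypothetical cluster ID
    "permit_cluster_restart": True,
}

# Argument set 2: describe a new autoscaled run cluster.
# num_workers is omitted because min_workers/max_workers are used instead.
new_cluster_kwargs = {
    "spark_version": "10.4.x-scala2.12",   # example value from the table
    "node_type": "Standard_D3_v2",         # example value from the table
    "min_workers": 1,
    "max_workers": 4,
    "spark_conf": {"spark.speculation": "true"},           # illustrative
    "spark_env_variables": {"MY_SETTING": "example"},      # illustrative
}


def build_cluster(kwargs):
    """Hypothetical helper: create and validate a DatabricksCluster.

    Requires azureml-core; the import path below is an assumption.
    """
    from azureml.core.databricks import DatabricksCluster  # assumed path

    cluster = DatabricksCluster(**kwargs)
    cluster.validate()  # documented method; raises on invalid combinations
    return cluster
```

Calling `build_cluster(new_cluster_kwargs)` would create the cluster definition and run its `validate` method in one step, so a bad combination surfaces before the definition is used in a DatabricksSection.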

Methods

validate

Validate the specified Databricks cluster details.

The validate method checks the types of the provided parameters and whether a correct combination of parameters is provided. For example, you must specify either existing_cluster_id or the remaining cluster parameters. For more information, see the constructor parameter definitions.
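
The combination rules stated in the parameter table can be sketched as a small standalone check. This is not the SDK's implementation: `check_cluster_args` is a hypothetical function, it covers only the two combination rules documented on this page (not the type checks), and it assumes that `permit_cluster_restart` is allowed together with `existing_cluster_id` because its description refers to that case.

```python
def check_cluster_args(**kwargs):
    """Return a list of problems in a DatabricksCluster-style argument set.

    Hypothetical illustration of the documented combination rules only.
    """
    # Treat parameters left at their default of None as "not specified".
    given = {name for name, value in kwargs.items() if value is not None}
    problems = []

    # Rule 1: existing_cluster_id excludes the new-cluster parameters.
    if "existing_cluster_id" in given:
        extras = given - {"existing_cluster_id", "permit_cluster_restart"}
        if extras:
            problems.append(
                "existing_cluster_id excludes: " + ", ".join(sorted(extras))
            )

    # Rule 2: num_workers excludes the autoscaling bounds.
    if "num_workers" in given and given & {"min_workers", "max_workers"}:
        problems.append(
            "num_workers cannot be combined with min_workers/max_workers"
        )

    return problems
```

An empty list means the argument set is consistent with the two rules; any entries describe which rule was broken. The real `validate` method reports problems by raising an exception rather than returning a list.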

validate()

Exceptions

Type	Description
azureml.exceptions.UserErrorException	Raised if the specified cluster details fail validation.
