DatabricksCompute Class

Manages a Databricks compute target in Azure Machine Learning.

Azure Databricks is an Apache Spark-based environment in the Azure cloud. It can be used as a compute target with an Azure Machine Learning pipeline. For more information, see What are compute targets in Azure Machine Learning?

Class ComputeTarget constructor.

Retrieve a cloud representation of a Compute object associated with the provided workspace. Returns an instance of a child class corresponding to the specific type of the retrieved Compute object.

Inheritance
ComputeTarget
    DatabricksCompute

Constructor

DatabricksCompute(workspace, name)

Parameters

workspace
Workspace
Required

The workspace object containing the DatabricksCompute object to retrieve.

name
str
Required

The name of the DatabricksCompute object to retrieve.

Remarks

The following example shows how to attach Azure Databricks as a compute target.


   import os

   from azureml.core import Workspace
   from azureml.core.compute import ComputeTarget, DatabricksCompute
   from azureml.core.compute_target import ComputeTargetException

   # Load the workspace; assumes a local workspace config file (config.json) is available.
   ws = Workspace.from_config()

   # Replace with your account info before running.

   db_compute_name=os.getenv("DATABRICKS_COMPUTE_NAME", "<my-databricks-compute-name>") # Databricks compute name
   db_resource_group=os.getenv("DATABRICKS_RESOURCE_GROUP", "<my-db-resource-group>") # Databricks resource group
   db_workspace_name=os.getenv("DATABRICKS_WORKSPACE_NAME", "<my-db-workspace-name>") # Databricks workspace name
   db_access_token=os.getenv("DATABRICKS_ACCESS_TOKEN", "<my-access-token>") # Databricks access token

   try:
       databricks_compute = DatabricksCompute(workspace=ws, name=db_compute_name)
       print('Compute target {} already exists'.format(db_compute_name))
   except ComputeTargetException:
       print('Compute not found, will use below parameters to attach new one')
       print('db_compute_name {}'.format(db_compute_name))
       print('db_resource_group {}'.format(db_resource_group))
       print('db_workspace_name {}'.format(db_workspace_name))
       print('db_access_token {}'.format(db_access_token))

       config = DatabricksCompute.attach_configuration(
           resource_group = db_resource_group,
           workspace_name = db_workspace_name,
           access_token= db_access_token)
       databricks_compute=ComputeTarget.attach(ws, db_compute_name, config)
       databricks_compute.wait_for_completion(True)

A full sample is available at https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-use-databricks-as-compute-target.ipynb

Methods

attach

DEPRECATED. Use the attach_configuration method instead.

Associate an existing Databricks compute resource with the provided workspace.

attach_configuration

Create a configuration object for attaching a Databricks compute target.

delete

Delete is not supported for a DatabricksCompute object. Use detach instead.

deserialize

Convert a JSON object into a DatabricksCompute object.

detach

Detaches the Databricks object from its associated workspace.

Underlying cloud objects are not deleted, only the association is removed.

get_credentials

Retrieve the credentials for the Databricks target.

refresh_state

Perform an in-place update of the properties of the object.

This method updates the properties based on the current state of the corresponding cloud object. This is primarily used for manual polling of compute state.

serialize

Convert this DatabricksCompute object into a JSON serialized dictionary.

attach

DEPRECATED. Use the attach_configuration method instead.

Associate an existing Databricks compute resource with the provided workspace.

static attach(workspace, name, resource_id, access_token)

Parameters

workspace
Workspace
Required

The workspace object to associate the compute resource with.

name
str
Required

The name to associate with the compute resource inside the provided workspace. Does not have to match the name of the compute resource to be attached.

resource_id
str
Required

The Azure resource ID for the compute resource being attached.

access_token
str
Required

The access token for the resource being attached.

Returns

A DatabricksCompute object representation of the compute object.

Return type

DatabricksCompute

Exceptions

ComputeTargetException

attach_configuration

Create a configuration object for attaching a Databricks compute target.

static attach_configuration(resource_group=None, workspace_name=None, resource_id=None, access_token='')

Parameters

resource_group
str
default value: None

The name of the resource group in which the Databricks is located.

workspace_name
str
default value: None

The Databricks workspace name.

resource_id
str
default value: None

The Azure resource ID for the compute resource being attached.

access_token
str
Required

The access token for the compute resource being attached.

Returns

A configuration object to be used when attaching a Compute object.

Return type

DatabricksAttachConfiguration

Exceptions

ComputeTargetException
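
The configuration can identify the Databricks workspace either by resource group and workspace name or by its full Azure resource ID. The snippet below is a minimal sketch of both forms, used in place of the deprecated attach method; the placeholder names, resource ID, and token are assumptions to replace with your own values, and ws is the Workspace object from the Remarks example above.

   from azureml.core.compute import ComputeTarget, DatabricksCompute

   # Form 1: identify the Databricks workspace by resource group and workspace name.
   config = DatabricksCompute.attach_configuration(
       resource_group="<my-db-resource-group>",
       workspace_name="<my-db-workspace-name>",
       access_token="<my-access-token>")

   # Form 2: identify the Databricks workspace by its full Azure resource ID
   # (placeholder ID shown; substitute your subscription and workspace details).
   resource_id = ("/subscriptions/<subscription-id>/resourceGroups/<my-db-resource-group>"
                  "/providers/Microsoft.Databricks/workspaces/<my-db-workspace-name>")
   config = DatabricksCompute.attach_configuration(
       resource_id=resource_id,
       access_token="<my-access-token>")

   # Pass either configuration to ComputeTarget.attach to perform the attach operation.
   databricks_compute = ComputeTarget.attach(ws, "<my-databricks-compute-name>", config)
   databricks_compute.wait_for_completion(show_output=True)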

delete

Delete is not supported for a DatabricksCompute object. Use detach instead.

delete()

Exceptions

ComputeTargetException

deserialize

Convert a JSON object into a DatabricksCompute object.

static deserialize(workspace, object_dict)

Parameters

workspace
Workspace
Required

The workspace object the DatabricksCompute object is associated with.

object_dict
dict
Required

A JSON object to convert to a DatabricksCompute object.

Returns

The DatabricksCompute representation of the provided JSON object.

Return type

DatabricksCompute

Exceptions

ComputeTargetException

Remarks

Raises a ComputeTargetException if the provided workspace is not the workspace the Compute is associated with.
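
As a minimal sketch, serialize and deserialize can round-trip a compute object through its dictionary form; databricks_compute and ws are assumed to be the attached compute target and its workspace from the Remarks example above.

   # Convert the compute object to a JSON-serializable dictionary.
   compute_dict = databricks_compute.serialize()

   # Rebuild the object from the dictionary. The workspace must be the one the
   # compute is associated with; otherwise a ComputeTargetException is raised.
   restored = DatabricksCompute.deserialize(ws, compute_dict)
   print(restored.name)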

detach

Detaches the Databricks object from its associated workspace.

Underlying cloud objects are not deleted, only the association is removed.

detach()

Exceptions

ComputeTargetException
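
A minimal sketch of detaching, assuming databricks_compute is the attached compute target from the Remarks example; only the association with the Azure Machine Learning workspace is removed.

   # Remove the association with the workspace; the underlying Azure Databricks
   # workspace itself is not deleted.
   databricks_compute.detach()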

get_credentials

Retrieve the credentials for the Databricks target.

get_credentials()

Returns

The credentials for the Databricks target.

Return type

dict

Exceptions

ComputeTargetException
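
A short sketch of inspecting the stored credentials, assuming databricks_compute from the Remarks example; the exact keys of the returned dictionary are not documented here, so the example prints only the key names rather than any secret values.

   creds = databricks_compute.get_credentials()
   # Print only the credential field names, not the secret values themselves.
   print(list(creds.keys()))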

refresh_state

Perform an in-place update of the properties of the object.

This method updates the properties based on the current state of the corresponding cloud object. This is primarily used for manual polling of compute state.

refresh_state()

Exceptions

ComputeTargetException
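
A hedged sketch of manual polling with refresh_state, assuming databricks_compute from the Remarks example; the provisioning_state attribute comes from the ComputeTarget base class, and the terminal state names used below are assumptions that may differ from what your workspace reports.

   import time

   # Poll the cloud object until it reaches a terminal provisioning state.
   databricks_compute.refresh_state()
   while databricks_compute.provisioning_state.lower() not in ('succeeded', 'failed', 'canceled'):
       time.sleep(10)
       databricks_compute.refresh_state()
   print('Provisioning state:', databricks_compute.provisioning_state)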

serialize

Convert this DatabricksCompute object into a JSON serialized dictionary.

serialize()

Returns

The JSON representation of this DatabricksCompute object.

Return type

dict

Exceptions

ComputeTargetException