Automate workloads with Azure Databricks Jobs
This article describes the features available in the Azure Databricks UI to view jobs you have access to, view a history of runs for a job, and view details of job runs. To configure notifications for jobs, see Add email and system notifications for job events.
To learn about using the Databricks CLI to view jobs and run jobs, run the CLI commands databricks jobs list -h, databricks jobs get -h, and databricks jobs run-now -h. To learn about using the Jobs API, see the Jobs API.
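As a sketch of the Jobs API route, the following builds a list request and extracts job names from a response payload. The workspace URL and token values are placeholders, and the response shape follows the Jobs API 2.1 jobs/list operation:

```python
# Minimal sketch of calling the Jobs API list endpoint directly.
# The host and token values below are placeholders, not real credentials.
import urllib.request


def build_list_jobs_request(host: str, token: str, limit: int = 25) -> urllib.request.Request:
    """Build a GET request for the Jobs API 2.1 jobs/list operation."""
    url = f"{host}/api/2.1/jobs/list?limit={limit}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def job_names(list_response: dict) -> list:
    """Extract job names from a jobs/list response payload."""
    return [j["settings"]["name"] for j in list_response.get("jobs", [])]
```

Sending the request with urllib.request.urlopen returns the same JSON the CLI prints for databricks jobs list.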
If you have access to the system.lakeflow schema, you can also view and query records of job runs and tasks from across your account. See Jobs system table reference.
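A query over those records might look like the following sketch. The table and column names are assumptions based on the Jobs system table reference; check that reference before relying on them:

```python
# Sketch of a query over system.lakeflow job-run records.
# Table and column names here are assumptions from the system table reference.
RECENT_FAILED_RUNS_SQL = """
SELECT job_id, run_id, period_start_time, result_state
FROM system.lakeflow.job_run_timeline
WHERE result_state = 'FAILED'
  AND period_start_time >= current_date() - INTERVAL 7 DAYS
ORDER BY period_start_time DESC
"""

# In a notebook you could run it with, e.g.:
# display(spark.sql(RECENT_FAILED_RUNS_SQL))
```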
To view the list of jobs you have access to, click Workflows in the sidebar. The Jobs tab in the Workflows UI lists information about all available jobs, such as the creator of the job, the trigger for the job, if any, and the result of the last run.
To change the columns displayed in the jobs list, click and select or deselect columns.
You can filter jobs in the Jobs list. For example, if a job has a tag with the key department and the value finance, you can search for department or finance to find matching jobs. To search by both the key and the value, enter them separated by a colon; for example, department:finance.

You can also click any column header to sort the list of jobs (either descending or ascending) by that column. When the increased jobs limit feature is enabled, you can sort only by Name, Job ID, or Created by. The default sorting is by Name in ascending order.
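The search convention above can be sketched as a small matcher: a term containing a colon filters on a tag key and value together, while a plain term matches either the key or the value. The tag shape is a simple dict and is an assumption for illustration:

```python
# Sketch of the key:value tag search convention described above.
def matches_search(tags: dict, term: str) -> bool:
    """Return True if a job's tags match a search term.

    A term like "department:finance" requires the key to map to the value;
    a plain term matches any tag key or any tag value.
    """
    if ":" in term:
        key, _, value = term.partition(":")
        return tags.get(key) == value
    return term in tags or term in tags.values()
```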
Click to access actions for the job, for example, delete the job.
You can view a list of currently running and recently completed runs for all jobs you have access to, including runs started by external orchestration tools such as Apache Airflow or Azure Data Factory. To view the list of recent job runs:
The matrix view shows a history of runs for the job, including each job task.
The Run total duration row of the matrix displays the run’s total duration and the run’s state. To view details of the run, including the start time, duration, and status, hover over the bar in the Run total duration row.
Each cell in the Tasks row represents a task and the corresponding status of the task. To view details of each task, including the start time, duration, cluster, and status, hover over the cell for that task.
The job run and task run bars are color-coded to indicate the status of the run. Successful runs are green, unsuccessful runs are red, and skipped runs are pink. The height of the individual job run and task run bars visually indicates the run duration.
If you have configured an expected completion time, the matrix view displays a warning when the duration of a run exceeds the configured time.
By default, the runs list view displays the run status, which can be Queued, Pending, Running, Skipped, Succeeded, Failed, Terminating, Terminated, Internal Error, Timed Out, Canceled, Canceling, or Waiting for Retry.

To change the columns displayed in the runs list view, click and select or deselect columns.
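When working with the same runs programmatically, the statuses above map onto the state object returned by the Jobs API runs/list operation: finished runs carry a result_state, while active runs only have a life_cycle_state. A minimal sketch of grouping runs by status under that assumption:

```python
# Sketch: group runs by status, assuming runs shaped like the Jobs API
# runs/list response (state.result_state when finished, life_cycle_state otherwise).
from collections import defaultdict


def runs_by_status(runs: list) -> dict:
    grouped = defaultdict(list)
    for run in runs:
        state = run.get("state", {})
        status = state.get("result_state") or state.get("life_cycle_state", "UNKNOWN")
        grouped[status].append(run)
    return dict(grouped)
```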
To view details for a job run, click the link for the run in the Start time column in the runs list view. To view details for this job’s most recent successful run, click Go to the latest successful run.
Azure Databricks maintains a history of your job runs for up to 60 days. If you need to preserve job runs, Databricks recommends exporting results before they expire. For more information, see Export job run results.
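One way to act on the 60-day retention window is to flag runs that are close to expiring so their results can be exported first. This sketch assumes Jobs API-style timestamps (milliseconds since the epoch) and a hypothetical within_days margin:

```python
# Sketch: flag runs approaching the 60-day run-history retention window so
# their results can be exported before they expire.
# Timestamps are milliseconds since the epoch, as in the Jobs API.
import time

RETENTION_DAYS = 60
MS_PER_DAY = 86_400_000


def runs_expiring_soon(runs: list, within_days: int = 7, now_ms: int = None) -> list:
    """Return runs whose history will expire within `within_days` days."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    cutoff_ms = now_ms - (RETENTION_DAYS - within_days) * MS_PER_DAY
    return [r for r in runs if r.get("start_time", now_ms) <= cutoff_ms]
```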
The job run details page contains job output and links to logs, including information about the success or failure of each task in the job run. You can access job run details from the Runs tab for the job. To view job run details from the Runs tab, click the link for the run in the Start time column in the runs list view. To return to the Runs tab for the job, click the Job ID value.
If the job contains multiple tasks, click a task to view task run details, including:
Click the Job ID value to return to the Runs tab for the job.
Azure Databricks determines whether a job run was successful based on the outcome of the job’s leaf tasks. A leaf task is a task that has no downstream dependencies. A job run can have one of three outcomes:
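The leaf-task rule can be made concrete with a small dependency-graph sketch: given each task's depends_on list, a leaf task is any task that no other task depends on. The dict shape is an illustration, not the Jobs API schema:

```python
# Sketch: identify leaf tasks in a job's task graph.
# `tasks` maps task_key -> list of task_keys it depends on (illustrative shape).
def leaf_tasks(tasks: dict) -> set:
    """Return task keys that no other task depends on."""
    depended_on = {dep for deps in tasks.values() for dep in deps}
    return set(tasks) - depended_on
```

Under the rule above, only the result states of these leaf tasks determine whether the job run is counted as successful.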
To view the run history of a task, including successful and unsuccessful runs:
Accessing the run history of a For each task is the same as for a standard Azure Databricks Jobs task. You can click the For each task node on the Job run details page or the corresponding cell in the matrix view. However, unlike a standard task, the run details for a For each task are presented as a table of the nested task’s iterations.
To view only failed iterations, click Only failed iterations.
To view the output of an iteration, click the Start time or End time values of the iteration.
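The "Only failed iterations" toggle can be mirrored in code as a simple filter over the iteration table. The iteration shape (a result_state field) is an assumption for illustration:

```python
# Sketch: replicate the "Only failed iterations" toggle over a list of
# For each iterations (the result_state field name is an assumption).
def failed_iterations(iterations: list) -> list:
    return [i for i in iterations if i.get("result_state") == "FAILED"]
```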
You can view a list of currently running and recently completed runs for all jobs in a workspace that you have access to, including runs started by external orchestration tools such as Apache Airflow or Azure Data Factory. To view the list of recent job runs:
The Finished runs count graph displays the number of job runs completed in the last 48 hours. By default, the graph displays the failed, skipped, and successful job runs. You can also filter the graph to show specific run statuses or restrict the graph to a specific time range. The Job runs tab also includes a table of job runs from the last 67 days. By default, the table includes details on failed, skipped, and successful job runs.
Note
The Finished runs count graph is only displayed when you click Owned by me.
You can filter the Finished runs count by run status:
When you click any of the filter buttons, the list of runs in the runs table also updates to show only job runs that match the selected status.
To limit the time range displayed in the Finished runs count graph, click and drag your cursor in the graph to select the time range. The graph and the runs table update to display runs from only the selected time range.
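The same time-range restriction is easy to express over run records: keep only runs whose start time falls inside the selected window. Timestamps are milliseconds since the epoch, as in the Jobs API:

```python
# Sketch of the time-range filter: keep runs whose start_time falls inside
# the selected window (milliseconds since epoch, as in the Jobs API).
def runs_in_range(runs: list, start_ms: int, end_ms: int) -> list:
    return [r for r in runs if start_ms <= r.get("start_time", -1) <= end_ms]
```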
By default, the list of runs in the runs table displays the run status, which can be Queued, Pending, Running, Skipped, Succeeded, Failed, Terminating, Terminated, Internal Error, Timed Out, Canceled, Canceling, or Waiting for Retry.

To change the columns displayed in the runs list, click and select or deselect columns.
The Top 5 error types table displays a list of the most frequent error types from the selected time range, allowing you to quickly see the most common causes of job issues in your workspace.
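A tally like the Top 5 error types table is a straightforward frequency count. The error_type field name on each run record is an assumption for illustration:

```python
# Sketch: tally the most frequent error types across runs, as in the
# Top 5 error types table (the error_type field name is an assumption).
from collections import Counter


def top_error_types(runs: list, n: int = 5) -> list:
    counts = Counter(r["error_type"] for r in runs if r.get("error_type"))
    return counts.most_common(n)
```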
To view job run details, click the link in the Start time column for the run. To view job details, click the job name in the Job column.
If Unity Catalog is enabled in your workspace, you can view lineage information for any Unity Catalog tables in your workflow. If lineage information is available for your workflow, you will see a link with a count of upstream and downstream tables in the Job details panel for your job, the Job run details panel for a job run, or the Task run details panel for a task run. Click the link to show the list of tables. Click a table to see detailed information in Catalog Explorer.
You can use the Azure Databricks Jobs UI to view and run jobs deployed by a Databricks Asset Bundle. By default, these jobs are read-only in the Jobs UI. To edit a job deployed by a bundle, change the bundle configuration file and redeploy the job. Applying changes only to the bundle configuration ensures that the bundle source files always capture the current job configuration.
However, if you must make immediate changes to a job, you can disconnect the job from the bundle configuration to enable editing the job settings in the UI. To disconnect the job, click Disconnect from source. In the Disconnect from source dialog, click Disconnect to confirm.
Any changes you make to the job in the UI are not applied to the bundle configuration. To apply changes you make in the UI to the bundle, you must manually update the bundle configuration. To reconnect the job to the bundle configuration, redeploy the job using the bundle.
You can export notebook run results and job run logs for all job types.
You can persist job runs by exporting their results. For notebook job runs, you can export a rendered notebook that can later be imported into your Azure Databricks workspace.
To export notebook run results for a job with a single task:
To export notebook run results for a job with multiple tasks:
You can also export the logs for your job run. You can set up your job to automatically deliver logs to DBFS through the Job API. See the new_cluster.cluster_log_conf object in the request body passed to the Create a new job operation (POST /jobs/create) in the Jobs API.
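A request body with that log configuration might look like the following sketch. The job name, notebook path, node type, Spark version, and DBFS destination are placeholders, not recommendations:

```python
# Sketch of a jobs/create request body with new_cluster.cluster_log_conf set
# to deliver cluster logs to DBFS. All names and paths below are placeholders.
create_job_body = {
    "name": "nightly-etl",
    "tasks": [
        {
            "task_key": "main",
            "notebook_task": {"notebook_path": "/Workspace/etl/main"},
            "new_cluster": {
                "spark_version": "15.4.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "num_workers": 2,
                "cluster_log_conf": {
                    "dbfs": {"destination": "dbfs:/cluster-logs/nightly-etl"}
                },
            },
        }
    ],
}
```

Posting this body to /api/2.1/jobs/create (with a valid token) would create the job and deliver its cluster logs to the configured DBFS path.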