Maintain Workday ingestion pipelines

This page describes ongoing operations for maintaining Workday ingestion pipelines.

General pipeline maintenance

The pipeline maintenance tasks in this section apply to all managed connectors in Lakeflow Connect.

Fully refresh target tables

Fully refreshing the ingestion pipeline clears the data and state of the target tables, then reprocesses all records from the data source. You can fully refresh all tables in the pipeline or select tables to refresh.

  1. In the sidebar of the Azure Databricks workspace, click Pipelines.
  2. Select the pipeline.
  3. On the pipeline details page, click Full refresh all or click Select tables for refresh, select the desired tables, then click Full refresh selection.

Important

The ingestion pipeline update might fail during the Initializing or Resetting tables phase. Lakeflow Connect retries the pipeline automatically several times. If the automatic retries are manually interrupted or eventually fail fatally, start a new pipeline update manually with the same table refresh selection as before. Failing to do so can leave the target tables in an inconsistent state with partial data. If manual retries also fail, create a support ticket.
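You can also trigger a full refresh programmatically by starting a pipeline update through the Pipelines API. The following is a minimal sketch, not a complete reference: the workspace URL, pipeline ID, and table names are placeholders, and it assumes a personal access token in the DATABRICKS_TOKEN environment variable.

curl -X POST "https://<workspace-url>/api/2.0/pipelines/<pipeline_id>/updates" \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"full_refresh_selection": ["workers", "organizations"]}'

To fully refresh every table in the pipeline, send {"full_refresh": true} instead of full_refresh_selection. Resubmitting the same request body with the previous selection is also how you restart an interrupted refresh manually.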

Change the ingestion pipeline schedule

  1. In the sidebar of the Azure Databricks workspace, click Pipelines.
  2. Select the pipeline, and then click Schedule.
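Pipeline schedules run as Databricks jobs, so as a sketch you can also manage a schedule programmatically by creating a job with a pipeline task through the Jobs API. The job name, pipeline ID, and cron expression below are placeholders; this example runs the pipeline daily at 06:00 UTC.

curl -X POST "https://<workspace-url>/api/2.1/jobs/create" \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -d '{
    "name": "workday-ingestion-daily",
    "schedule": {
      "quartz_cron_expression": "0 0 6 * * ?",
      "timezone_id": "UTC"
    },
    "tasks": [
      {
        "task_key": "run_ingestion_pipeline",
        "pipeline_task": {"pipeline_id": "<pipeline_id>"}
      }
    ]
  }'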

Customize alerts and notifications

Lakeflow Connect automatically sets up notifications for all ingestion pipelines and scheduling jobs. You can customize notifications in the UI or using the Pipelines API.

UI

  1. In the left-hand panel, click Pipelines.
  2. Select your pipeline.
  3. Click Schedule.
  4. If you already have a schedule that you want to receive notifications for:
     a. Identify the schedule on the list.
     b. Click the kebab menu, and then click Edit.
     c. Click More options, and then add your notifications.
  5. If you need a new schedule:
     a. Click Add schedule.
     b. Configure your schedule.
     c. Click More options, and then add your notifications.

API

See Notifications in the PUT /api/2.0/pipelines/{pipeline_id} documentation.
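For reference, here is a sketch of what the notifications fragment of a pipeline specification can look like. The email address is a placeholder, and you should confirm the available alert types against the API documentation linked above.

"notifications": [
  {
    "email_recipients": ["ops-team@example.com"],
    "alerts": ["on-update-failure", "on-update-fatal-failure"]
  }
]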

Specify tables to ingest

The Pipelines API provides two methods to specify tables to ingest in the objects field of the ingestion_definition:

  • Table specification: Ingests an individual table from the specified source catalog and schema to the specified destination catalog and schema.
  • Schema specification: Ingests all tables from the specified source catalog and schema into the specified destination catalog and schema.

If you choose to ingest an entire schema, you should review the limitations on the number of tables per pipeline for your connector.
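For illustration, here is a sketch of the objects field using both methods. The connection, catalog, schema, and table names are placeholders, and the exact fields available can vary by connector, so verify them against the Pipelines API reference.

"ingestion_definition": {
  "connection_name": "<workday_connection>",
  "objects": [
    {
      "table": {
        "source_schema": "hr",
        "source_table": "workers",
        "destination_catalog": "main",
        "destination_schema": "workday_raw"
      }
    },
    {
      "schema": {
        "source_schema": "finance",
        "destination_catalog": "main",
        "destination_schema": "workday_raw"
      }
    }
  ]
}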

CLI commands

To edit the pipeline, run the following command:

databricks pipelines update --json "<pipeline_definition OR json file path>"
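For example, if the updated definition is saved in a local file named pipeline.json, recent versions of the Databricks CLI accept a file path prefixed with @ (depending on your CLI version, you might also need to pass the pipeline ID as a positional argument):

databricks pipelines update --json @pipeline.json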

To get the pipeline definition, run the following command:

databricks pipelines get "<your_pipeline_id>"

To delete the pipeline, run the following command:

databricks pipelines delete "<your_pipeline_id>"

For more information, run the following commands:

databricks pipelines --help
databricks pipelines <create|update|get|delete|...> --help