Workday Reports connector limitations

This page lists limitations and considerations for ingesting Workday reports using Databricks Lakeflow Connect.

General SaaS connector limitations

The limitations in this section apply to all SaaS connectors in Lakeflow Connect.

  • When you run a scheduled pipeline, alerts don't trigger immediately. Instead, they trigger when the next update runs.
  • When a source table is deleted, the destination table is not automatically deleted. You must delete the destination table manually. This differs from Lakeflow Spark Declarative Pipelines behavior.
  • During source maintenance periods, Databricks might not be able to access your data.
  • If a source table name conflicts with an existing destination table name, the pipeline update fails.
  • Multi-destination pipeline support is API-only.
  • You can optionally rename a table that you ingest. If you rename a table in your pipeline, it becomes an API-only pipeline, and you can no longer edit the pipeline in the UI.
  • If you select a column after a pipeline has already started, the connector does not automatically backfill data for the new column. To ingest historical data, manually run a full refresh on the table.
  • Databricks can't ingest two or more tables with the same name in the same pipeline, even if they come from different source schemas.
  • The connector assumes that cursor columns in the source system are monotonically increasing.
  • The connector ingests raw data without transformations. To transform the data, use downstream Lakeflow Spark Declarative Pipelines.
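
The same-name restriction above can be checked before you create a pipeline. The following is a minimal sketch, not part of the connector: the helper name find_conflicting_table_names and the sample schema and table names are hypothetical, and the case-insensitive comparison is an assumption made for illustration.

```python
from collections import defaultdict


def find_conflicting_table_names(tables):
    """Return table names that appear under more than one source schema.

    `tables` is an iterable of (schema, table) pairs. Names are compared
    case-insensitively, which is an assumption made for this sketch.
    """
    schemas_by_name = defaultdict(set)
    for schema, table in tables:
        schemas_by_name[table.lower()].add(schema)
    # Keep only names that more than one schema contributes.
    return {
        name: sorted(schemas)
        for name, schemas in schemas_by_name.items()
        if len(schemas) > 1
    }


# Hypothetical selection of tables for a single pipeline.
selected = [("hr", "workers"), ("finance", "workers"), ("hr", "positions")]
print(find_conflicting_table_names(selected))
# {'workers': ['finance', 'hr']}
```

If the result is not empty, one option is to split the conflicting tables across separate pipelines.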

Authentication

  • Databricks recommends using a Workday integration system user (ISU), but this is not required.
  • Typically, a refresh token is created on behalf of an ISU. You can choose whether to allow the refresh token to expire:
    • If you set an expiration date, you must edit the connection when you reach that date.
    • If you don't set an expiration date, the refresh token can only expire if your organization reduces the access level of the ISU that's associated with the token.

Pipelines

  • The connector can only ingest reports with less than 2 GB of data or fewer than 1 million records. Your organization's Workday API limits might be lower.
  • Incremental ingestion is in Beta and requires a primary key. If you configure a primary key for a report, the connector ingests only the rows that have changed since the last pipeline run. If you don't configure a primary key, the connector fully refreshes the report every time the pipeline runs.
  • The connector can't ingest reports with duplicate primary keys.
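
Because reports with duplicate primary keys can't be ingested, it can help to check an exported sample of the report first. The following is a minimal sketch under stated assumptions: the rows are already loaded as Python dictionaries, the column names Employee_ID and Effective_Date are hypothetical, and the helpers are illustrative rather than part of Lakeflow Connect.

```python
from collections import Counter

# Record limit stated on this page; your Workday API limits might be lower.
MAX_RECORDS = 1_000_000


def find_duplicate_keys(rows, key_columns):
    """Return the primary key tuples that occur more than once in `rows`."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]


def within_record_limit(rows, limit=MAX_RECORDS):
    """Return True if the report has fewer than `limit` records."""
    return len(rows) < limit


# Hypothetical report rows with a composite primary key.
report_rows = [
    {"Employee_ID": "1001", "Effective_Date": "2024-01-01"},
    {"Employee_ID": "1002", "Effective_Date": "2024-01-01"},
    {"Employee_ID": "1001", "Effective_Date": "2024-01-01"},
]

print(find_duplicate_keys(report_rows, ["Employee_ID", "Effective_Date"]))
# [('1001', '2024-01-01')]
print(within_record_limit(report_rows))
# True
```

A non-empty result means the chosen key does not uniquely identify rows, so either the key or the report definition needs to change before ingestion.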

Incremental ingestion

The following limitations apply to incremental ingestion, which is in Beta.

  • The cursor column must monotonically increase with each new or updated row. Rows whose cursor value is greater than the last ingested cursor value are ingested (inserts and cursor-advancing updates). Deleted rows are never ingested.
  • The cursor column must be a date column. Other cursor types are not supported.
  • Workday report prompts used for incremental ingestion must be inclusive (greater than or equal to / less than or equal to). Exclusive prompts (greater than / less than) can result in missing data. The report owner chooses this setting when creating the report and its prompts in Workday.
  • When you use current_date() in a prompt value, Databricks evaluates it as the UTC date at the time the pipeline starts running. However, Workday interprets the date based on the time zone of your Workday account settings, so the two dates can differ when the pipeline runs close to midnight in either time zone.
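
Two of the points above, inclusive prompts and the UTC evaluation of current_date(), can be illustrated with plain Python. This is a sketch only and does not reproduce how the connector or Workday evaluates prompts: the column name Last_Modified, the sample dates, and the fixed UTC-8 offset standing in for the Workday account time zone are all assumptions.

```python
from datetime import date, datetime, timedelta, timezone


def filter_by_prompt(rows, window_start, window_end, inclusive=True):
    """Select rows whose cursor date falls inside the prompt window."""
    if inclusive:
        return [r for r in rows if window_start <= r["Last_Modified"] <= window_end]
    # Exclusive comparisons drop rows that sit exactly on either boundary.
    return [r for r in rows if window_start < r["Last_Modified"] < window_end]


def dates_at(instant, account_offset_hours):
    """Return (UTC date, account-local date) for the same instant."""
    account_tz = timezone(timedelta(hours=account_offset_hours))
    return (
        instant.astimezone(timezone.utc).date(),
        instant.astimezone(account_tz).date(),
    )


# Hypothetical report rows with a date cursor column.
rows = [{"Last_Modified": date(2024, 3, d)} for d in (9, 10, 11, 12)]
window_start, window_end = date(2024, 3, 10), date(2024, 3, 12)

print(len(filter_by_prompt(rows, window_start, window_end, inclusive=True)))
# 3
print(len(filter_by_prompt(rows, window_start, window_end, inclusive=False)))
# 1

# A pipeline that starts at 01:30 UTC sees a different date than a UTC-8 account.
pipeline_start = datetime(2024, 3, 11, 1, 30, tzinfo=timezone.utc)
utc_date, account_date = dates_at(pipeline_start, -8)
print(utc_date, account_date)
# 2024-03-11 2024-03-10
```

In this example the exclusive window skips the rows dated on the boundaries, and the UTC date is one day ahead of the account-local date. Scheduling the pipeline at a time when both dates agree is one way to avoid the mismatch.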