Cannot Create a Synapse Serverless View for DELTA if the delta log does not exist.

Toby Riley 5 Reputation points
2023-07-06T15:48:12.4933333+00:00

You can create Synapse Serverless View for CSV even if the data does not exist. However, you cannot create a Synapse Serverless View for DELTA if the delta log does not exist.

This causes issues when deploying databases to clean environments where the data does not exist.

You should be able to create the view, and if you query the view and the delta log is not there, it should throw an error.

Any suggestions on how to get around this or force create the view?

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
5,297 questions
{count} vote

2 answers

Sort by: Most helpful
  1. AnnuKumari-MSFT 34,451 Reputation points Microsoft Employee
    2023-07-07T08:02:56.77+00:00

    Hi Toby Riley ,

    Welcome to Microsoft Q&A platform and thanks for posting your question here.

    I understand that you are facing an issue while creating a Synapse Serverless View for DELTA if the delta log does not exist. You mentioned that you are able to create a Synapse Serverless View for CSV even if the data does not exist. However, you cannot create a Synapse Serverless View for DELTA if the delta log does not exist. Please let me know if my understanding about your query is incorrect.

    To address this issue, you can use the following approach:

    When you create a Synapse Serverless View for CSV, it does not require any metadata or schema information to be present in the file. However, when you create a Synapse Serverless View for DELTA, it requires the delta log to be present to read the schema information.

    If the delta log does not exist, you can create an empty delta log using the delta command-line tool. The delta tool is a command-line interface for working with Delta Lake tables and files. You can use the delta init command to create an empty delta log for a table.

    1.Kindly try the following command to create an empty delta log for a table:

    delta init <table-path>
    

    Replace <table-path> with the path to the table you want to create an empty delta log for.

    1. Once you have created the empty delta log, you can create the Synapse Serverless View for DELTA.
    2. Additionally, you can use Partitioned views and result set caching to improve query performance.

    Hope it helps. Thankyou


  2. ZEYAD ABDULLA AGA 0 Reputation points
    2023-07-11T07:04:17.2433333+00:00

    Hi Toby Riley ,

    Welcome to Microsoft Q&A platform and thanks for posting your question here.

    I understand that you are facing an issue while creating a Synapse Serverless View for DELTA if the delta log does not exist. You mentioned that you are able to create a Synapse Serverless View for CSV even if the data does not exist. However, you cannot create a Synapse Serverless View for DELTA if the delta log does not exist. Please let me know if my understanding about your query is incorrect.

    To address this issue, you can use the following approach:

    When you create a Synapse Serverless View for CSV, it does not require any metadata or schema information to be present in the file. However, when you create a Synapse Serverless View for DELTA, it requires the delta log to be present to read the schema information.

    If the delta log does not exist, you can create an empty delta log using the delta command-line tool. The delta tool is a command-line interface for working with Delta Lake tables and files. You can use the delta init command to create an empty delta log for a table.

    1.Kindly try the following command to create an empty delta log for a table:

    Copy

    delta init <table-path>
    

    Replace <table-path> with the path to the table you want to create an empty delta log for.

    1. Once you have created the empty delta log, you can create the Synapse Serverless View for DELTA.
    2. Additionally, you can use Partitioned views and result set caching to improve query performance.

    Hope it helps. Thankyou


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.