How to achieve appending records to CETAS External tables in Sql Server MI?

Lonely Rogue 30 Reputation points
2024-10-15T10:27:09.59+00:00

Hello !

Basing explanation on https://techcommunity.microsoft.com/t5/azure-sql-blog/export-capability-cetas-for-data-virtualization-in-azure-sql/bc-p/4270848#M1972 : Data Virtualisation/Archival of Sql Server through Create External Table As Select (CETAS) functionality, we have completed a POC on basic setup but stuck at solving below use cases.

Any help is thanked.

1) How can we Append records into CETAS table since DML operations are disallowed. ?

 

Since my work is about Archival of sql server tables, one way is to - create CETAS for each archival year like Cetas_2023_CustTab, Cetas_2024_CustTab and then UNION these in a View for Application to use.

This all have to be executed periodically through dynamic code and also rebuild the final view to include new CETASes.

 

2) I used Parquet file as CETAS format and found no easy method to merge the files programmatically. Am thinking to experiment with JSON now. Since I could write simple Powershell to merge 2 JSONs.

Any feedback?

 

-- LR

Azure SQL Database
SQL Server | Other
{count} votes

1 answer

Sort by: Most helpful
  1. Sina Salam 22,031 Reputation points Volunteer Moderator
    2024-10-15T15:08:29.47+00:00

    Hello Lonely Rogue,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you would need to achieve appending records to CETAS External tables in Sql Server MI.

    In your scenario use case, and regarding your questions I assumed you have large-scale archival operations, and for efficiency and maintainability these are best practices tips:

    How can we Append records into CETAS table since DML operations are disallowed?

    For appending records, your approach of using CETAS for each year with a UNION view is correct but consider automating the view updates to simplify future processes. https://techcommunity.microsoft.com/t5/azure-sql-blog/export-capability-cetas-for-data-virtualization-in-azure-sql/bc-p/4270848#M1972

    I used Parquet file as CETAS format and found no easy method to merge the files programmatically. Am thinking to experiment with JSON now. Since I could write simple PowerShell to merge 2 JSONs?

    For merging Parquet files, rather than moving to JSON, leverage tools like Azure Data Factory or Spark to manage and merge Parquet files programmatically. https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-performance-synapse and https://learn.microsoft.com/en-us/azure/data-factory/transform-data

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.


    Please don't forget to close up the thread here by upvoting and accept it as an answer if it is helpful.


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.