Medallion architecture, how to update Gold tables?

Mohammad Saber 591 Reputation points
2023-03-07T12:30:20.6433333+00:00

Assume that I have a data source that is ingested to a few bronze tables, and transformed to a silver table. Ans next, a gold table is created by aggregating the silver table.

If new records arrive in the data source, bronze and silver tables are updated by appending new records. Since the gold table contains aggregated values, using "append" is meaningless.

I'd like to know which approach is recommended to update gold tables in case of having a large dataset:

  1. Drop the current gold table, and re-create it
  2. Overwrite the gold table

The first option seems to be slower if we deal with a large dataset. However, I'd like to know if there is any risk with option 2 (e.g. if the table is not overwritten correctly).

Azure Databricks
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
2,080 questions
{count} votes

1 answer

Sort by: Most helpful
  1. PRADEEPCHEEKATLA-MSFT 85,586 Reputation points Microsoft Employee
    2023-03-08T06:06:28.33+00:00

    Hello @Mohammad Saber ,

    Thanks for the question and using MS Q&A platform.

    The recommended approach to update the gold table depends on the specific requirements of your use case.

    If you have a large dataset and the cost of recreating the gold table is high, you can use the overwrite option to update the gold table. This option is faster than recreating the table, but it requires that you have a mechanism to ensure that the overwrite operation is atomic and consistent.

    If you have a small dataset, you can use the overwrite option without any risk. However, if you have a large dataset, there is a risk that the overwrite operation may fail or be inconsistent. To mitigate this risk, you can use the following approach:

    1. Create a new gold table with the updated data.
    2. Rename the new gold table to replace the old gold table.

    This approach ensures that the gold table is always in a consistent state, even if the overwrite operation fails.

    Hope this helps. Do let us know if you any further queries.


    If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.