Schema in ADLS

azure_learner 920 Reputation points
2025-08-18T15:19:02.0633333+00:00

Hi friends, we are developing a medallion architecture, as discussed in the thread below:

https://learn.microsoft.com/en-us/answers/questions/2237931/azure-datalake-and-consistent-data

We are already in the silver layer, where data standardization and merging (for a few datasets) have been done. We are implementing Databricks Auto Loader to handle schema drift and capture metadata changes.

Please let me know whether, in this scenario, we also need to implement schema enforcement at the silver layer, or whether using Auto Loader makes explicit schema enforcement unnecessary.

Also, if we do have to implement schema enforcement, what is the best practice: should it be enforced at the silver layer or the gold layer? Thank you in advance for your help.

Azure Databricks

Answer accepted by question author
  1. Vinodh247 40,041 Reputation points MVP Volunteer Moderator
    2025-08-18T16:15:29.69+00:00

    Hi,

    Thanks for reaching out to Microsoft Q&A.

    Yes, you should still implement schema enforcement at the Silver layer even if you are using Auto Loader. Auto Loader helps at ingestion, but Silver is where you assert data contracts for your organization.

    1. Auto Loader already helps, but it is not enforcement by itself

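    Databricks Auto Loader supports schema inference and schema evolution: it detects schema drift and can add newly arriving columns to the schema it tracks. It does not reconcile column renames or data type changes on its own; a renamed column shows up as a new column, and values that do not match the expected type can be captured in the rescued data column rather than silently dropped.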

    However, that is more about ingestion flexibility. It does not guarantee that your downstream layers (Silver/Gold) remain in a known, trusted shape.
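
    As a rough illustration of the ingestion side, the sketch below builds the option map for an Auto Loader stream into Bronze. The helper names, the JSON format, and the paths are assumptions for illustration and are not from the original question; the `cloudFiles.*` option names are the ones documented for Auto Loader. `read_bronze` only works on a Databricks runtime where the `cloudFiles` source is available.

```python
# Evolution modes documented for cloudFiles.schemaEvolutionMode.
ALLOWED_EVOLUTION_MODES = {"addNewColumns", "rescue", "failOnNewColumns", "none"}


def autoloader_options(file_format, schema_location, evolution_mode="addNewColumns"):
    """Build the option map for an Auto Loader (cloudFiles) stream.

    file_format and schema_location are illustrative parameters supplied by
    the caller; nothing here is specific to the original poster's setup.
    """
    if evolution_mode not in ALLOWED_EVOLUTION_MODES:
        raise ValueError(f"Unsupported schema evolution mode: {evolution_mode}")
    return {
        "cloudFiles.format": file_format,
        # Where Auto Loader persists the inferred schema between runs.
        "cloudFiles.schemaLocation": schema_location,
        "cloudFiles.schemaEvolutionMode": evolution_mode,
    }


def read_bronze(spark, source_path, schema_location):
    """Return a streaming DataFrame over source_path (hypothetical helper).

    Assumes JSON input; requires a Databricks runtime with Auto Loader.
    """
    opts = autoloader_options("json", schema_location)
    return spark.readStream.format("cloudFiles").options(**opts).load(source_path)
```

    Nothing in these options constrains what the Silver tables look like; they only control how the Bronze stream reacts when the source files change shape.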

    2. Silver layer’s responsibility

    Silver is your curated, standardized layer. By the time data reaches Silver, you want it to follow business-approved, stable schemas.

    Hence, Silver should enforce schema. Any schema drift captured by Auto Loader in Bronze should be reviewed, mapped, and only then promoted to Silver.

    3. How to enforce

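    Define the schemas of your Silver tables explicitly and leave Delta Lake’s mergeSchema write option at false (the default), so that a write containing columns that are not in the table schema fails instead of silently altering the table.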
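
    One possible way to do this in PySpark is sketched below: project the incoming DataFrame onto an explicit column contract before appending to the Delta table. The column names, types, and helper names are hypothetical and only serve as an example; `selectExpr`, `mergeSchema`, and `saveAsTable` are standard Spark and Delta Lake calls.

```python
# Hypothetical Silver contract: column name -> Spark SQL type.
SILVER_SCHEMA = {
    "order_id": "BIGINT",
    "customer_id": "STRING",
    "amount": "DECIMAL(18,2)",
    "order_ts": "TIMESTAMP",
}


def cast_expressions(schema):
    """Turn a {column: type} contract into selectExpr-style CAST expressions."""
    return [f"CAST(`{name}` AS {dtype}) AS `{name}`" for name, dtype in schema.items()]


def append_to_silver(df, table_name, schema=SILVER_SCHEMA):
    """Append df to a Silver Delta table using only the contracted columns."""
    missing = [name for name in schema if name not in df.columns]
    if missing:
        # Fail fast instead of writing rows that lack contracted columns.
        raise ValueError(f"Missing required Silver columns: {missing}")
    # The projection drops any column that is not part of the contract.
    conformed = df.selectExpr(*cast_expressions(schema))
    (
        conformed.write.format("delta")
        .mode("append")
        .option("mergeSchema", "false")  # the default, stated explicitly
        .saveAsTable(table_name)
    )
```

    With this shape, a new upstream column never reaches Silver until someone adds it to the contract, which is the point of enforcing at this layer.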

    For incoming changes, build governance around them:

    • Detect drift in Bronze with Auto Loader.
    • Review it (for example, by logging it to a “schema change tracking” table).
    • If the change is valid -> update the Silver schema and ETL.
    • Otherwise -> reject/quarantine.
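
    The review step above can be sketched in plain Python as a comparison between the Silver contract and the schema observed in Bronze. The function names and the triage policy (only pre-approved new columns are promoted; removed or retyped columns are always quarantined) are assumptions chosen for illustration, not a rule stated in this answer.

```python
def detect_drift(expected, observed):
    """Compare two {column: type} mappings and report the differences."""
    added = {c: t for c, t in observed.items() if c not in expected}
    removed = {c: t for c, t in expected.items() if c not in observed}
    retyped = {
        c: (expected[c], observed[c])
        for c in expected
        if c in observed and expected[c] != observed[c]
    }
    return {"added": added, "removed": removed, "retyped": retyped}


def triage(drift, approved_columns):
    """Decide what to do with a detected change (illustrative policy only)."""
    if drift["removed"] or drift["retyped"]:
        return "quarantine"
    if set(drift["added"]) - set(approved_columns):
        return "quarantine"
    if drift["added"]:
        return "promote"  # update the Silver schema and ETL
    return "no_change"


# Example: a new "channel" column appears in Bronze.
silver_contract = {"id": "bigint", "amount": "double"}
bronze_schema = {"id": "bigint", "amount": "double", "channel": "string"}
drift = detect_drift(silver_contract, bronze_schema)
decision = triage(drift, approved_columns={"channel"})
```

    The `drift` dictionary is also a natural record to append to a schema change tracking table before anyone makes the promote-or-quarantine decision.
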
    4. Gold layer

    Gold is consumption-oriented (BI models, ML features, marts). By then, schema enforcement is implicit because Silver has already enforced a clean schema.

    In Gold, you might only need light enforcement (ensuring certain business KPIs exist, data types align with BI/ML tools, etc.).

    5. Best practice (short version)

    • Bronze: Accept anything -> Auto Loader handles drift.
    • Silver: Enforce schema explicitly. No blind schema evolution. This is where you “lock in” the trusted data contracts.
    • Gold: Consume with confidence -> optional enforcement, more about semantic consistency than technical schema.

    Please 'Upvote' (thumbs-up) and 'Accept' this as the answer if the reply was helpful. This will benefit other community members who face the same issue.

