GDPR Handling In ADLS Gen2

Relay 200 Reputation points
2025-06-18T13:43:02.8566667+00:00

I am creating a centralised data lakehouse, as shown in the attached diagram.

[attached architecture diagram]

I have created a second ADLS Gen2 account so that it connects easily with Databricks.

I am seeking your help in designing the ADLS Gen2 storage for the silver layer.

  1. Is this approach good?
  2. Do I need to always duplicate data from SQL into ADLS Gen2, or is there any caching mechanism available in Azure?
  3. How can I improve cost efficiency?
  4. How does ADLS Gen2 handle PII information?
  5. How can we ensure there is no duplication in ADLS Gen2?

Please share your expert thoughts.

Thanks.

Azure Data Lake Storage

Accepted answer
Nandan Hegde 36,151 Reputation points MVP Volunteer Moderator
2025-06-18T14:13:38.2433333+00:00

    In my opinion, duplicating the data across an Azure SQL database and ADLS Gen2 is a bad design.

    What is the significance of loading the data into the Azure SQL database?

    You can create another container for the silver layer directly within the existing ADLS Gen2 account, rather than introducing an Azure SQL Database as a bridge in between.
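    On the duplication point (question 5): within that silver-layer container, deduplication is usually enforced at write time in Databricks, for example with Spark's `dropDuplicates` or a Delta Lake `MERGE` keyed on a business key. A minimal pure-Python sketch of the underlying keep-latest-record-per-key logic (the `id` and `updated_at` column names are illustrative, not from the question):

    ```python
    from datetime import datetime

    def dedupe_latest(records, key="id", ts="updated_at"):
        """Keep only the most recent record per business key --
        the same logic a Delta MERGE (or dropDuplicates after an
        ordering step) applies at scale in Databricks."""
        latest = {}
        for rec in records:
            k = rec[key]
            if k not in latest or rec[ts] > latest[k][ts]:
                latest[k] = rec
        return list(latest.values())

    rows = [
        {"id": 1, "updated_at": datetime(2025, 6, 1), "status": "new"},
        {"id": 1, "updated_at": datetime(2025, 6, 15), "status": "updated"},
        {"id": 2, "updated_at": datetime(2025, 6, 10), "status": "new"},
    ]
    deduped = dedupe_latest(rows)  # one row per id, latest version wins
    ```

    In Spark this collapses to a one-liner against the silver table, but the idea is the same: dedupe on a stable business key before data is committed to the silver container.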

    Also, is there any specific reason for the second ADLS Gen2 account other than connectivity to Databricks?

    And when you say handling PII data in ADLS, what do you mean?

    In Azure SQL Database there are multiple mechanisms, such as Dynamic Data Masking or column-level encryption, to handle PII data, but we do not have that flexibility in ADLS Gen2.
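    Because ADLS Gen2 has no built-in masking, a common pattern is to pseudonymise or hash PII columns in Databricks before the data lands in the lake. A minimal sketch using a salted SHA-256 hash (the salt value and column names are illustrative assumptions, not part of the question):

    ```python
    import hashlib

    # Illustrative salt -- in practice, store and rotate it in Azure Key Vault.
    SALT = b"example-salt-keep-in-key-vault"

    def pseudonymise(value: str) -> str:
        """Deterministic salted hash: the same input always maps to the
        same token, so joins on the column still work, but the raw PII
        value itself never lands in ADLS Gen2."""
        return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()

    record = {"customer_id": 42, "email": "alice@example.com"}
    record["email"] = pseudonymise(record["email"])
    ```

    Applied as a transformation on the PII columns during the bronze-to-silver step, this keeps the silver layer joinable without storing cleartext PII; reversible protection would instead need encryption with keys held outside the lake.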

