Blob and ADLS as landing zone

Samy Abdul 3,376 Reputation points
2022-01-06T11:41:14.48+00:00

Hi all, I am in a bit of a dilemma. In our earlier project we had three layers: landing zone, processed and published. Data used to land in the preprocessing (landing) area and, after passing DQ checks, was moved into the processed zone. Recently I have come across another project where the data lands directly in Blob storage and, after validation, is processed into the ADLS processed area.

I am a little confused about Blob being used as the preprocessing area, for the following reasons:

1. Historical data would be maintained in Blob storage, and when the data grows exponentially into TBs, Blob storage might not be able to hold it.

2. Maintenance: access control (RBAC) is possible, but it might be additional overhead compared to keeping the data in a preprocessing area in ADLS.

I would really appreciate it if knowledgeable people could chip in with their inputs on these two architectural approaches, specifically with a use case justifying each of them. Thanks a lot in advance.

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. HimanshuSinha-msft 19,486 Reputation points Microsoft Employee Moderator
    2022-01-06T21:30:14.23+00:00

    Hello @Samy Abdul,
    Thanks for the question and for using the Microsoft Q&A platform.
    Just to be clear that I understand the question: the landing zone in project 1 is ADLS, whereas the landing zone in project 2 is Blob. Is that correct?
    If that's the case, and you are concerned about the following points:

    1. Historical data would be maintained in Blob storage, and when the data grows exponentially into TBs, Blob storage might not be able to hold it.

    I am not sure if you have read the article https://learn.microsoft.com/en-us/azure/data-lake-store/data-lake-store-comparison-with-blob-storage . As per this article, you can have 5 PiB of standard storage, which can be extended further.

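To put the capacity concern in perspective, here is a rough sketch that estimates how long compound growth takes to exceed a limit. The 5 PiB default comes from the figure quoted above; the function name, the starting size and the growth rate are hypothetical values chosen only for illustration.

```python
PIB_IN_TIB = 1024  # 1 PiB = 1024 TiB


def months_until_full(current_tib, monthly_growth_rate, limit_pib=5.0):
    """Count the months of compound growth until the data exceeds the limit.

    current_tib and monthly_growth_rate are illustrative inputs; limit_pib
    defaults to the 5 PiB figure quoted in the answer.
    """
    if current_tib <= 0 or monthly_growth_rate <= 0:
        raise ValueError("size and growth rate must be positive")
    limit_tib = limit_pib * PIB_IN_TIB
    size = current_tib
    months = 0
    while size <= limit_tib:
        size *= 1 + monthly_growth_rate  # grow by the monthly rate
        months += 1
    return months


# Hypothetical scenario: 10 TiB today, growing 10% every month.
print(months_until_full(10, 0.10))
```

Under these assumed numbers the limit is several years away, so capacity alone is unlikely to be the deciding factor between the two landing zones.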

    2. Maintenance: access control (RBAC) is possible, but it might be additional overhead compared to keeping the data in a preprocessing area in ADLS.

    ADLS supports a hierarchical file system and RBAC, which has its own advantages, but since ADF also supports Blob storage, you should be able to achieve the same with either.
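One commonly cited advantage of a hierarchical file system is promoting a whole folder from the landing zone to the processed zone. The toy model below is only a sketch using in-memory dictionaries, not the Azure SDK: in a flat namespace every object under a prefix has to be moved individually, while in a hierarchical namespace the directory entry itself can be renamed. All function names and paths are hypothetical.

```python
def promote_flat(store, src_prefix, dst_prefix):
    """Flat namespace model: move each object under the prefix one by one."""
    ops = 0
    # Snapshot the matching keys first so the dict is not mutated mid-iteration.
    for key in [k for k in store if k.startswith(src_prefix)]:
        store[dst_prefix + key[len(src_prefix):]] = store.pop(key)
        ops += 1
    return ops


def promote_hierarchical(tree, src_dir, dst_dir):
    """Hierarchical namespace model: rename the directory entry in one step."""
    tree[dst_dir] = tree.pop(src_dir)
    return 1


# Flat model: keys are full object names.
flat = {
    "landing/2022/01/a.csv": b"1",
    "landing/2022/01/b.csv": b"2",
    "processed/2021/12/x.csv": b"0",
}
# Hierarchical model: directories map to their files.
tree = {"landing/2022/01": {"a.csv": b"1", "b.csv": b"2"}}

print(promote_flat(flat, "landing/2022/01/", "processed/2022/01/"))
print(promote_hierarchical(tree, "landing/2022/01", "processed/2022/01"))
```

In this model the flat move costs one operation per file, so the difference grows with the number of files in the landing folder. Whether that matters depends on how many files each load produces and how often they are promoted.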

    Please do let me know how it goes.
    Thanks
    Himanshu

