Azure Data Lake Storage - Non-Production environment

Question

Gopinath Rajee 656

All,

We use Azure Data Lake Storage Gen2 storage accounts with hierarchical namespace for all our Production needs. We plan to enable GRS on these storage accounts. But GRS is LRS (local) + LRS (remote).

What if the datacenter holding the LRS (local) copies shuts down for whatever reason (flooding, earthquake, etc.) without any wider regional failure that would warrant a failover to the remote location? Do I then have to rely on the LRS (remote) copy to recreate the storage accounts in the local region? Would GZRS be a better option in this case?

Thanks,
grajee

PRADEEPCHEEKATLA 90,651 Reputation points Moderator

2022-05-30T05:42:26.687+00:00

Hello @Gopinath Rajee ,

Just checking in to see if the answer below from @Luke Murray helped. If it answers your query, please click Accept Answer and Up-Vote it. If you have any further questions, let us know.

1 answer

Answer 1

ZRS offers you the best redundancy within a region, without failing over to a different region.

By default, there are three copies of the data (LRS), all stored in a single Azure datacenter. There are usually three datacenter locations (availability zones) that make up a region. More information on which regions have zones is here: https://azure.microsoft.com/en-us/global-infrastructure/geographies/?WT.mc_id=AZ-MVP-5004796#geographies and https://learn.microsoft.com/en-us/azure/availability-zones/az-overview?WT.mc_id=AZ-MVP-5004796

"LRS is the lowest-cost redundancy option and offers the least durability compared to other options. LRS protects your data against server rack and drive failures. However, if a disaster such as fire or flooding occurs within the data center, all replicas of a storage account using LRS may be lost or unrecoverable."

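ZRS spreads those three copies across three different datacenters (availability zones), which are physically separate.

GRS is LRS in the primary region, replicated asynchronously to a single datacenter in the secondary region (also LRS). GZRS is ZRS in the primary region, plus the same LRS copy in the secondary region. So for the scenario you describe (one datacenter lost, no regional failover), GZRS is the better fit than GRS.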
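To make the comparison concrete, here is a minimal sketch that models the four redundancy options as data. It assumes the documented layouts: LRS keeps three copies in one datacenter, ZRS spreads them across three availability zones, and GRS/GZRS add an asynchronous LRS copy in a secondary region. The class and function names are made up for illustration and are not part of any Azure SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Redundancy:
    """Illustrative model of a storage redundancy option (not an Azure API)."""
    primary_zones: int      # availability zones holding copies in the primary region
    secondary_region: bool  # True if an asynchronous copy exists in a secondary region


# Layouts as described in the Azure Storage redundancy documentation.
OPTIONS = {
    "LRS": Redundancy(primary_zones=1, secondary_region=False),
    "ZRS": Redundancy(primary_zones=3, secondary_region=False),
    "GRS": Redundancy(primary_zones=1, secondary_region=True),
    "GZRS": Redundancy(primary_zones=3, secondary_region=True),
}


def survives_datacenter_loss_without_failover(name):
    """Data stays available in the primary region if one datacenter is lost."""
    return OPTIONS[name].primary_zones > 1


def survives_region_loss(name):
    """A copy exists outside the primary region (a failover is needed to use it)."""
    return OPTIONS[name].secondary_region


if __name__ == "__main__":
    for name in OPTIONS:
        print(
            name,
            "datacenter loss, no failover:", survives_datacenter_loss_without_failover(name),
            "| region loss:", survives_region_loss(name),
        )
```

In this model only ZRS and GZRS ride out the loss of a single datacenter without a failover, and only GRS and GZRS keep a copy outside the primary region.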

It's a conversation about risk and the type of data. Most commonly, LRS makes sense for Dev/Test workloads, and ZRS/GRS for Production for additional resiliency (or at least ZRS and GRS for your backups). But if your Dev/Test workloads are effectively production for some users, then ZRS makes sense there too.

One more comment: consider all parts of your architecture. It's no use having some parts on GRS and other parts on LRS or ZRS, as the application may not function or fail over. You might be spending more than you need to, or you might need to build more redundancy into your application.
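The "weakest link" point can be sketched as a small check: an application only survives a failure scenario if every component it depends on survives it. This is an illustrative model, not an Azure tool. The component names are hypothetical, and the scenario labels are my own shorthand for "one datacenter lost, no failover" and "primary region lost, failover to the secondary".

```python
# Failure scenarios each redundancy option can ride out (illustrative labels).
SURVIVES = {
    "LRS": set(),
    "ZRS": {"datacenter_no_failover"},
    "GRS": {"region_with_failover"},
    "GZRS": {"datacenter_no_failover", "region_with_failover"},
}


def application_survives(components):
    """Return the scenarios that every component survives.

    `components` maps a (hypothetical) component name to its redundancy option.
    """
    scenarios = {"datacenter_no_failover", "region_with_failover"}
    for option in components.values():
        scenarios &= SURVIVES[option]
    return scenarios


if __name__ == "__main__":
    # Hypothetical architecture: the data lake is GZRS, but a supporting
    # storage account was left on LRS.
    mixed = {"data_lake": "GZRS", "app_config_storage": "LRS"}
    print(application_survives(mixed))  # the LRS component limits the whole application
```

In the mixed example the result is empty, so the extra spend on GZRS for the data lake buys nothing for the application as a whole until the LRS component is upgraded or the application is built to tolerate its loss.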
