Azure Data Lake directory partitioning naming convention

Question

Anshal 2,251

Hi friends, this is the subdirectory/container naming convention used for ADLS zones. This naming convention is new to me. What is the reasoning behind it, and what are the advantages of using it to partition the raw, staging, and curated zones? [Image: user's screenshot]

Vinodh247 34,906 Reputation points MVP Volunteer Moderator

2023-04-25T06:57:12.2833333+00:00

Hi, thanks for reaching out to Microsoft Q&A. I suggest going through the following documentation, which answers your question.

https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/cloud-scale-analytics/best-practices/data-lake-zones

Let me know if this helped.

Please upvote and accept as answer if the reply was helpful; this will help other community members.

Anshal 2,251 Reputation points

2023-04-25T08:41:04.8833333+00:00

Thank you, but my question is about the directory naming structure shown in my screenshot. I want to know when to use that kind of naming structure (like key-value pairs) and what its advantages and benefits are.
KranthiPakala-MSFT 46,642 Reputation points Microsoft Employee Moderator

2023-04-28T21:49:16.6866667+00:00

@Anshal Just checking in to see if the information below was helpful. If it answers your query, please click Accept Answer and Yes for "was this answer helpful", as it might benefit other community members reading this thread. If you have any further queries, do let us know.

Thank you

Accepted answer


Answer 1

Hi @Anshal

Thanks for using the Microsoft Q&A forum and posting your query.

In general, the naming convention you mentioned is a way to organize data in Azure Data Lake Storage (ADLS) zones. It uses the directory structure to group data by attributes such as date, time, or location. Its advantages are better data organization, filtered searches, security, and automated processing. The granularity of the date structure is determined by the interval at which the data is uploaded or processed, such as hourly, daily, or monthly. The convention helps improve query performance, makes large volumes of data easier to manage, and provides flexibility in how data is organized and accessed.
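As a rough illustration of how the upload interval drives the depth of the date structure, here is a minimal Python sketch. The screenshot is not available, so the key=value directory names (`year=`, `month=`, `day=`, `hour=`), the `raw` zone, the `sales` dataset, and the helper `partition_path` are all assumptions made for this example, not the exact layout from the question.

```python
from datetime import datetime

# Hypothetical partition keys, ordered from coarsest to finest.
# These names are assumptions for illustration only.
_LEVELS = ["year", "month", "day", "hour"]
_FORMATS = {"year": "%Y", "month": "%m", "day": "%d", "hour": "%H"}


def partition_path(zone: str, dataset: str, ts: datetime, granularity: str = "day") -> str:
    """Build a key=value directory path down to the requested granularity."""
    if granularity not in _LEVELS:
        raise ValueError(f"unknown granularity: {granularity}")
    depth = _LEVELS.index(granularity) + 1
    parts = [f"{key}={ts.strftime(_FORMATS[key])}" for key in _LEVELS[:depth]]
    return "/".join([zone, dataset, *parts])


ts = datetime(2023, 4, 25, 6, 57)
print(partition_path("raw", "sales", ts, "day"))    # raw/sales/year=2023/month=04/day=25
print(partition_path("raw", "sales", ts, "month"))  # raw/sales/year=2023/month=04
```

Data that lands hourly would use `granularity="hour"`, while a monthly batch would stop at the month level, so each directory holds one load interval.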

Below are a few benefits of partitioning data in ADLS zones with this naming convention:

Improved query performance: Partitioning data based on relevant attributes can significantly improve query performance by reducing the amount of data that needs to be scanned.

Easier data management: Partitioning data into subdirectories based on relevant attributes can make it easier to manage and organize large volumes of data.

Scalability: Partitioning data can help improve scalability by letting independent partitions be read and processed in parallel across multiple nodes or clusters.

Flexibility: Partitioning data based on different criteria can provide flexibility in how data is organized and accessed, making it easier to adapt to changing business needs.
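The query-performance benefit can be sketched as follows. When partition values are encoded in the directory names, a reader can decide which files to skip from the paths alone, without opening any file. Engines that understand this layout (Spark's partition discovery, for example) apply the same idea internally. The paths and the helpers `parse_partitions` and `prune` below are hypothetical and only illustrate the principle.

```python
def parse_partitions(path: str) -> dict:
    """Extract key=value segments from a directory path."""
    return dict(seg.split("=", 1) for seg in path.split("/") if "=" in seg)


def prune(paths, **wanted):
    """Keep only paths whose partition values match every requested key."""
    return [
        p for p in paths
        if all(parse_partitions(p).get(k) == v for k, v in wanted.items())
    ]


# Hypothetical file listing from a curated zone.
paths = [
    "curated/sales/year=2023/month=03/day=31/part-0.parquet",
    "curated/sales/year=2023/month=04/day=24/part-0.parquet",
    "curated/sales/year=2023/month=04/day=25/part-0.parquet",
]

# Only the April 2023 files survive; the March file is never read.
selected = prune(paths, year="2023", month="04")
print(selected)
```

A query that filters on `year` and `month` therefore scans two files instead of three here. On a real lake with years of data, the same filter can skip the large majority of directories.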

In addition, I would recommend going through the resources below for additional information.

FAQs About Organizing a Data Lake
Data Lake Use Cases and Planning Considerations
Best practices for using Azure Data Lake Storage Gen2 - Directory structure - This document has a few examples with use cases for this naming convention.

Hope this info helps.

Please don’t forget to click Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you; this can benefit other community members.
