@Veli-Jussi Raitila This decision depends purely on your own requirements and objectives: separation due to different business units, different geopolitical boundaries, or different governance requirements for different zones. However, a federated approach carries management overhead and a minor loss of discoverability, so we also see customers going for a single, centralized model. This is especially true if the customer has a long history with Hadoop/HDFS on-premises.
Here is some more guidance on the matter, although it is still not categorical about when one approach is superior to the other: https://azure.github.io/Storage/docs/analytics/hitchhikers-guide-to-the-datalake/#do-i-want-a-centralized-or-a-federated-data-lake-implementation
ADLS accounts are limited in the same manner as normal storage accounts. However, we have a high degree of flexibility in where those limits are set, and they generally reflect the customer's requirements, since these affect our capacity planning. However, for very large data lake installations (e.g. > 50 PB) or data lakes that are subject to a very specific I/O pattern that unavoidably leads to large IOPS loads (generally outside the range that analytics frameworks such as Spark and Hadoop are capable of generating; think > 100,000 IOPS),
Please let us know if you have any further queries. I’m happy to assist you further.
----------
Please do not forget to “up-vote” wherever the information provided helps you; this can be beneficial to other community members.