Manage very large Blob Storage

Matteo Rigoni 2 Reputation points
2023-10-10T09:47:39.55+00:00

Hi, I have a very large Blob Storage account with many small files (a few KB each) and a total size of 15 TB.

Every 6 months, I download older blobs according to certain rules and then upload them to another Blob Storage account, so I have something like this:

  • main blob storage (hot) - 15 TB
  • blob storage 2022 2nd semester (cool) - 8 TB
  • blob storage 2021 1st semester (cool) - 4 TB
  • ...

In my application, older data is rarely accessed, so typically I search for data in the main blob storage, and if it is not found there I search the second one, and so on.

The problem is that the download/upload operation is really expensive, between $3,000 and $4,000, partly because I have Windows Defender and geo-redundancy turned on.

How can I manage this situation? Is it really necessary to move older data to another storage account to keep the main one small? Or can I grow to 100-200 TB without any problem? The growth is about 10-15 TB per year.

Thanks for any advice.

Azure Cost Management
A Microsoft offering that enables tracking of cloud usage and expenditures for Azure and other cloud providers.
Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.

2 answers

  1. Azar 20,585 Reputation points
    2023-10-10T13:19:54.1533333+00:00

    Hi @Matteo Rigoni

    Managing large amounts of data in Azure Blob Storage can be cost-effective and efficient if you plan your storage strategy correctly.

    Whether to move older data to a separate storage account or keep it in the same account depends on factors like your budget, data access patterns, and retention policies. With proper planning, cost optimization, and the use of Azure's storage tiers and features, you can effectively manage your large blob storage while keeping costs under control. I have also included the documentation links below; kindly go through them for more info:

    Azure Blob Storage Documentation

    Set and manage retention policies for Azure Blob Storage

    Analyze your costs with Cost Management and Billing

    If you find this answer useful, kindly accept it. Thanks much.


  2. Pramod Valavala 20,591 Reputation points Microsoft Employee
    2023-10-11T00:00:49.09+00:00

    @Matteo Rigoni I assume the reason for separate storage accounts is to optimize the search capabilities.

    The default limits for storage accounts are extremely high (5 PiB by default), so you would need at least 50 years' worth of data to fill one up, and even then you could request an increase (the limits are also likely to rise as technology improves).
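
    As a rough check of that headroom, the arithmetic can be sketched as below. It takes the 5 PiB limit quoted above and the 15 TB size and 10-15 TB/year growth from the question. Treating "TB" as decimal (10**12 bytes) and assuming linear growth are simplifications made here, not something stated in the thread.

```python
# Rough capacity arithmetic. Decimal TB and linear growth are assumptions.
PIB = 2**50   # bytes in one pebibyte
TB = 10**12   # bytes in one (decimal) terabyte


def years_until_full(limit_bytes, current_bytes, growth_bytes_per_year):
    """Years of linear growth left before the account limit is reached."""
    return (limit_bytes - current_bytes) / growth_bytes_per_year


limit = 5 * PIB      # default limit quoted in the answer
current = 15 * TB    # size of the main account in the question

for growth_tb in (10, 15):
    years = years_until_full(limit, current, growth_tb * TB)
    print(f"{growth_tb} TB/year -> about {years:.0f} years of headroom")
```

    Under these assumptions the result is several hundred years, so "at least 50 years" is a comfortable lower bound.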

    Usually, the best practice is to name your blobs so that you can filter them using the prefix parameter in the underlying REST API calls, for example <year>/<category-1>/<category-2>/<more-paths>/<filename>.
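
    A minimal sketch of that naming scheme in Python follows. The helper names and the sample categories are hypothetical. The second function assumes it is given an object that behaves like `ContainerClient` from the `azure-storage-blob` package, whose `list_blobs()` method accepts a `name_starts_with` prefix filter; no connection is made in the sketch itself.

```python
from datetime import date


def build_blob_name(day, categories, filename):
    """Build '<year>/<category-1>/<category-2>/.../<filename>'."""
    parts = [str(day.year), *categories, filename]
    return "/".join(parts)


def names_for_year(container_client, year):
    """Return the names of all blobs stored under one year's prefix.

    container_client is assumed to behave like
    azure.storage.blob.ContainerClient (list_blobs with name_starts_with).
    """
    prefix = f"{year}/"
    return [blob.name for blob in container_client.list_blobs(name_starts_with=prefix)]


# Hypothetical categories and filename, for illustration only.
print(build_blob_name(date(2023, 10, 10), ["invoices", "eu"], "a.json"))
```

    With names like these, a lookup that knows the year only has to list one prefix instead of scanning the whole account.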

    But if something like this isn't acceptable, you could always create a container for each year (or even more granular as required) since there is no limit to the number of containers.

    If you already have a way to distinguish blobs by year, perhaps by using tags, then you could simply use access tiers within the same storage account at the blob level. You can use lifecycle management policies to automate this, so that blobs transition as required rather than on the 6-month cycle that you have today.
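
    A lifecycle management policy is a JSON document, so one way to sketch it is to build it as a Python dict and print it. The rule name, the `mycontainer/2022/` prefix, and the 180-day threshold (standing in for the 6-month cycle) are illustrative assumptions, and the resulting JSON would still need to be applied to the storage account through the portal, CLI, or an SDK.

```python
import json


def tiering_policy(prefix, cool_after_days=180):
    """Build a lifecycle policy that moves block blobs under `prefix` to cool.

    The prefix starts with the container name. The threshold counts days
    since the blob was last modified.
    """
    return {
        "rules": [
            {
                "enabled": True,
                "name": "tierOldBlobsToCool",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        "prefixMatch": [prefix],
                    },
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {
                                "daysAfterModificationGreaterThan": cool_after_days
                            }
                        }
                    },
                },
            }
        ]
    }


# Hypothetical container and year prefix.
print(json.dumps(tiering_policy("mycontainer/2022/"), indent=2))
```

    Because the tier changes in place, the blobs keep their names and stay in the same account, which avoids the download/upload step described in the question.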