blob storage - change path of blobs

Jan Vávra 76 Reputation points
2022-04-06T11:44:16.193+00:00

I've accidentally uploaded 1 million files into container/2019/2019/restofpath and I'd like to change the path of each uploaded file to container/2019/restofpath. Can this be done, even blob by blob, via some command or API call that changes only the path? Something like changing the metadata of a blob ...

If I did this with a couple of commands (az copy, az remove), I'd pay again for the write operations (e.g. €70).


3 answers

  1. Michael Taylor 60,161 Reputation points
    2022-04-06T14:41:48.397+00:00

    That's a common misconception about blob storage: there are no paths. The only things in storage are the storage account and the container. For example, say you are using container A and you want to store the file myfile.txt under 2019/restofpath. This is actually a single blob object with the name 2019/restofpath/myfile.txt. "Folders" don't exist in the blob world. But since this is a common need, most blob explorers will render a virtual file system when you use those kinds of names in blob names.

    So, in answer to your question, all your blob objects need to be renamed. You have to do that using a copy. There is no other way.
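    For reference, here's a minimal sketch of that copy-then-delete "rename" using the Python azure-storage-blob SDK. The connection string, container name, and prefixes are placeholders based on the question, and each copy is still billed as a write operation:

    ```python
    import time
    from azure.storage.blob import ContainerClient

    # Placeholder connection string and container name.
    container = ContainerClient.from_connection_string(
        conn_str="<connection-string>", container_name="container")

    # "Rename" every blob under 2019/2019/ to 2019/ by copying and deleting.
    for blob in container.list_blobs(name_starts_with="2019/2019/"):
        src = container.get_blob_client(blob.name)
        dst = container.get_blob_client(blob.name.replace("2019/2019/", "2019/", 1))

        # Server-side copy; within the same account no SAS is needed,
        # across accounts the source URL would need one.
        dst.start_copy_from_url(src.url)

        # Wait until the copy finishes before removing the original.
        props = dst.get_blob_properties()
        while props.copy.status == "pending":
            time.sleep(1)
            props = dst.get_blob_properties()
        if props.copy.status == "success":
            src.delete_blob()
    ```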

    As for cost, you'll need to decide which option is best. If you reset the container and start over, you'd pay for uploading all the documents again. If you copy, you'll be paying for the writes as well, but the two options aren't quite the same. Azure uses ingress and egress terminology: ingress is data coming into Azure, egress is data leaving it, and traffic that stays within Azure in the same region is generally not charged for bandwidth. For Blob storage (based on the pricing calculator), a copy and a fresh write cost the same, so I don't know that it would matter.


  2. Jan Vávra 76 Reputation points
    2022-04-07T07:10:42.693+00:00

    OK, I understand the concept of blob storage; I was curious whether such a possibility exists. I imagine blob storage as a database-like application that stores metadata (the blob path and other properties) with a filesystem somewhere underneath for the binary data, plus all the other features like soft delete, triggers, etc.

    Currently I am missing two features in azcopy:

    • change the path of a blob inside the container
    • preserve the last modified timestamp when copying to a blob (a possible workaround is sketched below)
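    For the timestamp point, one possible workaround (not an azcopy feature, just a sketch with the Python azure-storage-blob SDK): Last-Modified is set by the service and can't be written, but the original value could be preserved as user metadata on the copy. The names below are placeholders.

    ```python
    from azure.storage.blob import BlobClient

    # Placeholder connection string, container, and blob names.
    src = BlobClient.from_connection_string(
        "<connection-string>", "container", "2019/2019/restofpath/file.xml")
    dst = BlobClient.from_connection_string(
        "<connection-string>", "container", "2019/restofpath/file.xml")

    # Last-Modified is service-managed; keep the original value as metadata.
    original = src.get_blob_properties().last_modified
    dst.start_copy_from_url(
        src.url, metadata={"original_last_modified": original.isoformat()})
    ```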

    Does anybody know how the occupied data is counted?
    I have a million small XML files and I am thinking about zipping them. But if capacity were counted at the filesystem cluster level (I think the default is 8 KiB), it would not be worth doing.
    By zipping I take on a risk of data loss - one bad byte in the wrong place in a zip can make the file totally unreadable - and I would gain nothing, because I'd pay the same money per file if the occupied space is rounded up to 8 KiB just like for the unzipped file.

    And I haven't found any info about a cluster size setting for blobs.
    Maybe the binary data aren't stored in a filesystem at all but on some block device, though I really have my doubts about that. How would deletion, concurrent writes of blocks, ... be performed then?
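    For reference, the kind of per-file zipping I have in mind would look roughly like this (a sketch with the Python azure-storage-blob SDK; the local path, container, and blob names are made up):

    ```python
    import io
    import zipfile
    from pathlib import Path

    from azure.storage.blob import ContainerClient

    # Placeholder connection string and container name.
    container = ContainerClient.from_connection_string(
        conn_str="<connection-string>", container_name="container")

    # Zip each small XML file on its own and upload it as its own blob.
    # If billing rounded every blob up to an 8 KiB "cluster", this would
    # save nothing - which is exactly the concern above.
    for xml_file in Path("audit-2019").glob("*.xml"):
        buffer = io.BytesIO()
        with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
            archive.write(xml_file, arcname=xml_file.name)
        container.upload_blob(
            name=f"2019/{xml_file.name}.zip",
            data=buffer.getvalue(),
            overwrite=True)
    ```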


  3. Jan Vávra 76 Reputation points
    2022-04-07T14:44:02.92+00:00

    The article you pointed to says something about the SingleBlobUploadThresholdInBytes option: "the maximum size of a blob in bytes that may be uploaded as a single blob."
    As I understand block blobs (understanding-block-blobs--append-blobs--and-page-blobs), the SingleBlobUploadThresholdInBytes option only determines how the uploaded data are split into chunks (blocks) when they are bigger than SingleBlobUploadThresholdInBytes. The docs also state: "Maximum blob size (via Put Block List): Approximately 190.7 TiB (4000 MiB X 50,000 blocks)".

    So there is nothing about a minimum billed data amount; SingleBlobUploadThresholdInBytes relates to the maximum size of the entire blob.
    The use case for the stored data is storing audit analysis data as XML, and there is only a small probability that a given file will ever be read, maybe 1:100 000, so read efficiency is not my concern.

    In Cost Management + Billing in the Azure portal I can see that €0.79 of ZRS data stored was billed for this first week of April,
    and in the container stats I can see 390 GiB of blob capacity used and 10 M blobs. 390 GiB / 10 M is about 39 KiB, which is the size of these files. So I am perfectly sure there is no billing unit like filesystem clusters.
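    That back-of-the-envelope check can also be run directly against the container with the Python SDK (a sketch; the connection string and container name are placeholders, and listing 10 M blobs takes a while):

    ```python
    from azure.storage.blob import ContainerClient

    # Placeholder connection string and container name.
    container = ContainerClient.from_connection_string(
        conn_str="<connection-string>", container_name="container")

    # Sum the actual blob sizes; the average should be ~39 KiB if there is
    # no cluster-style rounding in the billed capacity.
    total_bytes = 0
    count = 0
    for blob in container.list_blobs():
        total_bytes += blob.size
        count += 1

    print(f"{count} blobs, {total_bytes / 2**30:.1f} GiB total, "
          f"{total_bytes / count / 1024:.1f} KiB average per blob")
    ```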

    Thanks.

