Azure Blob Storage stageBlock: Return contentMd5 as a header

Jacob Nguyen 20 Reputation points
2025-01-03T21:20:43.35+00:00

So I noticed that the only way to get checksum validation on large files is to supply contentMd5 before uploading a particular block.

I don't have the block checksum before I upload the block. I'm transferring data from cloud to cloud, so I'd rather calculate checksums while I upload the file to Azure. I don't want to pull the data into local storage or memory first, since that isn't manageable for files this large.

Could this be filed as a feature request? I'd like to send a header asking the server to calculate the checksum, and have the server return the block checksum so that I can use it to validate my upload.

Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.

Accepted answer
  1. Vinod Kumar Reddy Chilupuri 2,225 Reputation points Microsoft Vendor
    2025-01-04T01:01:11.11+00:00

    Hi @Jacob Nguyen

    Welcome to Microsoft Q&A, thanks for posting your query.

    Azure Blob Storage does not natively support a feature where the server calculates and returns the MD5 checksum of a block after it has been uploaded. The typical workflow requires the client to compute the MD5 checksum before uploading the block and then send it as part of the request.

    The best practice is to always calculate the hash of a downloaded blob and keep it as a baseline.
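As a rough illustration of the baseline check described above, here is a minimal Python sketch that hashes a downloaded stream in chunks and compares the result with a stored base64 MD5. The function name `md5_matches`, the `chunk_size` default, and the way the expected value is obtained are all assumptions made for illustration; they are not part of any Azure SDK.

```python
import base64
import hashlib
from typing import BinaryIO


def md5_matches(stream: BinaryIO, expected_b64: str, chunk_size: int = 1024 * 1024) -> bool:
    """Hash a readable binary stream chunk by chunk and compare with a base64 MD5.

    `expected_b64` is assumed to be the baseline value kept from an earlier
    download, or the blob's Content-MD5 property (which is base64-encoded).
    """
    digest = hashlib.md5()
    # Read fixed-size chunks so the whole blob never has to sit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    actual_b64 = base64.b64encode(digest.digest()).decode("ascii")
    return actual_b64 == expected_b64
```

Because the stream is consumed in chunks, memory use stays bounded by `chunk_size` regardless of blob size.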

    Blobs uploaded with PutBlob will have Content-MD5 calculated by the Storage service. Blobs uploaded with PutBlock/PutBlockList won't; the client needs to calculate the hash locally and set it explicitly in the x-ms-blob-content-md5 blob property. When the client doesn't do so, the property is empty. The recommendation above is based on these different cases. Reference: https://technet2.github.io/Wiki/blogs/windowsazurestorage/windows-azure-blob-md5-overview.html

     

    For larger files, Storage does not calculate the MD5 hash of the full blob, because each block is written separately. You can work around this by calculating the MD5 hash yourself and setting it when you upload the file. Note that the MD5 hash must be base64-encoded. The example below uploads a blob while calculating and setting the Content-MD5 property:

    az storage blob upload -c test -n md5test -f ./test.img --content-md5 "$(cat test.img | openssl dgst -md5 -binary | base64)"
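The command above reads the whole file before uploading. For the cloud-to-cloud case in the question, a possible alternative is to compute both the per-block MD5 and the whole-blob MD5 while the data streams through the client, holding only one block in memory at a time. The following is a minimal Python sketch under stated assumptions: `stream_with_md5`, `iter_blocks`, `BLOCK_SIZE`, and the `stage` callback are hypothetical names, and `stage` stands in for whatever call actually sends a block (for example, a Put Block request carrying the block's MD5 as its Content-MD5 header). It is not an Azure SDK API.

```python
import base64
import hashlib
import io
from typing import BinaryIO, Callable, Iterator, List, Tuple

BLOCK_SIZE = 4 * 1024 * 1024  # hypothetical block size; tune for your workload


def iter_blocks(stream: BinaryIO, block_size: int) -> Iterator[bytes]:
    """Yield successive reads from a binary stream until it is exhausted.

    A network stream may return fewer bytes than requested; each read
    simply becomes its own (smaller) block.
    """
    while True:
        block = stream.read(block_size)
        if not block:
            return
        yield block


def stream_with_md5(
    stream: BinaryIO,
    stage: Callable[[str, bytes, str], None],
    block_size: int = BLOCK_SIZE,
) -> Tuple[List[str], str]:
    """Send blocks through `stage` and return (block IDs, base64 whole-blob MD5)."""
    whole = hashlib.md5()
    block_ids: List[str] = []
    for index, block in enumerate(iter_blocks(stream, block_size)):
        whole.update(block)  # running hash of the full blob
        block_md5 = base64.b64encode(hashlib.md5(block).digest()).decode("ascii")
        # Fixed-width, base64-encoded IDs so every ID has the same length.
        block_id = base64.b64encode(f"{index:08d}".encode("ascii")).decode("ascii")
        stage(block_id, block, block_md5)  # hypothetical: upload one block
        block_ids.append(block_id)
    return block_ids, base64.b64encode(whole.digest()).decode("ascii")


if __name__ == "__main__":
    # In-memory demo: the "upload" just records what it was given.
    staged = []
    ids, blob_md5 = stream_with_md5(
        io.BytesIO(b"hello"), lambda i, d, m: staged.append((i, m)), block_size=2
    )
    print(len(ids), blob_md5)
```

The returned whole-blob value is what would be set as the x-ms-blob-content-md5 property when the block list is committed, so a later download can be checked against it.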

    • MD5 hash checks on Azure Blob Storage files
    • Smaller files are not an issue, since all files smaller than 64 MB will have Content-MD5 populated by the platform. For larger files, we can either use an Azure Function that reacts to events, or run a batch operation by spinning up some VMs to calculate the MD5.

    For the error “The MD5 hash calculated from the downloaded data does not match the MD5 hash stored in the property of source”, see here for more information.

     

    Hope the above answer helps! Please let us know if you have any further queries.


