Azure Blob Storage upload large files

net 6 newbie 121 Reputation points
2022-12-11T07:04:56.087+00:00

I am writing a C# Console Application that uploads Local Files to Azure Blob Storage.

While uploading large files (say, .pdf files here) I am doing two things:

1. Breaking the file into blocks.
2. Using SemaphoreSlim while staging blocks on Azure cloud storage.
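The two steps above can be sketched roughly as follows. This is a minimal illustration, assuming the Azure.Storage.Blobs 12.x SDK (`BlockBlobClient.StageBlockAsync` / `CommitBlockListAsync`); the block size, parallelism limit, and helper name are my own choices, not taken from the question:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using Azure.Storage.Blobs.Specialized;

static async Task UploadInBlocksAsync(BlockBlobClient blob, string path)
{
    const int blockSize = 4 * 1024 * 1024;      // 4 MiB per block (illustrative)
    using var throttle = new SemaphoreSlim(4);  // limit concurrent StageBlock calls
    var blockIds = new List<string>();          // kept in file order
    var tasks = new List<Task>();

    using var file = File.OpenRead(path);
    int index = 0;
    while (true)
    {
        var buffer = new byte[blockSize];
        int read = await file.ReadAsync(buffer, 0, blockSize);
        if (read == 0) break;

        // Block IDs must be Base64 strings of equal length within one blob.
        string blockId = Convert.ToBase64String(
            Encoding.UTF8.GetBytes((index++).ToString("d6")));
        blockIds.Add(blockId);                  // record the order *before* staging

        await throttle.WaitAsync();
        tasks.Add(Task.Run(async () =>
        {
            try
            {
                using var block = new MemoryStream(buffer, 0, read);
                await blob.StageBlockAsync(blockId, block);
            }
            finally { throttle.Release(); }
        }));
    }

    await Task.WhenAll(tasks);

    // The commit defines the final sequence: blocks may finish staging in any
    // order, but the blob is assembled in the order of this list.
    await blob.CommitBlockListAsync(blockIds);
}
```

After committing, `GetBlockListAsync` can be used to read back the committed block list and confirm it matches `blockIds` in both content and order.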
Is there any way to test that the blocks are in the expected sequence and that the PDF is not corrupted after committing the blocks?

What I tried is reading the file again after upload and validating the signature, but in cases where some pages or some portion of the PDF are missing, how can I detect that kind of issue?


Accepted answer
  1. Sumarigo-MSFT 47,471 Reputation points Microsoft Employee Moderator
    2022-12-12T06:11:49.017+00:00

    @net 6 newbie Welcome to Microsoft Q&A Forum, Thank you for posting your query here!

    Block blobs include features that help you manage large files over networks. With a block blob, you can upload multiple blocks in parallel to decrease upload time. Each block can include an MD5 hash to verify the transfer, so you can track upload progress and re-send blocks as needed. You can upload blocks in any order, and determine their sequence in the final block list commitment step. You can also upload a new block to replace an existing uncommitted block of the same block ID. You have one week to commit blocks to a blob before they are discarded. All uncommitted blocks are also discarded when a block list commitment operation occurs but does not include them.
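Per the paragraph above, each Put Block (stage) call can carry a transactional MD5 that the service verifies on receipt, rejecting the request on a mismatch. A hedged sketch, assuming the Azure.Storage.Blobs 12.x `StageBlockAsync` overload that accepts a transactional content hash (the helper name is illustrative):

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Threading.Tasks;
using Azure.Storage.Blobs.Specialized;

// Stage one block and let the service verify the received bytes against an
// MD5 computed client-side. A mismatch fails the request, so a block that was
// corrupted in transit is rejected instead of being silently staged.
static async Task StageBlockWithMd5Async(
    BlockBlobClient blob, string blockId, byte[] buffer, int count)
{
    byte[] md5 = MD5.HashData(buffer.AsSpan(0, count));
    using var content = new MemoryStream(buffer, 0, count);
    await blob.StageBlockAsync(blockId, content, transactionalContentHash: md5);
}
```

This guards each block's transfer; it does not by itself prove the committed blob as a whole matches the local file, which is what the whole-file hash below is for.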

    Learn more here

Additional information: MD5 hash calculation for large files

    Calculate & Validate MD5 hashes on Azure blob storage files with PowerShell
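Beyond per-block hashes, you can compute an MD5 of the whole local file, store it as the blob's Content-MD5 at commit time, and compare later. A sketch in C# rather than PowerShell, assuming Azure.Storage.Blobs 12.x (`CommitBlockListOptions`, `BlobHttpHeaders`); the method name is my own:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
using System.Threading.Tasks;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

static async Task<bool> CommitAndVerifyAsync(
    BlockBlobClient blob, string localPath, IEnumerable<string> blockIds)
{
    // Hash the whole local file once.
    byte[] localMd5;
    using (var md5 = MD5.Create())
    using (var file = File.OpenRead(localPath))
        localMd5 = md5.ComputeHash(file);

    // Store the hash as the blob's Content-MD5 header at commit time.
    await blob.CommitBlockListAsync(blockIds, new CommitBlockListOptions
    {
        HttpHeaders = new BlobHttpHeaders { ContentHash = localMd5 }
    });

    // Later, fetch properties and compare the stored hash to the local one.
    BlobProperties props = await blob.GetPropertiesAsync();
    return props.ContentHash != null && props.ContentHash.SequenceEqual(localMd5);
}
```

Note that a Content-MD5 set at commit is metadata the service stores as-is; it is not recomputed from the committed blocks. For a true end-to-end check, download the blob (for example with AzCopy's MD5 verification, below) and compare the recomputed hash against the local file's hash.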

I would also recommend using the **AzCopy** tool to upload files: the `--put-md5` flag calculates the MD5 on the source, not the destination, and puts the value in the Content-MD5 HTTP header. When you download, AzCopy can recalculate the MD5 and verify that it is equal to the one you uploaded.

    Data integrity and validation: https://github.com/Azure/azure-storage-azcopy/wiki/Data-integrity-and-validation

With `azcopy copy --put-md5`, the MD5 hash is calculated and stored automatically.

If the issue still persists, please share your code (with any personal information removed) and I'm happy to assist you further.


Please do not forget to "Accept Answer" and "Up-Vote" wherever the information provided helps you; this can be beneficial to other community members.

    1 person found this answer helpful.

0 additional answers
