Is it necessary to validate data copy or upload to Azure Block Blob Storage with AzCopy or Azure Storage Explorer?

Paul Grimwood 40 Reputation points
2023-04-27T13:26:29.59+00:00

I am considering using an Azure General Purpose V2 account to store Next Generation Sequencing (NGS) fastq.gz files as block blobs. The files are between 1 and 12 GB and are on a Linux VM. I was originally considering AzCopy as the client tool to manage the copy to the Azure store, but our collaborators are already using Azure Storage Explorer, so there are reasons to go down that route.

The question, however, is more about data transfer validation: how can we be certain that the blob, once written as multiple blocks, is an exact copy of the local file? I have read that the only way to validate this is to download the blob and compare the MD5 checksums of the original file and the downloaded copy, or to copy the blob to an Azure VM and check the MD5 of the remote copy.

When we move files around our infrastructure and I want to guarantee integrity, I use rsync, as it uses checksums integrally to verify the transfer. Do the AzCopy copy process and the Azure Storage Explorer upload process utilise checksums integrally, or some other mechanism, to guarantee that the blob created accurately reflects the local file and that nothing was lost or modified during the transfer?

Azure Storage Accounts
Globally unique resources that provide access to data management services and serve as the parent namespace for the services.

Accepted answer
  1. Sumarigo-MSFT 46,126 Reputation points Microsoft Employee
    2023-04-29T05:57:00.2766667+00:00

    @Paul Grimwood Welcome to the Microsoft Q&A Forum, and thank you for posting your query here!

    Azure Storage Explorer already uses AzCopy as its backend.

    Downloading the blob to a null destination (for example, /dev/null on Unix-based systems or NUL on Windows) with --check-md5 validates the transfer without keeping a local copy.
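
    As an independent cross-check, the local file's MD5 can be computed and compared with the blob's Content-MD5 property, which holds a base64-encoded MD5 digest. The sketch below is only an illustration and is not part of AzCopy: the function names `content_md5` and `matches_blob` are hypothetical, and it assumes the blob actually carries a Content-MD5 value (to my understanding AzCopy only stores one on upload when `--put-md5` is passed, so please verify that against the AzCopy documentation for your version).

```python
import base64
import hashlib


def content_md5(path, chunk_size=4 * 1024 * 1024):
    """Return the base64-encoded MD5 digest of the file at `path`.

    The file is read in chunks so that multi-gigabyte fastq.gz files
    do not have to fit in memory. The 4 MiB chunk size is an arbitrary
    choice for this sketch.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")


def matches_blob(path, blob_content_md5):
    """Compare the local digest with a Content-MD5 string.

    `blob_content_md5` is assumed to be the base64 value copied from
    the blob's properties (hypothetical input, obtained separately).
    """
    return content_md5(path) == blob_content_md5.strip()
```

    The Content-MD5 value can be read from the blob's properties in Azure Storage Explorer and passed to `matches_blob` together with the local path. A match shows only that the stored hash equals the local one; the download with --check-md5 described above is what re-reads the blob data itself.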

    We plan to add thorough validation of a blob at upload time, but there is no timeline yet and that work is still some way off.

    Please let us know if you have any further queries. I’m happy to assist you further.


    Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you; this can be beneficial to other community members.

    1 person found this answer helpful.
