Azure MARS agent VHD metadata failure

Andrew Howard 1 Reputation point
2022-04-22T09:15:31.147+00:00

Hello all, we'd love to use MARS, but it appears to be unreliable.
This is a new installation. Latest agent installed.
The backup set is a folder structure of 9TB of files and folders on an NTFS server volume, Windows Server 2016.
The job never completes the metadata VHD generation stage: it takes a very long time to fail and then restarts.
The error in the window is "The current backup operation failed because the number of data transfer failures was more than 1000. Specified files could not be processed. 0x186D9"
The VHD appears to get corrupted after almost 12 hours.
Event Viewer logs show event IDs such as 55 and 98 at the time of the failure.
The NTFS log also has an entry like this Event ID 305:
NTFS failed to mount the volume.
Event 305
Error: {Device Offline}
The printer has been taken offline.
Volume GUID: 3484a289-6bce-4bff-ae7c-0b3c325b0e37
Volume Name: Volume3484a289-6bce-4bff-ae7c-0b3c325b0e37
The volume is recognized by NTFS but it is corrupted that NTFS could not mount it. Run CHKDSK /F to fix any errors on this volume, and then try accessing it.

This has happened 3 times.
I have looked through the logs in the agent's \temp folder.
Entries such as the ones below appear frequently.
MetadataVhdWriter.cpp(337) [0000022B70C0B090] 370EAC9B-DAFD-47B9-BE82-9887E87356DD WARNING Failed: Hr: = [0x80070570] : Encountered Failure: : lVal : hr
190C 2C08 04/22 02:24:04.867 32 VhdMgmt.cpp(5221) [0000022B69ADBD90] 370EAC9B-DAFD-47B9-BE82-9887E87356DD WARNING Failed: Hr: = [0x80070570] Failed to get free volume space for Metadata vhd volume path = [\Volume3484a289-6bce-4bff-ae7c-0b3c325b0e37]
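
To see how often each failure code turns up, log lines like the two above can be tallied with a short script. This is only a minimal sketch: the regular expression is an assumption based on the layout of the two entries quoted here, and the name `tally_hresults` is hypothetical, not part of the MARS agent.

```python
import re
from collections import Counter

# Matches "<source>.cpp(<line>)" followed later by "Hr: = [0x........]".
# The pattern is an assumption drawn from the two log entries quoted above.
LOG_PATTERN = re.compile(
    r"(?P<source>\w+\.cpp)\((?P<line>\d+)\).*?Hr: = \[(?P<hr>0x[0-9A-Fa-f]{8})\]"
)

def tally_hresults(lines):
    """Count (source file, HRESULT) pairs found in an iterable of log lines."""
    counts = Counter()
    for text in lines:
        match = LOG_PATTERN.search(text)
        if match:
            counts[(match.group("source"), match.group("hr").lower())] += 1
    return counts

# The two entries quoted in this post, used as sample input.
sample = [
    "MetadataVhdWriter.cpp(337) [0000022B70C0B090] "
    "370EAC9B-DAFD-47B9-BE82-9887E87356DD WARNING Failed: "
    "Hr: = [0x80070570] : Encountered Failure: : lVal : hr",
    "190C 2C08 04/22 02:24:04.867 32 VhdMgmt.cpp(5221) [0000022B69ADBD90] "
    "370EAC9B-DAFD-47B9-BE82-9887E87356DD WARNING Failed: "
    "Hr: = [0x80070570] Failed to get free volume space for Metadata vhd "
    "volume path = [\\Volume3484a289-6bce-4bff-ae7c-0b3c325b0e37]",
]

for (source, hr), count in tally_hresults(sample).items():
    print(f"{source}: {hr} x {count}")
```

In practice the same function could be fed the lines of a log file from the agent's \temp folder instead of the `sample` list.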

The job fails, then retries and remounts the VHD approximately 15 minutes later, but it seems to redo the entire VHD generation.

c:\hh>err_6.4.5 0x80070003
for hex 0x80070003 / decimal -2147024893
COR_E_DIRECTORYNOTFOUND corerror.h
The specified path couldn't be found.
The system cannot find the path specified.
matches found for "0x80070003"

c:\hh>err_6.4.5 0x80070570
ERROR_FILE_CORRUPT winerror.h
The file or directory is corrupted and unreadable.
1 matches found for "0x80070570"
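
The lookups above can be cross-checked by splitting each HRESULT into its standard bit fields. The following is a minimal sketch; the helper names `decode_hresult` and `to_signed` are hypothetical and are not part of the err tool.

```python
def decode_hresult(value):
    """Split a 32-bit HRESULT into its severity, facility, and code fields."""
    value &= 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),       # bit 31 set means failure
        "facility": (value >> 16) & 0x7FF,  # bits 16-26; 7 is FACILITY_WIN32
        "code": value & 0xFFFF,             # bits 0-15: the underlying error code
    }

def to_signed(value):
    """Return the signed 32-bit decimal form of an HRESULT."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

for hr in (0x80070003, 0x80070570):
    fields = decode_hresult(hr)
    print(f"0x{hr:08X} = {to_signed(hr)}: facility {fields['facility']}, "
          f"code {fields['code']} (0x{fields['code']:X})")
```

Both values carry facility 7 (Win32), so the low 16 bits are ordinary Win32 error codes: 3 (ERROR_PATH_NOT_FOUND) and 1392 (ERROR_FILE_CORRUPT).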

I think the errors indicate a problem with the destination VHD rather than the source: the source files are accessible and readable and do not seem to be corrupt in any way.

Azure Backup
An Azure backup service that provides built-in management at scale.
1,134 questions

2 answers

  1. SadiqhAhmed-MSFT 38,401 Reputation points Microsoft Employee
    2022-04-22T19:05:46.377+00:00

    @Andrew Howard Thank you for contacting us!

    From the error details I can see that you are using the MARS agent to back up a 9TB folder structure and that the backup is erroring out. This error code occurs when there is low disk space on drive C:.

    To confirm the space needed for the scratch folder, please check the following link: https://learn.microsoft.com/en-us/azure/backup/backup-azure-file-folder-backup-faq#what-s-the-minimum-size-requirement-for-the-cache-folder-

    Another option is to relocate the scratch folder to another location. To change it, please see the following information: https://learn.microsoft.com/en-us/azure/backup/backup-azure-file-folder-backup-faq#how-do-i-change-the-cache-location-for-the-mars-agent-

    In your case, I also noticed that the folder size is beyond the supported backup size limits. Refer to the support matrix: https://learn.microsoft.com/en-us/azure/backup/backup-support-matrix-mars-agent#backup-limits

    Recommendation: Please back up the data in smaller chunks to avoid failures.

    Hope this helps!

    Update:

    As the entire folder backup is 9TB (far less than the 54TB maximum that is specified), am I correct in thinking that there were too many files in the entire job from the perspective of the root folder level, and that specifying the job using subfolders instead will overcome this?

    Correct!

    ----------------------------------------------------------------------------------------------------------------------

    If the response helped, please click "Accept Answer" and up-vote it.

    1 person found this answer helpful.

  2. Andrew Howard 1 Reputation point
    2022-05-05T09:27:26.38+00:00

    Hello. By way of an update, I have noticed that at 01:30 on 4th May, when the job was scheduled to run, it basically started again. No errors are reported in the MARS GUI.
    I had repeated the same steps as on previous days and added a few extra folders to the job in order to build it up. However, we seem to have gone from over 3TB stored back to the beginning: the VHD recreation process began again, and we're currently at 1.3TB uploaded rather than the 4+TB that I expected to see. What is happening? Is the weekly retention (set at 7 days) causing this?

    199127-image.png

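
The scratch (cache) folder sizing raised in the first answer can be sanity-checked before starting a large job. The sketch below assumes a guideline of 5 to 10 percent of the backup data size as free space on the cache volume; that range is an assumption here and should be confirmed against the FAQ linked in that answer. The function names and the 1024-based TB unit are also illustrative choices.

```python
import shutil

TB = 1024 ** 4  # assumption: binary terabytes

def required_cache_bytes(backup_bytes, low_pct=5, high_pct=10):
    """Return the (low, high) free-space range for the cache volume.

    The 5-10 percent range is an assumed guideline; verify it against
    the Azure Backup FAQ before relying on it.
    """
    return backup_bytes * low_pct // 100, backup_bytes * high_pct // 100

def cache_volume_ok(path, backup_bytes):
    """Check whether the volume holding `path` has the upper-bound free space."""
    _, high = required_cache_bytes(backup_bytes)
    return shutil.disk_usage(path).free >= high

low, high = required_cache_bytes(9 * TB)
print(f"Free space needed for a 9TB job: {low / TB:.2f}TB to {high / TB:.2f}TB")
```

Calling `cache_volume_ok` with the actual scratch folder path and the size of the data to be backed up would show whether relocating the scratch folder is worth trying.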