I'm currently having problems replicating two Hyper-V guest machines to Azure using Site Recovery. So far my best guess is that the cause is a certificate of some sort, but I haven't been able to pin down anything specific.
The site's heartbeat was fine; only the replication to the vault was failing. Both the Azure Site Recovery Provider and the Microsoft Azure Recovery Services Agent were up to date.
Originally I had the following replication health error in the Azure portal for the replicated VMs:
Error ID
68501
Error Message
The virtual machine couldn't be replicated.
Provider error
Replication for virtual machine '<guest_name>' is in error. Fix the error(s) and resume replication.
Possible causes
Replication errors occurred for virtual machine '<guest_name>' in cloud/site '<site_name>' because of issues with connectivity to Azure storage.
Recommendation
1. If Identity is enabled on the Recovery Services Vault, please make sure the log/target Azure storage account
has the necessary permissions to access the storage account.
a) Go to your Storage account -> Access Control (IAM).
b) Add the below role-assignments (for ARM based storage account) to the Recovery services vault.
1) "Contributor" and,
2) "Storage Blob Data Contributor" for Standard storage or "Storage Blob Data Owner" for Premium storage
2. Fix any issues in the Event Viewer logs (Applications and Service Logs - MicrosoftAzureRecoveryServices) on the Hyper-V host server and resume the replication.
Attempting to resume the replication on the Hyper-V server produced the following error in the Hyper-V-VMMS\Admin event log:
Hyper-V could not replicate changes for virtual machine '<guest_name>': The system cannot find the file specified. (0x80070002). (Virtual machine ID <guid>)
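For reference, the HRESULT in that event can be split into its standard fields to confirm what it means. This is only a small sketch: the helper name `decode_hresult` is my own, and it assumes the usual HRESULT layout (failure bit, 11-bit facility, 16-bit code). It shows that 0x80070002 is just the Win32 "file not found" error (code 2) wrapped in an HRESULT, which on its own says nothing about which file is missing.

```python
# Sketch: split an HRESULT into its standard fields.
# decode_hresult is a hypothetical helper name, not part of any Azure or Hyper-V tool.

FACILITY_WIN32 = 7          # facility used when a Win32 error is wrapped in an HRESULT
ERROR_FILE_NOT_FOUND = 2    # Win32 error code for "The system cannot find the file specified."


def decode_hresult(hresult: int) -> dict:
    """Return the failure flag, facility, and error code of a 32-bit HRESULT."""
    hresult &= 0xFFFFFFFF                      # normalise to an unsigned 32-bit value
    return {
        "failure": bool(hresult >> 31),        # bit 31: severity (1 = failure)
        "facility": (hresult >> 16) & 0x7FF,   # bits 16-26: facility
        "code": hresult & 0xFFFF,              # bits 0-15: error code
    }


if __name__ == "__main__":
    info = decode_hresult(0x80070002)
    print(info)
    print("Win32 file-not-found:",
          info["facility"] == FACILITY_WIN32 and info["code"] == ERROR_FILE_NOT_FOUND)
```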
A bit of searching seems to indicate that this could be a certificate error, so I kicked off a certificate renewal in the Site Recovery console, which succeeded. After that I gave it 24 hours, but I was still unable to resume the replication and the same errors were logged.
My next step was to try reinstalling the Azure Site Recovery Provider and re-registering it with a fresh credential file from Azure. This failed with the error "The ASR cannot be registered due to an internal error. Run Setup again to register the server."
Looking at the log files I see the following:
09:39:19:Registration Starting
09:39:19:Initializing AAD based registration client.
09:39:19:Service dra status: True
09:39:19:Stopping DR - dra service.
09:39:19:Get resource token
09:39:19:Initializing AAD Library.
09:39:36:Got exception Caught exception while acquiring AAD token:
CorrelationId: '80a9c2fd-aaa6-4e3e-be73-e9312e31e492'
Message: 'One or more errors occurred.'.. Retry count 1.
09:39:41:Get resource token
09:39:41:Initializing AAD Library.
09:39:56:Got exception Caught exception while acquiring AAD token:
CorrelationId: '2d54b204-9505-4ba5-aa34-ec0a1accd1b4'
Message: 'One or more errors occurred.'.. Retry count 2.
09:40:21:Get resource token
09:40:21:Initializing AAD Library.
09:40:37:Got exception Caught exception while acquiring AAD token:
CorrelationId: '2ad27559-b347-47ce-b70e-9965c31c15c9'
Message: 'One or more errors occurred.'.. Retry count 3.
09:42:42:Get resource token
09:42:42:Initializing AAD Library.
09:42:58:Setting return value override to 'GetResourceTokenFailure'.
09:42:58:Exception while trying to get service resource claim: : Threw Exception.Type: SrsRestApiClientLib.SrsException, Exception.Message: Caught exception while acquiring AAD token:
CorrelationId: 'd6f9f95d-6cb9-4a73-b1e7-31dad4a64fc5'
Message: 'One or more errors occurred.'.
09:42:58:StackTrace: at SrsRestApiClientLib.Aad.GetToken()
at Microsoft.DisasterRecovery.Registration.AadBasedRegistrationClient.GetResourceToken(String managementCertThumbprint, String resourceId, String& idManagementUri)
at Microsoft.DisasterRecovery.Configurator.RegisterActionProcessor.FetchResourceToken()
09:42:58:InnerException.Type: System.AggregateException, InnerException.Message: One or more errors occurred.
09:42:58:InnerException.StackTrace: at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
at SrsRestApiClientLib.Aad.GetToken()
09:42:58:InnerException.Type: Microsoft.Identity.Client.MsalServiceException, InnerException.Message: A configuration issue is preventing authentication - check the error message from the server for details. You can modify the configuration in the application registration portal. See https://aka.ms/msal-net-invalid-client for details. Original exception: AADSTS700027: The certificate with identifier used to sign the client assertion is not registered on application. [Reason - The key was not found., Thumbprint of key used by client: 'FCEE85256F9E201EA37DF149A999914BADEF5F42', Please visit the Azure Portal, Graph Explorer or directly use MS Graph to see configured keys for app Id '91067b97-9cb7-4a70-80dc-e5122447fc45'. Review the documentation at https://docs.microsoft.com/en-us/graph/deployments to determine the corresponding service endpoint and https://docs.microsoft.com/en-us/graph/api/application-get?view=graph-rest-1.0&tabs=http to build a query request URL, such as 'https://graph.microsoft.com/beta/applications/91067b97-9cb7-4a70-80dc-e5122447fc45'].
Trace ID: fd49dc14-1e37-48cd-acb2-658cd3866b00
Correlation ID: d6f9f95d-6cb9-4a73-b1e7-31dad4a64fc5
Timestamp: 2023-03-12 11:43:12Z
09:42:58:InnerException.StackTrace: at Microsoft.Identity.Client.Internal.Requests.RequestBase.HandleTokenRefreshError(MsalServiceException e, MsalAccessTokenCacheItem cachedAccessTokenItem)
at Microsoft.Identity.Client.Internal.Requests.ClientCredentialRequest.<ExecuteAsync>d__2.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Identity.Client.Internal.Requests.RequestBase.<RunAsync>d__12.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Identity.Client.ApiConfig.Executors.ConfidentialClientExecutor.<ExecuteAsync>d__3.MoveNext()
09:42:58:FetchResourceToken failed.
09:43:09:Application Ended
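To make the relevant details in that log easier to compare against the certificates on the host, something like the following could pull them out. It is a rough sketch rather than anything the provider ships: the function names and regular expressions are my own, and they assume the message wording stays as it appears above. The second helper computes a SHA-1 thumbprint from a certificate's DER bytes (for example, from an exported .cer file), which is how Windows derives the thumbprint it displays.

```python
import hashlib
import re

# Patterns are assumptions based on the wording of the MSAL message in the log above.
AADSTS_RE = re.compile(r"\b(AADSTS\d+)\b")
THUMBPRINT_RE = re.compile(r"Thumbprint of key used by client: '([0-9A-Fa-f]{40})'")
APP_ID_RE = re.compile(r"app Id '([0-9A-Fa-f-]{36})'")


def summarize_aad_failure(log_text: str) -> dict:
    """Extract the AADSTS code, client key thumbprint, and app Id from log text."""
    def first(pattern):
        match = pattern.search(log_text)
        return match.group(1) if match else None

    return {
        "error_code": first(AADSTS_RE),
        "thumbprint": first(THUMBPRINT_RE),
        "app_id": first(APP_ID_RE),
    }


def sha1_thumbprint(der_bytes: bytes) -> str:
    """Return the uppercase SHA-1 thumbprint of a DER-encoded certificate."""
    return hashlib.sha1(der_bytes).hexdigest().upper()


if __name__ == "__main__":
    sample = (
        "AADSTS700027: The certificate with identifier used to sign the client "
        "assertion is not registered on application. [Reason - The key was not found., "
        "Thumbprint of key used by client: 'FCEE85256F9E201EA37DF149A999914BADEF5F42', "
        "Please visit the Azure Portal, Graph Explorer or directly use MS Graph to see "
        "configured keys for app Id '91067b97-9cb7-4a70-80dc-e5122447fc45'."
    )
    print(summarize_aad_failure(sample))
```

If the thumbprint computed from a certificate on the host matches the one in the log, that certificate is the one the provider signed with, and the error says it is not among the keys registered on the application with that app Id.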
Any insight would be greatly appreciated.