Known issues - Azure Site Recovery on Azure Stack Hub

Caution

This article references CentOS, a Linux distribution that is nearing End Of Life (EOL) status. Please consider your use and plan accordingly. For more information, see the CentOS End Of Life guidance.

This article describes known issues for Azure Site Recovery on Azure Stack Hub. Use the following sections for details about the current known issues and limitations in Azure Site Recovery on Azure Stack Hub.

Re-protection: available data disk slots on appliance

  1. Ensure the appliance VM has enough data disk slots, as the replica disks for re-protection are attached to the appliance.

  2. By default, up to 31 disks can be re-protected at the same time. The default size of the appliance created from the marketplace item is Standard_DS4_v2, which supports up to 32 data disks; the appliance itself uses one of them.

  3. If the total number of data disks across the VMs that require re-protection is greater than 31, perform one of the following actions:

    • Split the VMs that require re-protection into smaller groups to ensure that the number of disks re-protected at the same time doesn't exceed the maximum number of data disks the appliance supports.
    • Increase the size of the Azure Site Recovery appliance VM.

    Note

    We do not test and validate large VM SKUs for the appliance VM.

  4. If you try to re-protect a VM but there aren't enough slots on the appliance to hold the replication disks, the error message An internal error occurred is displayed. You can check the number of data disks currently attached to the appliance (for example, with the PowerShell sketch after this list), or sign in to the appliance, open Event Viewer, and open the logs for Azure Site Recovery under Applications and Services Logs:

    Sample screenshot of Event Viewer for Azure Site Recovery.

    Sample screenshot of Azure Site Recovery logs.

    Find the latest warning to identify the issue.
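
As a cross-check, the following PowerShell sketch shows one way to count the data disks attached to the appliance VM, and to resize the appliance if you need more slots. It's a minimal sketch, assuming the AzureRM PowerShell module is signed in to the Azure Stack Hub user subscription that hosts the appliance; the resource group name, VM name, and target size are placeholders, not values from this article. Keep in mind the note above that larger VM SKUs for the appliance aren't tested and validated.

    # Sketch: count the data disks attached to the Azure Site Recovery appliance VM.
    # "asr-appliance-rg" and "asr-appliance-vm" are placeholder names; replace them
    # with the resource group and name of your own appliance VM.
    $applianceVm = Get-AzureRmVM -ResourceGroupName "asr-appliance-rg" -Name "asr-appliance-vm"
    $dataDiskCount = $applianceVm.StorageProfile.DataDisks.Count
    Write-Output "Data disks attached to the appliance: $dataDiskCount"

    # Optional: resize the appliance to a larger SKU if more data disk slots are needed.
    # "Standard_DS5_v2" is only an example; the size must be available in your Azure Stack Hub environment.
    $applianceVm.HardwareProfile.VmSize = "Standard_DS5_v2"
    Update-AzureRmVM -ResourceGroupName "asr-appliance-rg" -VM $applianceVm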

Linux VM kernel version not supported

  1. Check your kernel version by running the command uname -r.

    Sample screenshot of Linux Kernel version.

    For more information about supported Linux kernel versions, see Azure to Azure support matrix.

  2. Even with a supported kernel version, a failover restarts the VM, and that restart can update the failed-over VM to a newer kernel version that may not be supported. To prevent a kernel update caused by the failover restart, run the command sudo apt-mark hold linux-image-azure linux-headers-azure to hold the kernel packages at their current version.

  3. For an unsupported kernel version, check for an older kernel version to which you can roll back, by running the appropriate command for your VM:

    • Debian/Ubuntu: dpkg --list | grep linux-image
    • RHEL/CentOS: rpm -qa kernel

    The following image shows an example of an Ubuntu VM running kernel version 5.4.0-1103-azure, which is unsupported. After the command runs, you can see that a supported version, 5.4.0-1077-azure, is already installed on the VM. With this information, you can roll back to the supported version.

    Sample screenshot of an Ubuntu VM kernel version check.

  4. Roll back to a supported kernel version using these steps:

    1. First, make a copy of /etc/default/grub in case there's an error; for example, sudo cp /etc/default/grub /etc/default/grub.bak.

    2. Then, modify /etc/default/grub to set GRUB_DEFAULT to the previous version that you want to use. You might have something similar to GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 5.4.0-1077-azure".

      Sample screenshot of an Ubuntu VM kernel version rollback.

    3. Select Save to save the file, then select Exit.

    4. Run sudo update-grub to update GRUB.

    5. Finally, reboot the VM to complete the rollback to the supported kernel version.

  5. If you don't have an older kernel version to which you can roll back, wait for a mobility agent update that adds support for your kernel. The update is applied automatically when it's available, and you can check the agent version in the portal to confirm:

    Sample screenshot of mobility agent update check.

Re-protect manual resync isn't supported yet

After the re-protect job is complete, replication starts. During replication, some situations require a resync, which means a new initial replication is triggered to synchronize all the changes.

There are two types of resync:

  • Automatic resync. Requires no user action; the resync runs automatically, and you can see the related events in the portal:

    Sample screenshot of Automatic resync on the Users portal.

  • Manual resync. Requires you to trigger the resync manually, and is needed in the following cases:

    • The storage account chosen for the re-protect is missing.

    • The replication disk on the appliance is missing.

    • The replication write exceeds the capacity of the replication disk on the appliance.

      Tip

      You can also find the manual resync reasons in the events blade to help you decide whether a manual resync is required.

Known issues in PowerShell automation

  • If you leave $failbackPolicyName and $failbackExtensionName empty or null, the re-protect can fail. See the following examples:

    Sample screenshot of a VM failed to perform operation error.

    Sample screenshot of second operation error on a different VM.

  • Always specify the $failbackPolicyName and $failbackExtensionName, as shown in the following example:

    $failbackPolicyName = "failback-default-replication-policy"
    $failbackExtensionName = "default-failback-extension"
    $parameters = @{
        "properties" = @{
            "customProperties" = @{
                "instanceType" = "AzStackToAzStackFailback"
                "applianceId" = $applianceId
                "logStorageAccountId" = $LogStorageAccount.Id
                "policyName" = $failbackPolicyName
                "replicationExtensionName" = $failbackExtensionName
            }
        }
    }
    $result = Invoke-AzureRmResourceAction -Action "reprotect" `
        -ResourceId $protectedItemId `
        -Force `
        -Parameters $parameters
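
    The example above assumes the prerequisite variables are already populated. The following sketch illustrates one way to set them up; the resource group, storage account name, and resource IDs are placeholders (not values from this article) that you replace with the values from your own Azure Site Recovery configuration.

    # Sketch: populate the variables used by the re-protect call above.
    # All names and IDs below are placeholders for your own environment.
    $LogStorageAccount = Get-AzureRmStorageAccount -ResourceGroupName "asr-rg" -Name "asrlogstorageaccount"

    # Resource IDs of the appliance and the protected item, copied from your
    # Azure Site Recovery deployment (for example, from the portal).
    $applianceId     = "<appliance-resource-id>"
    $protectedItemId = "<protected-item-resource-id>"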
    

Mobility service agent warning

When replicating multiple VMs, you might see the message Protected item health changed to Warning in the Site Recovery jobs.

Sample screenshot of the Protected item health change warning.

This message is only a warning and doesn't block the actual replication or failover processes.

Tip

You can check the state of the respective VM to ensure it's healthy.
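
If you prefer PowerShell over the portal, a minimal sketch such as the following retrieves the protected item resource and dumps its properties so you can review its current state. It reuses the $protectedItemId value from the PowerShell example above; the exact property names returned depend on the resource provider version, so this is only an illustration.

    # Sketch: inspect the protected item resource to review its health and state.
    # $protectedItemId is the same protected item resource ID used earlier in this article.
    $protectedItem = Get-AzureRmResource -ResourceId $protectedItemId
    # Dump the resource properties as JSON; the health-related fields exposed here
    # depend on the resource provider version.
    $protectedItem.Properties | ConvertTo-Json -Depth 10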

Next steps