Azure Stack Edge 2403 release notes
APPLIES TO: Azure Stack Edge Pro - GPU, Azure Stack Edge Pro 2, Azure Stack Edge Pro R, Azure Stack Edge Mini R
The following release notes identify critical open issues and resolved issues for the 2403 release for your Azure Stack Edge devices. Features and issues that correspond to a specific model of Azure Stack Edge are called out wherever applicable.
The release notes are continuously updated; as critical issues requiring a workaround are discovered, they're added. Before you deploy your device, carefully review the information in the release notes.
This article applies to the Azure Stack Edge 2403 release, which maps to software version 3.2.2642.2487.
Warning
In this release, you must update the packet core version to AP5GC 2308 before you update to Azure Stack Edge 2403. For detailed steps, see Azure Private 5G Core 2308 release notes. If you update to Azure Stack Edge 2403 before updating to Packet Core 2308.0.1, you will experience a total system outage. In this case, you must delete and re-create the Azure Kubernetes service cluster on your Azure Stack Edge device. Each time you change the Kubernetes workload profile, you are prompted for the Kubernetes update. Go ahead and apply the update.
Supported update paths
To apply the 2403 update, your device must be running version 2303 or later.
If you aren't running the minimum required version, you see this error:
Update package can't be installed as its dependencies aren't met.
You can update to 2303 from 2207 or later, and then update to 2403.
You can update to the latest version using the following update paths (a sketch for checking your current version follows the table):
Current version of Azure Stack Edge software and Kubernetes | Update to Azure Stack Edge software and Kubernetes | Desired update to 2403 |
---|---|---|
2207 | 2303 | 2403 |
2209 | 2303 | 2403 |
2210 | 2303 | 2403 |
2301 | 2303 | 2403 |
2303 | Not needed; update directly | 2403 |
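Before you apply the update, you can confirm the installed software version from the device. The following is a minimal sketch, assuming you're already connected to the PowerShell interface of the device; Get-HcsApplianceInfo is the cmdlet that reports device details, including the friendly software version.

```powershell
# Run from the PowerShell interface of the Azure Stack Edge device.
# Get-HcsApplianceInfo returns device details, including the installed software version.
$info = Get-HcsApplianceInfo

# FriendlySoftwareVersion maps to a release, for example "Azure Stack Edge 2303".
Write-Output $info.FriendlySoftwareVersion

# If the reported release is older than 2303, update to 2303 first, and then to 2403.
```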
What's new
The 2403 release has the following new features and enhancements:
- Deprecated support for Azure Kubernetes service telemetry on Azure Stack Edge.
- Zone-label support for two-node Kubernetes clusters (see the sketch after this list).
- Hyper-V VM management and memory usage monitoring on the Azure Stack Edge host.
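For the zone-label feature, a generic way to inspect node labels is kubectl. A minimal sketch follows, assuming you have kubectl access to the two-node cluster; the well-known topology.kubernetes.io/zone key is an assumption here, since the release notes don't name the label.

```powershell
# List the nodes of the two-node cluster along with all of their labels.
kubectl get nodes --show-labels

# Show the zone label as its own column (label key is assumed, not confirmed by the release notes).
kubectl get nodes -L topology.kubernetes.io/zone
```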
Issues fixed in this release
No. | Feature | Issue |
---|---|---|
1. | Clustering | A two-node cold boot of the server caused high availability VM cluster resources to come up as offline. Changed ColdStartSetting to AlwaysStart. |
2. | Marketplace image support | Fixed a bug affecting Windows Marketplace image support on Azure Stack Edge A and TMA. |
3. | Network connectivity | Fixed VM NIC link flapping after an Azure Stack Edge host power off/on, which could cause a VM to lose its DHCP IP address. |
4. | Network connectivity | Due to proxy ARP configurations in some customer environments, the IP-address-in-use check returned a false positive even though no endpoint on the network was using the IP. The fix skips the ARP-based check when the IP address is allocated from an internal network managed by Azure Stack Edge. |
5. | Network connectivity | A VM NIC change operation timed out after 3 hours, blocking other VM update operations. On Microsoft Kubernetes clusters, pods that depend on Persistent Volumes (PVs) got stuck. The issue occurred when multiple NICs within a VM were moved from a VLAN virtual network to a non-VLAN virtual network. After the fix, the VM NIC change operation times out quickly and VM updates aren't blocked. |
6. | Kubernetes | Overall two-node Kubernetes resiliency improvements: increased control plane memory for the AKS workload cluster, increased limits for etcd, multi-replica and hard anti-affinity support for CoreDNS and Azure Disk CSI controller pods, and improved VM failover times. |
7. | Compute Diagnostic and Update | Resiliency fixes. |
8. | Security | STIG security fixes for Mariner Guest OS for Azure Kubernetes service on Azure Stack Edge. |
9. | VM operations | On an Azure Stack Edge cluster that deploys an AP5GC workload, after a host power cycle test, AzSHostAgent would crash when the host returned a transient error about CPU group configuration, causing VM operations to fail. The fix makes AzSHostAgent resilient to transient CPU group errors. |
Known issues in this release
No. | Feature | Issue | Workaround/comments |
---|---|---|---|
1. | Azure Storage Explorer | The Blob storage endpoint certificate that's autogenerated by the Azure Stack Edge device might not work properly with Azure Storage Explorer. | Replace the Blob storage endpoint certificate. For detailed steps, see Bring your own certificates. |
2. | Network connectivity | On a two-node Azure Stack Edge Pro 2 cluster with a teamed virtual switch for Port 1 and Port 2, if the Port 1 or Port 2 link goes down, it can take up to 5 seconds to resume network connectivity on the remaining active port. If a Kubernetes cluster uses this teamed virtual switch for management traffic, pod communication might be disrupted for up to 5 seconds. | |
3. | Virtual machine | After the host or Kubernetes node pool VM is shut down, there's a chance that kubelet in the node pool VM fails to start because of a CPU static policy error. The node pool VM shows a Not ready status, and pods aren't scheduled on this VM. | Enter a support session, SSH into the node pool VM, and then follow the steps in Changing the CPU Manager Policy to remediate the kubelet service (a minimal sketch follows this table). |
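For known issue 3, the generic remediation described in the Kubernetes documentation for a CPU Manager policy error is to stop kubelet, remove its CPU manager state file, and restart kubelet. The sketch below illustrates that flow from a support session; the node address, user, file path, and service name are assumptions, so treat the linked Changing the CPU Manager Policy steps as authoritative.

```powershell
# From a support session, SSH into the affected node pool VM.
# The address and user below are placeholders; substitute your own values.
# The remote commands follow the generic Kubernetes CPU Manager remediation:
# stop kubelet, remove its state file, and restart kubelet.
ssh clusteruser@<node-pool-vm-ip> "sudo systemctl stop kubelet; sudo rm -f /var/lib/kubelet/cpu_manager_state; sudo systemctl start kubelet"

# Afterwards, confirm the node returns to Ready and pods are scheduled again.
kubectl get nodes
```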
Known issues from previous releases
The following table provides a summary of known issues carried over from the previous releases.
No. | Feature | Issue | Workaround/comments |
---|---|---|---|
1. | Azure Stack Edge Pro + Azure SQL | Creating a SQL database requires Administrator access. | Do the following steps instead of Steps 1-2 in Create-the-sql-database. 1. In the local UI of your device, enable the compute interface. Select Compute > Port # > Enable for compute > Apply. 2. Download sqlcmd on your client machine from the SQL command utility. 3. Connect to your compute interface IP address (the port that was enabled), adding a ",1401" to the end of the address. 4. The final command looks like this: sqlcmd -S {Interface IP},1401 -U SA -P "Strong!Passw0rd". After this, steps 3-4 from the current documentation should be identical. |
2. | Refresh | Incremental changes to blobs restored via Refresh aren't supported. | For Blob endpoints, partial updates of blobs after a Refresh might result in the updates not getting uploaded to the cloud. For example, a sequence of actions such as: 1. Create a blob in the cloud, or delete a previously uploaded blob from the device. 2. Refresh the blob from the cloud into the appliance by using the refresh functionality. 3. Update only a portion of the blob by using Azure SDK REST APIs. These actions can result in the updated sections of the blob not being updated in the cloud. Workaround: Use tools such as robocopy, or a regular file copy through Explorer or the command line, to replace entire blobs. |
3. | Throttling | During throttling, if new writes to the device aren't allowed, writes by the NFS client fail with a "Permission Denied" error. | The error resembles: hcsuser@ubuntu-vm:~/nfstest$ mkdir test mkdir: can't create directory 'test': Permission denied |
4. | Blob Storage ingestion | When using AzCopy version 10 for Blob storage ingestion, run AzCopy with the following argument: azcopy <other arguments> --cap-mbps 2000 | If this limit isn't provided to AzCopy, it could potentially send a large number of requests to the device, resulting in issues with the service (a usage sketch follows this table). |
5. | Tiered storage accounts | The following apply when using tiered storage accounts: - Only block blobs are supported. Page blobs aren't supported. - There's no snapshot or copy API support. - Hadoop workload ingestion through distcp isn't supported as it uses the copy operation heavily. | |
6. | NFS share connection | If multiple processes are copying to the same share, and the nolock attribute isn't used, you might see errors during the copy. | The nolock attribute must be passed to the mount command to copy files to the NFS share. For example: C:\Users\aseuser mount -o anon \\10.1.1.211\mnt\vms Z:. |
7. | Kubernetes cluster | When you apply an update on a device that's running a Kubernetes cluster, the Kubernetes virtual machines restart. In this instance, only pods that are deployed with replicas specified are automatically restored after an update. | If you created individual pods outside a replication controller without specifying a replica set, these pods aren't restored automatically after the device update. You must restore these pods. A replica set replaces pods that are deleted or terminated for any reason, such as node failure or a disruptive node upgrade. For this reason, we recommend that you use a replica set even if your application requires only a single pod. |
8. | Kubernetes cluster | Kubernetes on Azure Stack Edge Pro is supported only with Helm v3 or later. For more information, go to Frequently asked questions: Removal of Tiller. | |
9. | Kubernetes | Port 31000 is reserved for the Kubernetes Dashboard and port 31001 is reserved for the Edge container registry. Similarly, in the default configuration, the IP addresses 172.28.0.1 and 172.28.0.10 are reserved for the Kubernetes service and the Core DNS service respectively. | Don't use reserved IPs. |
10. | Kubernetes | Kubernetes doesn't currently allow multi-protocol LoadBalancer services. For example, a DNS service that would have to listen on both TCP and UDP. | To work around this limitation of Kubernetes with MetalLB, two services (one for TCP, one for UDP) can be created on the same pod selector. These services use the same sharing key and spec.loadBalancerIP to share the same IP address. IPs can also be shared if you have more services than available IP addresses. For more information, see IP address sharing. |
11. | Kubernetes cluster | Existing Azure IoT Edge marketplace modules might require modifications to run on IoT Edge on Azure Stack Edge device. | For more information, see Run existing IoT Edge modules from Azure Stack Edge Pro FPGA devices on Azure Stack Edge Pro GPU device. |
12. | Kubernetes | File-based bind mounts aren't supported with Azure IoT Edge on Kubernetes on Azure Stack Edge device. | IoT Edge uses a translation layer to translate ContainerCreate options to Kubernetes constructs. Creating Binds maps to hostpath directory and thus file-based bind mounts can't be bound to paths in IoT Edge containers. If possible, map the parent directory. |
13. | Kubernetes | If you bring your own certificates for IoT Edge and add those certificates on your Azure Stack Edge device after the compute is configured on the device, the new certificates aren't picked up. | To work around this problem, upload the certificates before you configure compute on the device. If compute is already configured, connect to the PowerShell interface of the device and run the IoT Edge commands. Restart the iotedged and edgehub pods. |
14. | Certificates | In certain instances, certificate state in the local UI might take several seconds to update. | The following scenarios in the local UI might be affected. - Status column in Certificates page. - Security tile in Get started page. - Configuration tile in Overview page. |
15. | Certificates | Alerts related to signing chain certificates aren't removed from the portal even after uploading new signing chain certificates. | |
16. | Web proxy | NTLM authentication-based web proxy isn't supported. | |
17. | Internet Explorer | If enhanced security features are enabled, you might not be able to access local web UI pages. | Disable enhanced security, and restart your browser. |
18. | Kubernetes | Kubernetes doesn't support ":" in environment variable names that are used by .NET applications. This is also required for the Event Grid IoT Edge module to function on the Azure Stack Edge device and other applications. For more information, see the ASP.NET core documentation. | Replace ":" with a double underscore. For more information, see this Kubernetes issue. |
19. | Azure Arc + Kubernetes cluster | By default, when resource yamls are deleted from the Git repository, the corresponding resources aren't deleted from the Kubernetes cluster. | To allow the deletion of resources when they're deleted from the git repository, set --sync-garbage-collection in Arc OperatorParams. For more information, see Delete a configuration. |
20. | NFS | Applications that use NFS share mounts on your device to write data should use Exclusive write. That ensures the writes are written to the disk. | |
21. | Compute configuration | Compute configuration fails in network configurations where gateways or switches or routers respond to Address Resolution Protocol (ARP) requests for systems that don't exist on the network. | |
22. | Compute and Kubernetes | If Kubernetes is set up first on your device, it claims all the available GPUs. Hence, it isn't possible to create Azure Resource Manager VMs using GPUs after Kubernetes is set up. | If your device has two GPUs, you can create one VM that uses a GPU and then configure Kubernetes. In this case, Kubernetes uses the one remaining GPU. |
23. | Custom script VM extension | There's a known issue in Windows VMs that were created in an earlier release if the device was updated to 2103. If you add a custom script extension on these VMs, the Windows VM Guest Agent (version 2.7.41491.901 only) gets stuck in the update, causing the extension deployment to time out. | To work around this issue: 1. Connect to the Windows VM using remote desktop protocol (RDP). 2. Make sure that waappagent.exe is running on the machine: Get-Process WaAppAgent. 3. If waappagent.exe isn't running, restart the rdagent service: Get-Service RdAgent | Restart-Service. Wait for 5 minutes. 4. While waappagent.exe is running, kill the WindowsAzureGuest.exe process. 5. After you kill the process, the process starts running again with the newer version. 6. Verify that the Windows VM Guest Agent version is 2.7.41491.971 using this command: Get-Process WindowsAzureGuestAgent | fl ProductVersion. 7. Set up the custom script extension on the Windows VM. A consolidated sketch follows this table. |
24. | Multi-Process Service (MPS) | When the device software and the Kubernetes cluster are updated, the MPS setting isn't retained for the workloads. | Re-enable MPS and redeploy the workloads that were using MPS. |
25. | Wi-Fi | Wi-Fi doesn't work on Azure Stack Edge Pro 2 in this release. | |
26. | Azure IoT Edge | The managed Azure IoT Edge solution on Azure Stack Edge runs on an older, obsolete IoT Edge runtime that's at end of life. For more information, see IoT Edge v1.1 EoL: What does that mean for me?. Although the solution doesn't stop working past end of life, there are no plans to update it. | To run the latest version of Azure IoT Edge LTS with the latest updates and features on your Azure Stack Edge, we recommend that you deploy a customer self-managed IoT Edge solution that runs on a Linux VM. For more information, see Move workloads from managed IoT Edge on Azure Stack Edge to an IoT Edge solution on a Linux VM. |
27. | AKS on Azure Stack Edge | In this release, you can't modify the virtual networks once the AKS cluster is deployed on your Azure Stack Edge cluster. | To modify the virtual network, you must delete the AKS cluster, modify the virtual networks, and then re-create the AKS cluster on your Azure Stack Edge. |
28. | AKS Update | The AKS Kubernetes update might fail if one of the AKS VMs isn't running. This issue might be seen in the two-node cluster. | If the AKS update fails, connect to the PowerShell interface of the device. Check the state of the Kubernetes VMs by running the Get-VM cmdlet. If a VM is off, run the Start-VM cmdlet to restart the VM. Once the Kubernetes VM is running, reapply the update. |
29. | Wi-Fi | Wi-Fi functionality for Azure Stack Edge Mini R is deprecated. |
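For known issue 4 (Blob Storage ingestion), a complete AzCopy invocation with the bandwidth cap looks like the following sketch. The source path, storage account, device DNS name, container, and SAS token are all placeholders; only the --cap-mbps value comes from the table above.

```powershell
# Cap AzCopy's bandwidth so it doesn't flood the device with requests.
# Every value below except --cap-mbps 2000 is a placeholder.
azcopy copy "C:\data\*" "https://<storage-account>.blob.<device-name>.<dns-domain>/<container>?<SAS-token>" --recursive --cap-mbps 2000
```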
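For known issue 23 (custom script VM extension), the documented recovery steps can be consolidated into a single script run inside the affected Windows VM over RDP. This sketch only uses the cmdlets and process names given in the table; the wildcard on the kill step is there because the table refers to both WindowsAzureGuest.exe and WindowsAzureGuestAgent.

```powershell
# Run inside the affected Windows VM (connect over RDP first).

# 1. Check whether the guest agent process is running.
$agent = Get-Process WaAppAgent -ErrorAction SilentlyContinue

# 2. If it isn't running, restart the RdAgent service and wait about 5 minutes.
if (-not $agent) {
    Get-Service RdAgent | Restart-Service
    Start-Sleep -Seconds 300
}

# 3. With WaAppAgent running, kill the guest agent process so it relaunches at the newer version.
#    (The wildcard covers both process names used in the table.)
Get-Process WindowsAzureGuest* -ErrorAction SilentlyContinue | Stop-Process -Force

# 4. Verify the Windows VM Guest Agent version is 2.7.41491.971, then redeploy the extension.
Get-Process WindowsAzureGuestAgent | Format-List ProductVersion
```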