Hello @Kopl,
We are glad that you were able to achieve more than the required performance improvement using App GW.
Note that the HTTP application routing add-on is not recommended for production workloads:
https://learn.microsoft.com/en-us/azure/aks/http-application-routing
When you use an AGIC-enabled AKS cluster with App GW, a request takes fewer hops to reach the application pod than it does with the HTTP application routing add-on.
More details here: https://learn.microsoft.com/en-us/samples/azure/azure-quickstart-templates/aks-application-gateway-ingress-controller/
Brownfield deployment details for AGIC: https://azure.github.io/application-gateway-kubernetes-ingress/setup/install-existing/
One additional best practice to consider is to make use of Ephemeral OS disks: https://learn.microsoft.com/en-us/azure/aks/cluster-configuration#ephemeral-os
Some additional background on ephemeral OS disks in AKS:
With ephemeral OS disks, you get lower read/write latency on the OS disk used by the AKS agent nodes, since the disk is locally attached, and you also get faster cluster operations such as scale or upgrade. Ephemeral OS disks are free; you incur no storage cost for OS disks, because data written to the OS disk is stored on local VM storage and isn't persisted to Azure Storage (with a normal managed OS disk, the corresponding OS disk is stored in a storage account behind the scenes).
That said, not all VM sizes support ephemeral OS disks, because every VM size has its own configuration.
Please read the Ephemeral OS section of the documentation for the details: https://learn.microsoft.com/en-us/azure/aks/cluster-configuration#ephemeral-os
When using ephemeral OS, the OS disk must fit in the VM cache. The sizes for VM cache are available in the Azure documentation (https://learn.microsoft.com/en-us/azure/virtual-machines/dv3-dsv3-series) in parentheses next to IO throughput ("cache size in GiB").
Using the AKS default VM size Standard_DS2_v2 with the default OS disk size of 100GB as an example, this VM size supports ephemeral OS but only has 86GB of cache. This configuration would default to managed disks if the user does not specify the OS disk type explicitly. If a user explicitly requested ephemeral OS, they would receive a validation error.
If a user requests the same Standard_DS2_v2 with a 60GB OS disk, this configuration would default to ephemeral OS: the requested size of 60GB is smaller than the maximum cache size of 86GB.
Using Standard_D8s_v3 with 100GB OS disk, this VM size supports ephemeral OS and has 200GB of cache space. If a user does not specify the OS disk type, the node pool would receive ephemeral OS by default.
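The three cache-based cases above can be summarized in a short Python sketch. The function name `resolve_os_disk_type` and its parameters are hypothetical; they only model the rule as described in this answer (the OS disk must fit in the VM cache) and are not an AKS API. Treating a disk that is exactly as large as the cache as fitting is an assumption.

```python
from typing import Optional


def resolve_os_disk_type(cache_gb: int,
                         os_disk_gb: int = 100,
                         requested: Optional[str] = None) -> str:
    """Model of the rule described above; hypothetical helper, not an AKS API.

    cache_gb   -- VM cache size taken from the VM size documentation
    os_disk_gb -- requested OS disk size (100 is the default mentioned above)
    requested  -- "Ephemeral", "Managed", or None when the user does not specify
    """
    # Assumption: a disk exactly as large as the cache still fits.
    fits = os_disk_gb <= cache_gb

    if requested is None:
        # No explicit choice: ephemeral when it fits, otherwise managed.
        return "Ephemeral" if fits else "Managed"

    if requested == "Ephemeral" and not fits:
        # Explicitly asking for ephemeral on a disk that does not fit is rejected.
        raise ValueError(
            f"OS disk of {os_disk_gb} GB does not fit in {cache_gb} GB of cache"
        )
    return requested


if __name__ == "__main__":
    print(resolve_os_disk_type(86))        # Standard_DS2_v2, 100GB disk
    print(resolve_os_disk_type(86, 60))    # Standard_DS2_v2, 60GB disk
    print(resolve_os_disk_type(200, 100))  # Standard_D8s_v3, 100GB disk
    try:
        resolve_os_disk_type(86, 100, requested="Ephemeral")
    except ValueError as err:
        print("validation error:", err)
```

With the numbers from the examples, the first call falls back to managed disks, the second and third default to ephemeral OS, and the explicit request on Standard_DS2_v2 with a 100GB disk raises the validation error.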
The latest generation of VM series does not have a dedicated cache, but only temporary storage. Take the Standard_E2bds_v5 VM size with the default OS disk size of 100 GiB as an example. This VM size supports ephemeral OS disks but only has 75 GiB of temporary storage. This configuration would default to managed OS disks if the user does not specify the OS disk type explicitly. If a user explicitly requested ephemeral OS disks, they would receive a validation error.
If a user requests the same Standard_E2bds_v5 VM size with a 60 GiB OS disk, this configuration would default to ephemeral OS disks: the requested size of 60 GiB is smaller than the maximum temporary storage of 75 GiB.
Using Standard_E4bds_v5 with 100 GiB OS disk, this VM size supports ephemeral OS and has 150 GiB of temporary storage. If a user does not specify the OS disk type, the node pool would receive ephemeral OS by default.
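To compare the examples side by side, here is a minimal Python sketch that records, for each VM size mentioned in this answer, which local storage the ephemeral OS disk has to fit into (cache or temporary storage) and how large it is. The table `EPHEMERAL_CAPACITY` and the function `default_os_disk_type` are hypothetical names used for illustration only; the figures are the ones quoted above, all treated as GiB for simplicity, and should be checked against the current VM size documentation.

```python
# Capacity available for an ephemeral OS disk, per VM size.
# Values are copied from the examples in this answer (assumed to be GiB).
EPHEMERAL_CAPACITY = {
    "Standard_DS2_v2": ("cache", 86),
    "Standard_D8s_v3": ("cache", 200),
    "Standard_E2bds_v5": ("temporary storage", 75),
    "Standard_E4bds_v5": ("temporary storage", 150),
}


def default_os_disk_type(vm_size: str, os_disk_gib: int = 100) -> str:
    """Return the OS disk type a node pool would get when none is specified.

    Hypothetical helper that models the behavior described above.
    """
    _source, capacity_gib = EPHEMERAL_CAPACITY[vm_size]
    # Assumption: a disk exactly as large as the capacity still fits.
    return "Ephemeral" if os_disk_gib <= capacity_gib else "Managed"


def explain(vm_size: str, os_disk_gib: int = 100) -> str:
    """Build a one-line, human-readable summary for a VM size and disk size."""
    source, capacity_gib = EPHEMERAL_CAPACITY[vm_size]
    disk_type = default_os_disk_type(vm_size, os_disk_gib)
    return (f"{vm_size}: {os_disk_gib} GiB OS disk vs {capacity_gib} GiB "
            f"{source} -> defaults to {disk_type}")


if __name__ == "__main__":
    print(explain("Standard_E2bds_v5"))      # 100 GiB disk, 75 GiB temporary storage
    print(explain("Standard_E2bds_v5", 60))  # 60 GiB disk, 75 GiB temporary storage
    print(explain("Standard_E4bds_v5"))      # 100 GiB disk, 150 GiB temporary storage
```

The same check applies to a new VM size: look up its cache size, or its temporary storage if it has no dedicated cache, and compare it with the OS disk size you intend to request.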
I hope the explanation above helps you understand why you receive the error message when trying to add the Bms SKU with ephemeral OS.
Please let us know if you have additional questions; we are happy to help with a couple more examples.
Regards,
Shiva.