Hello @Praneeth Maddali,
following your comment, we have not yet been able to resolve the problem.
- We tried to reproduce the routing problems, but we get different results each time.
- In one case, we waited more than 24 hours before removing the old VNet range and syncing the peering, which seems to have triggered the problem.
- Checking effective routes is not possible; app service vnet integration has no NIC to check
- Enable Route All is active
- Re-doing the VNet integration does not solve the problem entirely; after some time, it comes back
The problem is not persistent. In addition, even after removing the old VNet range, syncing the peering, and testing successfully, the problem can still occur for some of the App Service instances.
We are now contacting our Microsoft contact person to get this solved; currently we cannot work with this behaviour.
Do you have any other ideas about what we could check?
Lab "A": testing the situation we already had, to reproduce the error.
The following steps are carried out:
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- A new VNET range is introduced with new subnets, NSGs, UDRs ✅
- Sync the peering (with terraform azapi resource) ✅
- Test to reach the firewall IP in the hub network via kudu with a ping and do a dig ✅
- All app services are moved to the new subnet with the vnet integration ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- Remove the old subnet, UDR association, NSG association and old NSGs ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- Remove the old VNET ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- Sync the peering (with terraform azapi resource) ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ✅ Test OK, was working
- If the same problem occurs with non-reachable destinations: re-create the peering and restart the app services
- Restart app services via portal ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- Scale up app services via portal (just one app service) ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- Switch to the new UDRs ✅
- Remove old UDRs ✅
- Restart app services ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- LabA removed 🔴
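The reachability test that recurs in these steps could be scripted so that every run is recorded the same way. The following is only a minimal sketch, not what was used in the labs: it assumes the target accepts TCP connections on some port (the labs used ping and dig from Kudu), and the function name `tcp_probe`, the host, and the port are placeholders.

```python
import socket
import time


def tcp_probe(host, port, timeout=3.0):
    """Try one TCP connection and return (ok, elapsed_seconds, error_text).

    Hypothetical helper: host and port must be replaced with the
    firewall IP (or a host behind it) and a port that accepts TCP.
    """
    start = time.monotonic()
    try:
        # create_connection raises OSError on refusal, timeout, or no route
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        return False, time.monotonic() - start, str(exc)
```

Running the same probe after every step would give a comparable success flag, duration, and error text for each test instead of a manual ping result.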
Lab "A2": testing the situation we already had, to reproduce the error, but with the sync done via the portal.
The following steps are carried out:
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- A new VNET range is introduced with new subnets, NSGs, UDRs ✅
- Sync the peering via portal ✅
- Test to reach the firewall IP in the hub network via kudu with a ping and do a dig ✅
- All app services are moved to the new subnet with the vnet integration ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- Remove the old subnet, UDR association, NSG association and old NSGs ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- Remove the old VNET ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- Sync the peering via portal ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ✅ Test was fine
- If the same problem occurs with non-reachable destinations: re-create the peering and restart the app services
- LabA2 removed 🔴
Lab "B": testing a 24-hour wait before removing the old VNet range.
The following steps are carried out:
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- A new VNET range is introduced with new subnets, NSGs, UDRs ✅
- Sync the peering (with terraform azapi resource) ✅
- Test to reach the firewall IP in the hub network via kudu with a ping and do a dig ✅
- All app services are moved to the new subnet with the vnet integration ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- Wait 24 hours ✅
- Remove the old subnets and NSGs ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- Remove the old vnet range ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- Sync the peering ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ❌ Problem occurred, ping to the firewall not possible
- Workaround to resolve the issue: re-introduce the old address space and re-sync ✅
- Remove the old VNet range and sync the peering again ❌ Some App Service workers have a problem
- Waited over a weekend: Other workers are now affected ❌
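Since the failure in this lab appeared right after the old range was removed and the peering was synced, it may help to compare the address prefixes the VNet currently has with the prefixes the peering still reports on the hub side. A minimal sketch, assuming both lists have already been read from the respective resources; the function name and the example ranges are hypothetical.

```python
import ipaddress


def compare_address_spaces(vnet_prefixes, peering_prefixes):
    """Compare a VNet's address prefixes with those a peering reports.

    Returns a dict with:
      "stale":   prefixes the peering still lists but the VNet no longer has
      "missing": prefixes the VNet has but the peering does not list yet
    """
    vnet = {ipaddress.ip_network(p) for p in vnet_prefixes}
    peer = {ipaddress.ip_network(p) for p in peering_prefixes}
    return {
        "stale": sorted(str(n) for n in peer - vnet),
        "missing": sorted(str(n) for n in vnet - peer),
    }


# Hypothetical example: old range 10.1.0.0/16 removed from the VNet,
# but the peering has not been synced yet.
result = compare_address_spaces(
    ["10.2.0.0/16"],
    ["10.1.0.0/16", "10.2.0.0/16"],
)
print(result)
```

An empty result for both keys would indicate that the peering and the VNet agree on the address space; anything else points to a sync that has not taken effect.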
Lab "C": testing re-creation of the peering from scratch.
The following steps are carried out:
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- A new VNET range is introduced with new subnets, NSGs, UDRs ✅
- Remove the peering ✅
- Test to reach the firewall IP in the hub network via kudu with a ping and do a dig, should not work ✅
- All app services are moved to the new subnet with the vnet integration ✅
- Test to reach the firewall IP in the hub network via kudu with a ping and do a dig, should not work ✅
- Remove the old VNET, old subnet, UDR association, NSG association and old NSGs ✅
- Create the peering ✅
- Test to reach the firewall IP in the hub network via kudu with a ping ✅
- If the same problem occurs with non-reachable destinations, wait 24 hours
- LabC removed 🔴
Lab "Q": testing the rollback to the old VNet range and checking the firewall logs.
The following steps are carried out:
- Add firewall logs ✅
- A new VNET range is introduced with new subnets, NSGs, UDRs ✅
- All app services are moved to the new subnet with the vnet integration ✅
- Test to reach the firewall IP in the hub network via kudu with a ping and do a dig ❌
- Sync peering ✅
- Test to reach the firewall IP in the hub network via kudu with a ping and do a dig ✅
- Remove the old VNET, old subnet, UDR association, NSG association and old NSGs ✅
- Sync peering ✅
- Test to reach the firewall IP in the hub network via kudu with a ping
- Should be problematic for some instances ✴️ No error
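Because in the labs only some workers were affected and the affected set changed over time, probe results could be grouped per instance. A rough sketch, assuming each result is recorded as a pair of instance identifier and success flag (on App Service the identifier could, for example, be taken from the `WEBSITE_INSTANCE_ID` environment variable); the function names are hypothetical.

```python
from collections import defaultdict


def summarize_by_instance(records):
    """Count successful and failed probes per instance.

    records: iterable of (instance_id, ok) pairs, where ok is a bool.
    """
    totals = defaultdict(lambda: {"ok": 0, "failed": 0})
    for instance_id, ok in records:
        totals[instance_id]["ok" if ok else "failed"] += 1
    return dict(totals)


def affected_instances(records):
    """Return the sorted IDs of instances with at least one failed probe."""
    summary = summarize_by_instance(records)
    return sorted(i for i, counts in summary.items() if counts["failed"] > 0)
```

Collecting such records over several days would show whether the failures stay on the same workers or move to others, as observed after the weekend in Lab "B".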