Express Route Routing Issues (Azure to On-premises route)
Hi @GitaraniSharma-MSFT - We have implemented the same setup described in this thread: https://learn.microsoft.com/en-us/answers/questions/860533/express-route-and-azure-firewall
We have 2 ExpressRoute Premium circuits (East US & South Central US) with 3 Azure Firewall Premium instances per vNet; 3 ExpressRoute gateways (multi-AZ) (per vNet); and 6 ExpressRoute connections to the 3 ExpressRoute gateways (a DR setup in case of a circuit or region failure).
The identical 10 prefixes are advertised from the on-premises side, without "0.0.0.0/0", and 3 virtual networks are advertised from the Azure side (no hub-spoke approach) to on-premises. Also, outbound internet traffic from Azure has to go through Azure Firewall and not through on-premises.
- Traffic flow from On-Premises to Azure >> Working as expected and passing the traffic through Azure Firewalls as per environment.
- Traffic flow from Azure to On-Premises >> Working intermittently. Example: telnet/psping to an on-premises destination port works on the 1st or 2nd attempt, hangs from the 3rd attempt onward for several attempts, and starts working again around the 10th attempt. We also observed the INVALID flag in the firewall flow trace logs for traffic from the on-premises destination server to the firewall private IP. However, we need the firewall to filter both inbound and outbound traffic on the on-premises and internet sides.
Azure Firewall
Azure Virtual Network
Azure ExpressRoute
-
KapilAnanth-MSFT 41,566 Reputation points • Microsoft Employee
2024-07-01T06:18:38.6266667+00:00 Welcome to Microsoft Q&A Platform. Thank you for reaching out & hope you are doing well.
Wrt, "outbound traffic of internet from Azure has to go through Azure firewall and not to On-premises."
- As long as you are not advertising 0.0.0.0/0 via ExpressRoute and there are UDRs pointing 0.0.0.0/0 to Azure Firewall IP, this should be fine.
Wrt, "Traffic flow from Azure to On-Premises >> Intermittently working"
- If you want Azure Firewall to filter traffic from Azure to on-premises, make sure the UDRs cover the exact ranges of the on-premises network and not just 0.0.0.0/0.
- See: How Azure selects a route
- With only a 0.0.0.0/0 UDR, the longest prefix match algorithm would prefer the more specific on-premises routes learned through the gateway and route on-premises traffic directly to the gateway, bypassing the firewall.
- Can you confirm whether this is how the setup is configured?
- Once this is done:
- You mentioned there are intermittent ICMP/TCP ping failures.
- Do you see the traffic in Azure Firewall during the failures, or not?
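To illustrate the longest-prefix-match point above, here is a minimal Python sketch of the route selection logic. It is a simplified model, not Azure's implementation: the addresses, the next-hop labels, and the `select_route` helper are all hypothetical, and only the two documented rules are modelled (longest prefix wins; on equal prefixes, user-defined routes beat BGP routes, which beat system routes).

```python
import ipaddress

# Tie-break order when several routes share the same prefix:
# user-defined routes first, then BGP routes, then system routes.
SOURCE_PRIORITY = {"user": 0, "bgp": 1, "system": 2}


def select_route(routes, destination):
    """Pick the route for `destination` from (prefix, next_hop, source) tuples."""
    dest = ipaddress.ip_address(destination)
    matches = [r for r in routes if dest in ipaddress.ip_network(r[0])]
    if not matches:
        return None
    # Longest prefix first; on equal prefix length, the better-ranked source wins.
    return max(
        matches,
        key=lambda r: (ipaddress.ip_network(r[0]).prefixlen, -SOURCE_PRIORITY[r[2]]),
    )


# Hypothetical on-premises server and route entries, for illustration only.
ONPREM_SERVER = "192.168.10.5"
default_only = [
    ("0.0.0.0/0", "AzureFirewall", "user"),            # UDR for internet traffic
    ("192.168.0.0/16", "ExpressRouteGateway", "bgp"),  # learned from on-premises
]
with_onprem_udr = default_only + [("192.168.0.0/16", "AzureFirewall", "user")]

print(select_route(default_only, ONPREM_SERVER)[1])     # ExpressRouteGateway
print(select_route(with_onprem_udr, ONPREM_SERVER)[1])  # AzureFirewall
```

In this model, the 0.0.0.0/0 UDR alone never catches on-premises traffic, because the /16 learned over BGP is more specific. Adding a UDR with the same /16 prefix sends that traffic to the firewall.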
Cheers,
Kapil
-
Jaykishan Bairagi 0 Reputation points
2024-07-01T12:07:59.9866667+00:00 Hi Kapil,
Thanks for quick reply.
Wrt, "outbound traffic of internet from Azure has to go through Azure firewall and not to On-premises." ->
All workload subnets >> Single Route Table (Outbound) >> 0.0.0.0/0 UDR + All on-premises network ranges >> Next Hop >> Virtual appliance >> Azure FW Private IP -> Does this work?
Wrt, "Traffic flow from Azure to On-Premises >> Intermittently working"
The above route table is a single route table containing all UDRs, for both internet outbound from Azure and outbound to ExpressRoute through Azure Firewall.
- Yes, we can see the outbound traffic to ExpressRoute listed in Azure Firewall. However, the on-premises server I'm telnetting to is sending an INVALID flag to the Azure Firewall subnet. This is the same for all on-premises destinations. Inbound traffic to Azure from on-premises has no issues and is logged in Azure Firewall.
- I tried removing 0.0.0.0/0 from the route table and keeping only the on-premises destination range DESTINATIONIP/16 >> same issue
- I also added just the /32 of the destination IP to the route table >> same issue
- Output of psping run from one of the Windows virtual machines to the destination (on-premises server):
Connecting to DESTINATIONIP:16722 (warmup): from 0.0.0.0:58849: This operation returned because the timeout period expired.
Connecting to DESTINATIONIP:16722: Sent = 0, Received = 0, Lost = 0 (0% loss), Minimum = 0.00ms, Maximum = 0.00ms, Average = 0.00ms
-
KapilAnanth-MSFT 41,566 Reputation points • Microsoft Employee
2024-07-01T13:26:02.4366667+00:00 All workload subnets >> Single Route Table (Outbound) >> 0.0.0.0/0 UDR + All on-premises network ranges >> Next Hop >> Virtual appliance >> Azure FW Private IP -> Does this work?
- This is correct
- This should send internet traffic directly to Internet from Firewall
- OnPrem traffic will be sent to OnPrem via Firewall
P.S: Make sure there is a UDR at the GatewaySubnet also, pointing Azure traffic to the firewall.
Wrt "Yes, we can see the outbound traffic to express route is listed in the Azure Firewall. However, the on-premises where I'm telnetting to is sending an INVALID flag to Azure Firewall subnet"
- Then this looks like a configuration issue on the on-premises server(s) only.
- Did you check the local logs to see why the on-premises server would reply with an "INVALID flag"?
- Is the packet from Azure to on-premises reaching the on-premises server (PC), or only the on-premises edge device (router/firewall)?
The only reason I can think of is suboptimal routing between virtual networks.
Make sure the ExpressRoute gateway uses the ExpressRoute circuit in the same region by assigning a higher weight to the local connection than to the remote one.
Cheers,
Kapil
-
Jaykishan Bairagi 0 Reputation points
2024-07-01T14:22:30.38+00:00 Hi @KapilAnanth-MSFT -
#1. P.S: Make sure there is a UDR at the GatewaySubnet also, pointing Azure traffic to the firewall. ->
On the Inbound Routetable [Propagate gateway routes = YES] >> Express route gateway subnet (attached) >> UDR - Our vNet address destined from on-premises >> Next Hop >> Virtual Appliance >> Azure Firewall private IP.
So, that's the current route table. I have now tried adding a UDR for the one on-premises range where the destination server is, and I still receive: "Connecting to DESTINATIONIP:16722 (warmup): from 0.0.0.0:58849: This operation returned because the timeout period expired. Connecting to DESTINATIONIP:16722: Sent = 0, Received = 0, Lost = 0 (0% loss), Minimum = 0.00ms, Maximum = 0.00ms, Average = 0.00ms"
-
Jaykishan Bairagi 0 Reputation points
2024-07-01T14:40:58.1+00:00 P.S: Make sure there is a UDR at the GatewaySubnet also, pointing Azure traffic to the firewall. >> Yes, I've a UDR like below for all express route inbound traffic -
inbound route table [Propagate gateway routes = YES] >> UDR - vNet range /23 >> next hop >> virtual appliance >> azure firewall private IP. >> working from on-premises to connect to Azure
Also, tried this just now -
inbound route table [Propagate gateway routes = NO] >> UDR - vNet range /23 >> next hop >> virtual appliance >> azure firewall private IP. >> not working both directions.
Also, tried this just now -
inbound route table [Propagate gateway routes = YES] >> UDR - vNet range /23 + on-premises destination server network range >> next hop >> virtual appliance >> azure firewall private IP. >> not working both directions.
I noticed that if I remove the "Express route virtual network gateway" subnet from the inbound route table, I can successfully psping from Azure to on-premises without any issues, but on-premises can't reach the Azure servers. It looks like only one direction works at a time.
Make sure the ExpressRoute Gateway uses the ExpressRoute Circuit in the same region by assigning a higher weight to the local connection than to the remote >> Yes, we already have the weights assigned to local connections to "100" as a local regional circuit preference and the fallback connections to other region circuits as "0"
I've gone through an article that makes me suspect asymmetric routing. I'm not sure how to fix it, but the scenario matches our problem here.
From the article:
"For Instance, let’s say in the below example, SourceIP 10.10.0.68 is trying to connect to the Destination IP 10.10.0.132 on destination Port RDP 3389 with a random source port 51369. We have set up an asymmetric routing scenario where we made sure that this traffic is not routed through the firewall, however, the return traffic is configured to go through the firewall that is from 10.10.0.132 to 10.10.0.68 with source port 3389 and destination port 51369. In this case, by using the Flow Trace Logs, we can clearly see if there is an invalid packet that is not recognized by the Firewall that could be causing the connection issues. "
-
KapilAnanth-MSFT 41,566 Reputation points • Microsoft Employee
2024-07-02T15:34:39.1533333+00:00 The first UDR for Inbound (on GatewaySubnet) is the correct approach.
Wrt "Invalid Flag",
- Can you confirm whether you saw the initial SYN flag from the firewall in Network Rules?
- SYN packets aren't logged by default in Flow Trace logs.
- Or did you never see the SYN packet for Azure to on-premises, and only see the INVALID flag for the return traffic (on-premises to Azure IP/port)?
- This should tell us which direction is causing the issue
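To make the reasoning behind these questions concrete, here is a toy Python model of stateful connection tracking. It is only a sketch under simplifying assumptions and not how Azure Firewall is implemented: the `StatefulFirewall` class and its verdict strings are hypothetical, and the addresses and ports are the ones from the article excerpt quoted earlier in this thread. A reply is accepted only if the firewall previously saw the SYN of the same flow; otherwise it is flagged INVALID.

```python
class StatefulFirewall:
    """Toy connection tracker: remembers flows for which it has seen a SYN."""

    def __init__(self):
        self.flows = set()

    def inspect(self, src, sport, dst, dport, flags):
        key = (src, sport, dst, dport)
        reverse = (dst, dport, src, sport)
        if flags == "SYN":
            # A new connection attempt creates state for this flow.
            self.flows.add(key)
            return "ALLOW"
        if key in self.flows or reverse in self.flows:
            # The packet belongs to a flow whose SYN passed through this firewall.
            return "ALLOW"
        # No state for this flow: the firewall never saw the opening SYN.
        return "INVALID"


# Asymmetric path: the SYN bypasses the firewall, only the reply crosses it.
asymmetric = StatefulFirewall()
print(asymmetric.inspect("10.10.0.132", 3389, "10.10.0.68", 51369, "SYN-ACK"))  # INVALID

# Symmetric path: both directions cross the same firewall instance.
symmetric = StatefulFirewall()
print(symmetric.inspect("10.10.0.68", 51369, "10.10.0.132", 3389, "SYN"))       # ALLOW
print(symmetric.inspect("10.10.0.132", 3389, "10.10.0.68", 51369, "SYN-ACK"))   # ALLOW
```

In this model, an INVALID verdict on return traffic means the forward SYN took a path that did not cross the same firewall, which is why it matters whether the SYN shows up in the logs.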
Cheers,
Kapil
-
Jaykishan Bairagi 0 Reputation points
2024-07-02T20:55:27.75+00:00 Yes, I saw the SYN flag in the "Network rules" table. We have a rule that allows the traffic on the port and destination IP range, and in the "Flow trace logs" results I see "SYN-ACK" and "FIN" exchanged between the Azure and destination servers. However, when telnet hangs or times out, I see the on-premises destination server sending the INVALID flag to the Azure Firewall private IP range.
When we ran tcpdump on the source and destination sides, we noticed that both the Azure and destination servers are retransmitting and sending "RST".
On the on-premises side, no AS-path prepending is used; we verified that the forward and return paths over the South Central US and East US ER circuits are the same.
-
KapilAnanth-MSFT 41,566 Reputation points • Microsoft Employee
2024-07-03T06:03:35.1433333+00:00 I am trying to determine whether the "INVALID" appears on response traffic.
To be precise,
- Take a single SYN packet from the firewall logs and note its source port.
- Do you see the "INVALID" on a reply packet from on-premises addressed to that same port (now the destination port)?
Yes, if there are retransmissions, then there is a chance that you will see "INVALID" flows.
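The port comparison described above could be scripted roughly as follows once the log entries are exported. This is only a sketch: the record layout (`src_ip`, `src_port`, `dst_ip`, `dst_port`, `flag`), the `find_invalid_replies` helper, and the sample addresses are all hypothetical and do not reflect the actual Azure Firewall log schema, so the field names would need to be mapped to whatever the exported logs contain.

```python
def find_invalid_replies(syn_records, flow_trace_records):
    """Return INVALID flow-trace entries that are the reverse of a logged SYN."""
    # Index each SYN by its 4-tuple so the reverse lookup is a set membership test.
    syn_keys = {
        (r["src_ip"], r["src_port"], r["dst_ip"], r["dst_port"]) for r in syn_records
    }
    hits = []
    for r in flow_trace_records:
        if r["flag"] != "INVALID":
            continue
        # A reply swaps source and destination relative to the original SYN.
        reversed_key = (r["dst_ip"], r["dst_port"], r["src_ip"], r["src_port"])
        if reversed_key in syn_keys:
            hits.append(r)
    return hits


# Hypothetical sample data: an Azure VM (10.1.0.4) connecting to an on-premises server.
syns = [
    {"src_ip": "10.1.0.4", "src_port": 58849, "dst_ip": "192.168.10.5", "dst_port": 16722},
]
flow_trace = [
    {"src_ip": "192.168.10.5", "src_port": 16722, "dst_ip": "10.1.0.4",
     "dst_port": 58849, "flag": "INVALID"},
    {"src_ip": "192.168.10.5", "src_port": 16722, "dst_ip": "10.1.0.4",
     "dst_port": 60001, "flag": "INVALID"},
]

for entry in find_invalid_replies(syns, flow_trace):
    print(entry["dst_port"])  # 58849
```

An INVALID entry that pairs with a logged SYN points at the return direction. An INVALID entry with no matching SYN suggests the forward packet never crossed this firewall.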
In case the above does not help,
- To troubleshoot further, we will need a dedicated 1:1 session where a support engineer can use screen sharing to pinpoint the issue.
- If you have a support plan, you may file a support ticket; otherwise, please let us know and we will try to help you get one-time free technical support.
Cheers,
Kapil