Office 365 Connectivity Guidance

Updated 23 May 2019

Whilst our official guidance around Office 365 connectivity can be found at https://aka.ms/tune/ and also here, it's worth discussing in detail what best practice looks like in this space.

An area I spend a great deal of time working on is how best to connect to an Office 365 implementation from a corporate environment. It's an area I speak to customers about on a daily basis, and the wealth of options can make it a highly complex area to get right. There isn't a one-size-fits-all model either: what's right for one customer will be the wrong thing for another, which makes generalizing advice hard, and we need the details to make a correct recommendation.

There is also the issue that (in my view) there isn't enough information out there on Microsoft's global network infrastructure to allow customers to use it to their advantage. This is something we've aimed to redress in Ignite sessions and also here.

The other complexity revolves around the fact that the cloud is a moving goalpost for connectivity. Its very nature means endpoints change regularly, applications may change their connectivity type as improvements are rolled out, new services are added, and so on. As such, we need a network design which abstracts the organization from these changes, making the fluid nature of the cloud invisible to the end user whilst allowing the services to be delivered optimally and enabling the organization to consume service optimizations as they are rolled out.

What we need to do therefore is to drive for some standard connectivity principles, and by doing this we can achieve the 'north star' of optimal, flexible and abstract connectivity to our cloud services.

Our official principles can be found here, but I'll dig into them in more detail in this post.

So, what are these standard approach elements from a Microsoft standpoint?

  1. Differentiate traffic
  2. Egress Connections as close to the user as possible
  3. Optimize Route length and avoid network hairpins
  4. Assess Network Security

It's a simple list, but it's important to know the detail of how to implement each stage to achieve success.

 

  1. Differentiate Traffic

Identify and differentiate Office 365 traffic using Microsoft published endpoints data

 

As part of the improvements Microsoft announced in 2018 to the way the required URLs and IPs for Office 365 are published, we have also changed how the URLs are categorized to help customers deliver best practice in terms of connecting to the required endpoints for the service.

In the old system, URLs were marked "Required" and "Optional" with a very large list of URLs in each category, published via an XML file.

Traditionally Microsoft advised that customers optimize connectivity to Office 365 for the "Required" endpoints. That often means not using traditional proxy methods at the network edge, because of the bottlenecks which commonly occur at this point, often as a result of the load Office 365 puts on these devices. In addition, the work often done at this layer, such as SSL inspection for AV scanning or DLP, is often replicated in the application/service itself when it comes to Office 365 endpoints. Proxying traffic also means that, in most cases, Skype for Business or Teams uses TCP for voice/media calls as opposed to its preferred method of using UDP directly, with a very likely drop in performance and quality. More on the challenges with using traditional proxies for SaaS services like Office 365 can be found in another of my blog posts.

The difficulty with this categorization model for the URLs (Required & Optional) is that applying this advice has become increasingly difficult for customers as the Office 365 service has grown. The number of endpoints in the "Required" list is considerable and, in addition, some do not reside on Microsoft infrastructure (e.g. CDNs), so we do not provide IPs for them. A customer may therefore understandably want to apply SSL inspection on some of these endpoints due to their number and differing nature. It's also hard to correlate which URLs have which security features (such as AV scanning/DLP) applicable to them within the service, and thus don't require them at the network edge. The net result is that key Office 365 traffic often ends up being sent via an unoptimized path and encounters performance issues as a result, or it is a large challenge to optimize the full list of endpoints.

We therefore set about a piece of work to analyze all the URLs for the service and identify those which really require optimization to enable customers to focus on the traffic which really requires special consideration, i.e. those which are either very high volume in terms of bandwidth/transactions/connection count and thus put heavy load on proxies, or those which are particularly latency sensitive and require the direct path with minimal interference, for example Skype/Teams voice & video traffic.

This piece of work showed that if we concentrate on a very small number of endpoints and apply the highest levels of optimization to them, we have an enormous impact on Office 365 performance. Small as it is, this list accounts for 75-90% of Office 365 bandwidth, connection and transaction count. These core endpoints are those which will put massive load on traditional proxy infrastructure and/or require the optimized path that a direct egress best provides.

With this small list in mind, we can now work around the problem with traditional egress models in a much more precise manner, by concentrating on optimizing the URLs which both cause the problem (in terms of load) and also are the biggest victims of it in terms of performance.

We have therefore categorized the URLs into the following three categories to help customers deal with the endpoints in specific ways according to the needs of the endpoint and the business.

 

  • Optimize for a small number of endpoints that require low latency unimpeded connectivity which should bypass proxy servers, network SSL break and inspect devices, and network hairpins.
  • Allow for a larger number of endpoints that benefit from low latency unimpeded connectivity. Although not expected to cause failures, we also recommend bypassing proxy servers, network SSL break and inspect devices, and network hairpins. Good connectivity to these endpoints is required for Office 365 to operate normally.
  • Default for other Office 365 endpoints which can be directed to the default internet egress location for the company WAN.
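As a rough illustration, the three categories can be pictured as a lookup from category name to network treatment. This is only a sketch: the `Bypass`, `Treatment` and `treatment_for` names are hypothetical, not part of any Microsoft tooling, and the values simply restate the bullets above.

```python
from dataclasses import dataclass
from enum import Enum


class Bypass(Enum):
    """How strongly proxy / SSL inspection bypass is advised (hypothetical labels)."""
    EXPECTED = "should bypass"
    RECOMMENDED = "bypass recommended"
    NOT_REQUIRED = "no bypass needed"


@dataclass(frozen=True)
class Treatment:
    bypass: Bypass  # applies to proxies, SSL break and inspect, and hairpins
    egress: str     # where the traffic should leave the corporate network


# Hypothetical policy table restating the three categories.
POLICY = {
    "Optimize": Treatment(Bypass.EXPECTED, "local, close to the user"),
    "Allow": Treatment(Bypass.RECOMMENDED, "local, close to the user"),
    "Default": Treatment(Bypass.NOT_REQUIRED, "default internet egress"),
}


def treatment_for(category: str) -> Treatment:
    """Return the treatment for a category; unknown categories are treated as Default."""
    return POLICY.get(category, POLICY["Default"])
```

Falling back to Default for an unrecognised category is a design assumption here: it keeps unknown traffic on the normal, inspected path rather than opening a bypass by accident.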

 

Optimize Category

 

URLs in the Optimize category all have the following characteristics:

 

  • Are Microsoft owned and managed endpoints hosted on Microsoft infrastructure.
  • Have IPs provided
  • Are routable via ExpressRoute (If the circuit is authorized for this type of traffic).
  • Low rate of change to URLs/IPs compared to the other two categories
  • Expected to remain low in number of URLs
  • Are High volume and/or latency sensitive

 

The Optimize list at the time of publishing is outlined below.

 

Endpoint to Optimize: https://outlook.office365.com
Port/s: TCP 443
Use: This is one of the core URLs Outlook uses to connect to its Exchange Online server and has a high volume of bandwidth usage and connection count. Low network latency is required for online features including: instant search, other mailbox calendars, free/busy lookup, manage rules & alerts, Exchange Online archive, and emails departing the outbox.

Endpoint to Optimize: https://outlook.office.com
Port/s: TCP 443
Use: This is used for Outlook Online web access to connect to its Exchange Online server and is sensitive to network latency. Connectivity is particularly required for large file upload and download with SharePoint Online.

Endpoint to Optimize: https://<tenant>.sharepoint.com
Port/s: TCP 443
Use: This is the primary URL for SharePoint Online and has a high volume of bandwidth usage.

Endpoint to Optimize: https://<tenant>-my.sharepoint.com
Port/s: TCP 443
Use: This is the primary URL for OneDrive for Business and has a high volume of bandwidth usage and possibly a high connection count from the OneDrive for Business Sync tool.

Endpoint to Optimize: Skype for Business Media IPs (no URL)
Port/s: TCP 443 & UDP 3478, 3479, 3480, and 3481
Use: Relay discovery, allocation and real-time traffic (3478), audio (3479), video (3480), and video screen sharing (3481). These are the endpoints used for Skype for Business and Microsoft Teams media traffic (calls, meetings etc). Most endpoints are provided when the Skype for Business client or the Microsoft Teams client establishes a call (and are contained within the required IPs listed for the service). Skype for Business today requires DNS lookup for *.lync.com for this to work. UDP is required for optimal media quality.

  

 

The above (optimize) endpoints will require access to the following IP address ranges:

 

For all customers, the advice is that, for the endpoints noted above, where possible:

  • If proxies are in use, send the above URLs direct in the PAC file and allow a direct path through a firewall/NAT which has an allow list for the IPs and Ports listed.
  • Ensure devices and paths handling this traffic can scale to handle the demand in terms of port requirements and bandwidth.
  • Bypass or whitelist these URLs on network devices which do SSL decryption, DPI or content filtering, or any other interference of the traffic which could delay it.
  • Look at applying the security elements you require for these endpoints within the service, and thus allow the traffic out of the network without hindrance.
  • Egress these endpoints as close as possible to the users so the traffic can hit Microsoft's infrastructure locally.
  • For remote users ensure these endpoints are sent direct via split tunnelling if VPN solutions are used
  • Ensure that DNS resolution for these endpoints is done at the same location as the egress path
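To make the PAC file advice concrete, here is a minimal sketch that generates a PAC file sending the Optimize URLs direct and everything else to a proxy. The tenant name `contoso`, the proxy address `proxy.example.com:8080` and both function names are placeholders of mine, not values from the guidance. The Skype for Business media IPs have no URL, so they are not covered by a PAC file and would need handling at the firewall/NAT instead.

```python
def optimize_hosts(tenant):
    """Hostnames from the Optimize list; `tenant` is your own tenant name."""
    return [
        "outlook.office365.com",
        "outlook.office.com",
        f"{tenant}.sharepoint.com",
        f"{tenant}-my.sharepoint.com",
    ]


def build_pac(direct_hosts, proxy):
    """Return PAC file text: direct_hosts go DIRECT, all other traffic goes to proxy."""
    if not direct_hosts:
        raise ValueError("at least one direct host is needed")
    # Exact hostname comparisons, so other tenants' SharePoint sites are not bypassed.
    checks = " ||\n        ".join(f'host == "{h.lower()}"' for h in direct_hosts)
    return (
        "function FindProxyForURL(url, host) {\n"
        "    host = host.toLowerCase();\n"
        f"    if ({checks}) {{\n"
        '        return "DIRECT";\n'
        "    }\n"
        f'    return "PROXY {proxy}";\n'
        "}\n"
    )


if __name__ == "__main__":
    print(build_pac(optimize_hosts("contoso"), "proxy.example.com:8080"))
```

The direct path still needs the matching allow list for the published IPs and ports on the firewall/NAT. The PAC file only decides which traffic avoids the proxy.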

The advantage of the above list is that we can send a customer's specific, high-volume OneDrive/SharePoint traffic (tenantname.sharepoint.com & tenantname-my.sharepoint.com) direct, and apply DLP in the service if required, whilst sending any other tenant traffic (*.sharepoint.com) which may be required for sharing/collaboration via an inspecting proxy.
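The distinction between your own tenant's hostnames and the wider *.sharepoint.com wildcard comes down to exact matching before suffix matching. A small sketch of that decision is below. The function name and the returned labels are hypothetical, and `contoso` and `fabrikam` are placeholder tenant names.

```python
def sharepoint_route(host: str, tenant: str) -> str:
    """Decide how a hostname should leave the network (illustrative labels only)."""
    host = host.lower().rstrip(".")
    tenant = tenant.lower()
    own_hosts = {f"{tenant}.sharepoint.com", f"{tenant}-my.sharepoint.com"}
    if host in own_hosts:
        # Our own high-volume SharePoint/OneDrive traffic: send direct.
        return "direct"
    if host.endswith(".sharepoint.com"):
        # Other tenants, e.g. for sharing/collaboration: keep behind the proxy.
        return "inspecting-proxy"
    return "default"


if __name__ == "__main__":
    for name in ("contoso.sharepoint.com", "fabrikam.sharepoint.com", "example.org"):
        print(name, "->", sharepoint_route(name, "contoso"))
```

Checking the exact tenant hostnames first matters: a single wildcard rule for *.sharepoint.com would also bypass inspection for every other tenant.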

Whilst most customers do not require it, if strict control over which tenants are accessible is needed, then tenant restrictions can be implemented to control which tenants can be accessed from within the corporate network.

By optimizing this list of endpoints, you can ensure that the traffic which is likely to put the most load on your network infrastructure, and/or is latency sensitive is treated in such a way that the service performs at its best, whilst removing the risk of overloading your standard network egress.

 

Allow Category

Access to the URLs marked as 'Allow' is required for Office 365 to function; however, they are not as sensitive to network performance and latency as those marked as 'Optimize'. The bandwidth these endpoints consume is a small proportion of the whole, and connection counts will also be lower than for the Optimize category.

Allow endpoints have the following characteristics:

  • Are required for the service to function
  • Are dedicated to Office 365 and hosted in Microsoft datacenters.
  • Are not as highly sensitive to network performance as the optimize category
  • Are not comparatively high bandwidth or connection count endpoints
  • Higher rate of change than the optimize category
  • Not all endpoints in this category will have IPs provided

Whilst these endpoints aren't as sensitive as the Optimize URLs, they are still required for the services to work and are critical to the running of the service. A heavy delay in connectivity to one of these endpoints could cause performance issues for users. As such, it is recommended that these URLs are optimized as far as possible.

Recommended advice in an ideal situation is as follows:

  • If possible, bypass or whitelist Allow endpoints on network devices and services that perform traffic interception, SSL decryption, deep packet inspection and content filtering. If it is not possible to bypass these URLs, ensure the work is as light touch as possible, and on optimized, well scaled devices which minimize any delay.

  • If proxying this traffic, ensure proxy authentication for these URLs is disabled to remove any delay that may cause.

  • If possible, prioritize the evaluation of these endpoints as fully trusted by your network infrastructure and perimeter systems.

  • Egress these endpoints as close as possible to the users so the traffic can hit Microsoft's infrastructure locally.

  • Ensure that DNS resolution for these endpoints is done at the same location as the egress path

  • Prioritize these endpoints for SD-WAN integration for direct, minimal latency routing into the nearest Internet peering point of the Microsoft global network.

     

Default Category

This final category represents endpoints for Office 365 services and dependencies that do not require any optimization and can be treated by customer networks as normal Internet bound traffic.

  • Some endpoints in this category may not be hosted in Microsoft datacenters.
  • Some endpoints in this category may not have IPs provided for them (so require a proxy or other unrestricted path).
  • If these endpoints are not accessible, the services themselves will generally work, however certain elements (e.g. help functions) will stop working.

The information discussed above is available from the new Web Service, and I have published a script which will obtain this information for you using the latest data. Microsoft official script information can be found here.
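To give a feel for working with the web service output, the sketch below groups endpoint records by category. It is not the published script. The field names (`category`, `urls`, `ips`, `tcpPorts`) reflect my understanding of the JSON the web service returns and should be checked against the live data, and the sample records are made up for illustration, using placeholder hostnames and a documentation IP range.

```python
import json
from collections import defaultdict

# Made-up sample in the assumed shape of the web service output.
SAMPLE = """
[
  {"id": 1, "category": "Optimize", "urls": ["outlook.office365.com"],
   "ips": ["192.0.2.0/24"], "tcpPorts": "443"},
  {"id": 2, "category": "Allow", "urls": ["allow.example.com"], "tcpPorts": "443"},
  {"id": 3, "category": "Default", "urls": ["help.example.com"], "tcpPorts": "443"}
]
"""


def group_by_category(raw_json):
    """Collect the URLs and IP ranges found in each category."""
    grouped = defaultdict(lambda: {"urls": set(), "ips": set()})
    for record in json.loads(raw_json):
        bucket = grouped[record.get("category", "Default")]
        # Not every record carries IPs, so both fields are treated as optional.
        bucket["urls"].update(record.get("urls", []))
        bucket["ips"].update(record.get("ips", []))
    return dict(grouped)


if __name__ == "__main__":
    for category, data in sorted(group_by_category(SAMPLE).items()):
        print(category, sorted(data["urls"]), sorted(data["ips"]))
```

The grouped output could then feed a PAC file or firewall allow list per category, rather than being maintained by hand.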

Next up – Part 2: Network Egress as close to the user as possible

Comments

  • Anonymous
    August 01, 2017
    Is there any way to determine which O365 URLs are routable via ExpressRoute and those which must be routed via Internet? Thanks
  • Anonymous
    May 11, 2019
    Do I understand your article to say that Microsoft is against a client using an ExpressRoute themselves for O365 traffic, because it may in fact point them to a data center that is further away?
    • Anonymous
      May 21, 2019
      Hi, Indeed, ExpressRoute is not the recommended connection model for Office 365 for the majority of customers, for the numerous reasons outlined. In some cases yes it can indeed cause higher latencies to the service than local breakout due to the distributed nature of Office 365 and the higher number of internet peering locations to ExpressRoute locations. We generally discuss the requirements with customers who feel they need ER so as to help them arrive at the correct connectivity model for the business, which invariably is the internet path.