Use Azure Front Door to secure AKS workloads

This article describes how to securely expose and protect a workload that runs in Azure Kubernetes Service (AKS) by using Azure Front Door, Azure Web Application Firewall, and an Azure Private Link service. This architecture uses the NGINX ingress controller to expose a web application. The NGINX ingress controller is configured to use a private IP address as a front-end IP configuration of the AKS internal load balancer. The deployment provides end-to-end Transport Layer Security (TLS) encryption.

Architecture

Diagram that shows an architecture that securely exposes and protects a workload that runs in AKS.

The Grafana logo is a trademark of its respective company. No endorsement is implied by the use of this mark.

Download a Visio file of this architecture.

Workflow architecture

The following diagram shows the steps for the message flow during deployment and runtime.

Diagram that shows the steps for the message flow during deployment and runtime.

Download a Visio file of this architecture.

Deployment workflow

You can deploy the NGINX ingress controller in one of two ways: as a managed NGINX ingress controller by using the application routing add-on, or as an unmanaged NGINX ingress controller by using a Helm chart.

The following steps describe the deployment process. This workflow corresponds to the green numbers in the preceding diagram.

  1. A security engineer generates a certificate for the custom domain that the workload uses, and saves it in Azure Key Vault. You can obtain a valid certificate from a well-known certificate authority (CA).

  2. A platform engineer specifies the necessary information in the main.bicepparam Bicep parameters file and deploys the Bicep modules to create the Azure resources. The necessary information includes:

    • A prefix for the Azure resources.

    • The name and resource group of the existing Azure Key Vault that holds the TLS certificate for the workload hostname and the Azure Front Door custom domain.

    • The name of the certificate in the Key Vault.

    • The name and resource group of the DNS zone that's used to resolve the Azure Front Door custom domain.

  3. The deployment script creates the Kubernetes objects that the workload requires in the AKS cluster, such as the sample web application and the ingress object that references the TLS certificate.

  4. An Azure Front Door secret resource is used to manage and store the TLS certificate that's in the Key Vault. This certificate is used by the custom domain that's associated with the Azure Front Door endpoint. The Azure Front Door profile uses a user-assigned managed identity with the Key Vault Administrator role assignment to retrieve the TLS certificate from Key Vault.

Note

At the end of the deployment, you need to approve the private endpoint connection before traffic can pass to the origin privately. For more information, see Secure your origin with Private Link in Azure Front Door Premium. To approve private endpoint connections, use the Azure portal, the Azure CLI, or Azure PowerShell. For more information, see Manage a private endpoint connection.
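
The Note above mentions the Azure CLI option. The following sketch lists the pending connections on the Private Link service and approves the one that Azure Front Door created. The resource group, Private Link service, and connection names are placeholders for this example, not values from the deployment.

    # List the private endpoint connections on the Private Link service and their approval status.
    az network private-link-service show \
      --resource-group aks-rg \
      --name aks-front-door-pls \
      --query "privateEndpointConnections[].{name:name, status:privateLinkServiceConnectionState.status}" \
      --output table

    # Approve the pending connection that Azure Front Door created.
    az network private-link-service connection update \
      --resource-group aks-rg \
      --service-name aks-front-door-pls \
      --name <connection-name> \
      --connection-status Approved \
      --description "Approved for the Azure Front Door origin"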

Runtime workflow

The following steps describe the message flow for a request that an external client application initiates during runtime. This workflow corresponds to the orange numbers in the preceding diagram.

  1. The client application uses its custom domain to send a request to the web application. The DNS zone that's associated with the custom domain uses a CNAME record to redirect the DNS query for the custom domain to the original hostname of the Azure Front Door endpoint.

  2. Azure Front Door traffic routing occurs in several stages. Initially, the request is sent to one of the Azure Front Door points of presence. Then Azure Front Door uses the configuration to determine the appropriate destination for the traffic. Various factors can influence the routing process, such as the web application firewall (WAF), routing rules, the rules engine, and the caching configuration. For more information, see Routing architecture overview.

  3. Azure Front Door forwards the incoming request to the Azure private endpoint that's connected to the Private Link service that exposes the AKS-hosted workload.

  4. The request is sent to the Private Link service.

  5. The request is forwarded to the kubernetes-internal AKS internal load balancer.

  6. The request is sent to one of the agent nodes that hosts a pod of the managed or unmanaged NGINX ingress controller.

  7. One of the NGINX ingress controller replicas handles the request.

  8. The NGINX ingress controller forwards the request to one of the workload pods.
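
To verify the runtime flow end to end, you can resolve the custom domain and send a test request from any client machine. The hostname in the following sketch (app.contoso.com) is a placeholder for your Azure Front Door custom domain.

    # The CNAME should resolve to the Azure Front Door endpoint hostname.
    nslookup app.contoso.com

    # Send an HTTPS request through Azure Front Door and inspect the TLS handshake and response headers.
    curl --verbose https://app.contoso.com/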

Components

The architecture consists of the following components:

  • A public or private AKS cluster is composed of the following node pools:

    • A system node pool in a dedicated subnet. The default node pool hosts only critical system pods and services. The system nodes have a node taint, so application pods can't be scheduled on this node pool.

    • A user node pool that hosts user workloads and artifacts in a dedicated subnet.

  • The deployment requires role-based access control (RBAC) role assignments, including:

    • A Grafana Admin role assignment on Azure Managed Grafana for the Microsoft Entra user whose objectID is defined in the userId parameter. The Grafana Admin role provides full control of the instance, including managing role assignments, viewing, editing, and configuring data sources. For more information, see How to share access to Azure Managed Grafana.

    • A Key Vault Administrator role assignment on the existing Key Vault resource that contains the TLS certificate for the user-defined managed identity that the Key Vault provider for Secrets Store CSI Driver uses. This assignment provides access to the CSI driver so that it can read the certificate from the source Key Vault.

  • Azure Front Door Premium is a Layer 7 global load balancer and modern cloud content delivery network. It provides fast, reliable, and secure access between your users and your applications' static and dynamic web content across the globe. You can use Azure Front Door to deliver your content through Microsoft's global edge network. The network has hundreds of global and local points of presence distributed around the world, so you can use points of presence that are close to your enterprise and consumer customers.

    In this solution, Azure Front Door is used to expose an AKS-hosted sample web application via a Private Link service and the NGINX ingress controller. Azure Front Door is configured to expose a custom domain for the Azure Front Door endpoint. The custom domain is configured to use the Azure Front Door secret that contains a TLS certificate that's read from Key Vault.

  • Azure Web Application Firewall protects the AKS-hosted applications that are exposed via Azure Front Door from common web-based attacks, such as the vulnerabilities identified by the Open Web Application Security Project (OWASP), SQL injection, and cross-site scripting. This cloud-native, pay-as-you-use technology doesn't require licensing and defends your web applications and services against common exploits and vulnerabilities.

  • An Azure DNS zone is used for the name resolution of the Azure Front Door custom domain. You can use Azure DNS to host your DNS domain and manage your DNS records.

    • The CNAME record is used to create an alias or pointer from one domain name to another. You can configure a CNAME record to redirect DNS queries for the custom domain to the original hostname of the Azure Front Door endpoint.

    • The Text (TXT) record contains the validation token for the custom domain. You can use a TXT record within a DNS zone to store arbitrary text information that's associated with a domain. For an example that creates both the CNAME and TXT records, see the sketch after this component list.

  • A Private Link service is configured to reference the kubernetes-internal internal load balancer of the AKS cluster. When you enable Private Link to your origin in Azure Front Door Premium, Azure Front Door creates a private endpoint from an Azure Front Door-managed regional private network. You receive an Azure Front Door private endpoint request at the origin for your approval. For more information, see Secure your origin with Private Link in Azure Front Door Premium.

  • Azure Virtual Network is used to create a single virtual network with six subnets:

    • SystemSubnet is used for the agent nodes of the system node pool.

    • UserSubnet is used for the agent nodes of the user node pool.

    • PodSubnet is used to dynamically allocate private IP addresses to pods when the AKS cluster is configured to use Azure Container Networking Interface (CNI) with dynamic IP allocation.

    • ApiServerSubnet uses API server virtual network integration to project the API server endpoint directly into this delegated subnet where the AKS cluster is deployed.

    • AzureBastionSubnet is used for the Azure Bastion host.

    • VmSubnet is used for the jumpbox virtual machine (VM) that connects to the private AKS cluster and for the private endpoints.

  • A user-assigned managed identity is used by the AKS cluster to create more resources like load balancers and managed disks in Azure.

  • Azure Virtual Machines is used to create an optional jumpbox VM in the VmSubnet.

  • An Azure Bastion host is deployed in the AKS cluster virtual network to provide Secure Shell (SSH) connectivity to the AKS agent nodes and VMs.

  • An Azure Storage account is used to store the boot diagnostics logs of the jumpbox VM. Boot diagnostics is a debugging feature that you can use to view console output and screenshots to diagnose the VM status.

  • Azure Container Registry is used to build, store, and manage container images and artifacts.

  • Key Vault is used to store secrets, certificates, and keys. Pods can use Key Vault provider for Secrets Store CSI Driver to mount secrets, certificates, and keys as files.

    For more information, see Use the Key Vault provider for Secrets Store CSI Driver in an AKS cluster and Provide an identity to access the Key Vault provider for Secrets Store CSI Driver.

    In this project, an existing Key Vault resource contains the TLS certificate that the ingress Kubernetes object and the custom domain of the Azure Front Door endpoint use.

  • An Azure private endpoint and an Azure private DNS zone are created for each of the following resources:

    • Container Registry
    • Key Vault
    • A Storage account
  • Azure network security groups are used to filter inbound and outbound traffic for the subnets that host VMs and Azure Bastion hosts.

  • An Azure Monitor workspace is a unique environment for data that Monitor collects. Each workspace has its own data repository, configuration, and permissions. Azure Monitor Logs workspaces contain logs and metrics data from multiple Azure resources, whereas Monitor workspaces contain metrics related to Prometheus only.

    You can use managed service for Prometheus to collect and analyze metrics at scale by using a Prometheus-compatible monitoring solution that's based on Prometheus. You can use the Prometheus query language (PromQL) to analyze and alert on the performance of monitored infrastructure and workloads without having to operate the underlying infrastructure.

  • An Azure Managed Grafana instance is used to visualize the Prometheus metrics that the Bicep module-deployed AKS cluster generates. You can connect your Monitor workspace to Azure Managed Grafana and use a set of built-in and custom Grafana dashboards to visualize Prometheus metrics. Azure Managed Grafana is built on Grafana Enterprise, which provides extensible data visualizations. You can quickly and easily deploy Grafana dashboards that have built-in high availability, and you can use Azure security measures to control access to the dashboards.

  • An Azure Monitor Logs workspace is used to collect the diagnostic logs and metrics from Azure resources, including:

    • AKS clusters
    • Key Vault
    • Azure network security groups
    • Container Registry
    • Storage accounts
  • A Bicep deployment script is used to run a Bash script that creates the required Kubernetes objects in the AKS cluster.
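
As an illustration of the DNS records that the preceding list describes, the following Azure CLI sketch creates the CNAME record for the custom domain and the TXT record that holds the Azure Front Door validation token. The resource group, zone, subdomain, endpoint hostname, and token are placeholders.

    # Point the custom subdomain at the Azure Front Door endpoint hostname.
    az network dns record-set cname set-record \
      --resource-group dns-rg \
      --zone-name contoso.com \
      --record-set-name app \
      --cname myendpoint-abc123.z01.azurefd.net

    # Store the domain validation token that Azure Front Door issues for the custom domain.
    az network dns record-set txt add-record \
      --resource-group dns-rg \
      --zone-name contoso.com \
      --record-set-name _dnsauth.app \
      --value "<validation-token>"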

Alternatives

The architecture in this article manually creates a Private Link service that references the cluster's kubernetes-internal Azure Load Balancer. As an alternative, you can use the managed Private Link service integration to automatically create a Private Link service for the AKS cluster load balancer by adding annotations to the Kubernetes service object. In either case, you must create private endpoint connections to the Private Link service to provide private connectivity.
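
As a sketch of the annotation-based alternative, the following Kubernetes service manifest, applied here through a Bash heredoc, asks the Azure cloud provider to create an internal load balancer and a managed Private Link service for it. In this architecture, the annotations would go on the NGINX ingress controller service. The service name, namespace, subnet, selector, and ports are illustrative placeholders.

    # Request an internal load balancer and a managed Private Link service for the service.
    kubectl apply -f - <<'EOF'
    apiVersion: v1
    kind: Service
    metadata:
      name: sample-web-app
      namespace: sample
      annotations:
        service.beta.kubernetes.io/azure-load-balancer-internal: "true"
        service.beta.kubernetes.io/azure-pls-create: "true"
        service.beta.kubernetes.io/azure-pls-name: sample-web-app-pls
        service.beta.kubernetes.io/azure-pls-ip-configuration-subnet: pls-subnet
    spec:
      type: LoadBalancer
      selector:
        app: sample-web-app
      ports:
        - port: 443
          targetPort: 8443
    EOF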

Scenario details

This scenario uses Azure Front Door Premium, end-to-end TLS encryption, Azure Web Application Firewall, and a Private Link service to securely expose and protect a workload that runs in AKS.

This architecture uses the Azure Front Door TLS and Secure Sockets Layer (SSL) offload capability to terminate the TLS connection and decrypt the incoming traffic at the Front Door. The traffic is reencrypted before it's forwarded to the origin, which is a web application that's hosted in an AKS cluster. HTTPS is configured as the forwarding protocol on Azure Front Door when Azure Front Door connects to the AKS-hosted workload that's configured as an origin. This practice enforces end-to-end TLS encryption for the entire request process, from the client to the origin. For more information, see Secure your origin with Private Link in Azure Front Door Premium.
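
As an illustration of this forwarding configuration, the following Azure CLI sketch creates an Azure Front Door route that redirects HTTP to HTTPS and forwards traffic to the origin group over HTTPS only. The profile, endpoint, route, and origin group names are placeholders, not values from this deployment.

    # Route traffic from the Azure Front Door endpoint to the origin group over HTTPS only.
    az afd route create \
      --resource-group fd-rg \
      --profile-name fd-premium-profile \
      --endpoint-name fd-endpoint \
      --route-name aks-route \
      --origin-group aks-origin-group \
      --supported-protocols Http Https \
      --https-redirect Enabled \
      --forwarding-protocol HttpsOnly \
      --link-to-default-domain Enabled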

The NGINX ingress controller exposes the AKS-hosted web application. The NGINX ingress controller is configured to use a private IP address as a front-end IP configuration of the kubernetes-internal internal load balancer. The NGINX ingress controller uses HTTPS as the transport protocol to expose the web application. For more information, see Create an ingress controller by using an internal IP address.
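
For the unmanaged option, the following Helm sketch installs the NGINX ingress controller with the internal load balancer annotation and a static private IP address. The namespace, replica count, and IP address are illustrative assumptions rather than values from this deployment.

    helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
    helm repo update

    # Install the NGINX ingress controller behind the kubernetes-internal load balancer.
    helm install ingress-nginx ingress-nginx/ingress-nginx \
      --namespace ingress-nginx \
      --create-namespace \
      --set controller.replicaCount=2 \
      --set-string controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-internal"=true \
      --set-string controller.service.loadBalancerIP=10.240.250.10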

The AKS cluster is configured to use the following features:

  • API server virtual network integration provides network communication between the API server and the cluster nodes. This feature doesn't require a private link or tunnel. The API server is available behind an internal load balancer VIP in the delegated subnet. The cluster nodes are configured to use the delegated subnet.

    You can use API server virtual network integration to ensure that the network traffic between your API server and your node pools remains on the private network only. AKS clusters that have API server virtual network integration provide many advantages. For example, you can enable or disable public network access or private cluster mode without redeploying the cluster. For more information, see Create an AKS cluster with API server virtual network integration. For an example command that enables this feature, see the sketch after this list.

  • Azure NAT Gateway manages outbound connections that AKS-hosted workloads initiate. For more information, see Create a managed or user-assigned NAT gateway for your AKS cluster.
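
The following Azure CLI sketch shows how both of the preceding features might be enabled when the cluster is created. API server virtual network integration might require the aks-preview extension and feature registration, depending on its release status, and the resource names, subscription ID, and subnet IDs shown here are placeholders.

    # Create an AKS cluster with API server virtual network integration and a managed NAT gateway for egress.
    az aks create \
      --resource-group aks-rg \
      --name aks-cluster \
      --network-plugin azure \
      --enable-apiserver-vnet-integration \
      --apiserver-subnet-id "/subscriptions/<sub-id>/resourceGroups/aks-rg/providers/Microsoft.Network/virtualNetworks/aks-vnet/subnets/ApiServerSubnet" \
      --vnet-subnet-id "/subscriptions/<sub-id>/resourceGroups/aks-rg/providers/Microsoft.Network/virtualNetworks/aks-vnet/subnets/SystemSubnet" \
      --outbound-type managedNATGateway \
      --generate-ssh-keys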

Potential use cases

This scenario provides a solution to meet security and compliance requirements for a web application or REST API that runs in AKS.

Considerations

These considerations implement the pillars of the Azure Well-Architected Framework, which is a set of guiding tenets that can be used to improve the quality of a workload. For more information, see Microsoft Azure Well-Architected Framework.

Some of the following considerations aren't specifically related to the use of Azure Front Door, Azure Web Application Firewall, and a Private Link service to improve the security of an AKS cluster. But the security, performance, availability, reliability, storage, and monitoring considerations are essential requirements of this solution.

Reliability

Reliability ensures your application can meet the commitments you make to your customers. For more information, see Design review checklist for Reliability.

These recommendations apply to single-tenant AKS solutions, and they matter even more for multitenant AKS solutions, where the reliability targets are higher because of the number of users and workloads that rely on the system. Consider the following recommendations to optimize the availability of your AKS cluster and workloads.

Intra-region resiliency

  • Deploy the node pools of your AKS cluster across all availability zones in a region.

  • Enable zone redundancy in Container Registry for intra-region resiliency and high availability.

  • Use topology spread constraints to control how you spread pods across your AKS cluster among failure domains like regions, availability zones, and nodes.

  • Use the Standard or Premium tier for your production AKS clusters. These tiers include the uptime service-level agreement (SLA) feature, which guarantees 99.95% availability of the Kubernetes API server endpoint for clusters that use availability zones and 99.9% availability for clusters that don't use availability zones. For more information, see Free, Standard, and Premium pricing tiers for AKS cluster management.

  • Enable zone redundancy if you use Container Registry to store container images and Open Container Initiative (OCI) artifacts. Container Registry supports optional zone redundancy and geo-replication. Zone redundancy provides resiliency and high availability to a registry or replication resource (replica) in a specific region. Geo-replication replicates registry data across one or more Azure regions to provide availability and reduce latency for regional operations.
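
As an illustration, the following Azure CLI sketch creates a zone-redundant Premium registry and adds a zone-redundant replica in a second region. The registry name, resource group, and regions are placeholders.

    # Create a zone-redundant Premium container registry.
    az acr create \
      --resource-group acr-rg \
      --name myaksregistry \
      --sku Premium \
      --zone-redundancy Enabled

    # Add a zone-redundant geo-replica in a second region.
    az acr replication create \
      --registry myaksregistry \
      --resource-group acr-rg \
      --location westeurope \
      --zone-redundancy Enabled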

Disaster recovery and business continuity

  • Consider deploying your solution to two regions. Use the paired Azure region as the second region.

  • Script, document, and periodically test regional failover processes in a quality assurance (QA) environment.

  • Test failback procedures to validate that they work as expected.

  • Store your container images in Container Registry. Geo-replicate the registry to each region where you deploy your AKS solution.

  • Don't store service state in a container if possible. Instead, store service state in an Azure platform as a service (PaaS) storage solution that supports multiregion replication. This approach improves resiliency and simplifies disaster recovery because you can preserve each service's critical data across regions.

  • Prepare and test processes to migrate your storage from the primary region to the backup region if you use Storage.

Security

Security provides assurances against deliberate attacks and the abuse of your valuable data and systems. For more information, see Design review checklist for Security.

  • Use a WAF to protect AKS-hosted web applications and services that expose a public HTTPS endpoint. You need to provide protection from common threats like SQL injection, cross-site scripting, and other web exploits. Follow OWASP rules and your own custom rules.

    Azure Web Application Firewall provides improved centralized protection of your web applications from common exploits and vulnerabilities. You can deploy an Azure WAF with Azure Application Gateway, Azure Front Door, or Azure Content Delivery Network.

  • Use Azure DDoS Protection and application design best practices to defend against workload distributed denial-of-service (DDoS) attacks. Azure protects its infrastructure and services against DDoS attacks. This protection ensures the availability of regions, availability zones, and services. You should also protect your workload's public endpoints from DDoS attacks at Layer 4 and Layer 7. You can enable Azure DDoS Protection on perimeter virtual networks.

  • Use the Azure Web Application Firewall rate-limit rule for Azure Front Door to manage and control the number of requests that you allow from a specific source IP address to your application within a defined rate-limit duration. Use this feature to enforce rate-limiting policies and ensure that you protect your application from excessive traffic or potential abuse. Configure the rate-limit rule to maintain optimal application performance and security and provide fine-grained control of request limits.

  • Configure the WAF policy that's associated with Azure Front Door to prevention mode. In prevention mode, the WAF policy analyzes incoming requests and compares them to the configured rules. If a request matches one or more rules that are set to deny traffic when satisfied, the WAF policy blocks the malicious traffic from reaching your web applications. This measure helps ensure that you protect your applications against potential vulnerabilities and unauthorized access attempts. For more information, see Azure Web Application Firewall on Azure Front Door.

  • Create an Azure private endpoint for any PaaS service that AKS workloads use, like Key Vault, Azure Service Bus, and Azure SQL Database. The traffic between the applications and these services isn't exposed to the public internet. Traffic between the AKS cluster virtual network and an instance of a PaaS service via a private endpoint travels the Microsoft backbone network and doesn't pass through an Azure firewall. A private endpoint provides security and protection against data leakage. For more information, see What is Private Link?

  • Use a WAF policy to help protect public-facing AKS-hosted workloads from attacks when you use Application Gateway in front of the AKS cluster.

  • Use Kubernetes network policies to control which components can communicate with one another, which segregates and helps secure intraservice communications. By default, all pods in a Kubernetes cluster can send and receive traffic without limitations. To improve security, you can use Azure network policies or Calico network policies to define rules that control the traffic flow between various microservices. Use Azure network policies to enforce network-level access control. Use Calico network policies to implement fine-grained network segmentation and security policies in your AKS cluster. For more information, see Secure traffic between pods by using network policies in AKS. For a minimal example policy, see the sketch after this list.

  • Don't expose remote connectivity to your AKS nodes. Create an Azure Bastion host, or jumpbox, in a management virtual network. Use the Azure Bastion host to route traffic to your AKS cluster.

  • Consider using a private AKS cluster in your production environment. Or, at a minimum, use authorized IP address ranges in AKS to secure access to the API server. When you use authorized IP address ranges on a public cluster, allow all the egress IP addresses in the Azure Firewall network rule collection, because in-cluster operations consume the Kubernetes API server.
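
The following sketch illustrates the network policy recommendation from this list. It applies a Kubernetes NetworkPolicy that allows ingress traffic to the workload pods only from the namespace that hosts the NGINX ingress controller. The namespaces, labels, and port are placeholders for this example.

    # Allow ingress traffic to the workload pods only from the ingress controller namespace.
    kubectl apply -f - <<'EOF'
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-ingress-controller-only
      namespace: sample
    spec:
      podSelector:
        matchLabels:
          app: sample-web-app
      policyTypes:
        - Ingress
      ingress:
        - from:
            - namespaceSelector:
                matchLabels:
                  kubernetes.io/metadata.name: ingress-nginx
          ports:
            - protocol: TCP
              port: 8443
    EOF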

Cost Optimization

Cost Optimization is about looking at ways to reduce unnecessary expenses and improve operational efficiencies. For more information, see Design review checklist for Cost Optimization.

Operational Excellence

Operational Excellence covers the operations processes that deploy an application and keep it running in production. For more information, see Design review checklist for Operational Excellence.

DevOps

  • Use a Helm chart in a continuous integration and continuous delivery (CI/CD) pipeline to deploy your workloads to AKS. For a minimal example of the deployment step, see the sketch after this list.

  • Use A/B testing and canary deployments in your application lifecycle management to properly test an application before you make it available to users.

  • Use Azure Container Registry or a non-Microsoft registry, such as Harbor or Docker Hub, to store private container images that are deployed to the cluster.

  • Test ingress and egress on your workloads in a separate preproduction environment that mirrors the network topology and firewall rules of your production environment.
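
As a minimal sketch of the Helm-based deployment step in a CI/CD pipeline, the following command installs or upgrades a workload chart with the image tag that the build produces. The release name, chart path, namespace, registry, and pipeline variable are placeholders.

    # Deploy or upgrade the workload from its Helm chart and roll back automatically if the rollout fails.
    helm upgrade sample-web-app ./charts/sample-web-app \
      --install \
      --namespace sample \
      --create-namespace \
      --set-string image.repository=myaksregistry.azurecr.io/sample-web-app \
      --set-string image.tag="$BUILD_ID" \
      --atomic \
      --timeout 5m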

Monitoring

  • Use container insights to monitor the health status of the AKS cluster and workloads.

  • Use managed service for Prometheus to collect and analyze metrics at scale by using a Prometheus-compatible monitoring solution that's based on the Prometheus project from Cloud Native Computing Foundation.

  • Connect your managed service for Prometheus to an Azure Managed Grafana instance to use it as a data source in a Grafana dashboard. You then have access to multiple prebuilt dashboards that use Prometheus metrics, and you can create custom dashboards.

  • Configure all PaaS services, such as Container Registry and Key Vault, to collect diagnostic logs and metrics in an Azure Monitor Logs workspace.
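
The following Azure CLI sketch shows how a diagnostic setting might route a resource's logs and metrics to the Azure Monitor Logs workspace. The subscription ID, resource group, resource names, and log category are placeholders, and the available categories vary by resource type.

    # Send Key Vault audit logs and metrics to a Log Analytics workspace.
    az monitor diagnostic-settings create \
      --name send-to-log-analytics \
      --resource "/subscriptions/<sub-id>/resourceGroups/kv-rg/providers/Microsoft.KeyVault/vaults/my-key-vault" \
      --workspace "/subscriptions/<sub-id>/resourceGroups/monitor-rg/providers/Microsoft.OperationalInsights/workspaces/my-log-analytics" \
      --logs '[{"category":"AuditEvent","enabled":true}]' \
      --metrics '[{"category":"AllMetrics","enabled":true}]'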

Deploy this scenario

The source code for this scenario is available on GitHub. This open-source solution is licensed under the MIT License.

Prerequisites

Deployment to Azure

  1. Clone the workbench GitHub repository.

    git clone https://github.com/Azure-Samples/aks-front-door-end-to-end-tls.git
    
  2. Follow the instructions in the README file. You need your Azure subscription information for this step.
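
After you update the parameter file, the deployment typically amounts to a single resource group deployment. The following sketch assumes the main.bicepparam file name from the deployment workflow and a placeholder resource group and region; recent versions of the Azure CLI resolve the Bicep template from the using statement inside the parameter file. Follow the repository README for the exact, supported steps.

    # Create a resource group and deploy the Bicep modules by using the parameter file.
    az group create --name aks-fd-rg --location westeurope

    az deployment group create \
      --resource-group aks-fd-rg \
      --parameters main.bicepparam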

Contributors

This article is maintained by Microsoft. It was originally written by the following contributors.

Principal author:


Next steps

Review the recommendations and best practices for AKS in the Microsoft Azure Well-Architected Framework:

Azure Front Door

AKS

Architectural guidance

Reference architectures