Application data protection for AKS workloads on Azure NetApp Files

Azure NetApp Files
Azure Kubernetes Service (AKS)
Azure Virtual Network

This article outlines a solution for application data management of stateful containerized applications. The solution covers the applications, their resources, and their data.

Architecture

Architecture diagram that shows how to deploy AKS with Astra Control Service when AKS and Azure NetApp Files are in separate virtual networks.

Download a Visio file of this architecture.

Dataflow

  1. An Azure NetApp Files account is created on an Azure subscription, and capacity pools are defined. These pools map to service levels that the implementation needs, such as Standard, Premium, and Ultra.

  2. One or more AKS clusters are deployed. The clusters need to be:

  3. A user signs up for an Astra Control Service account. Astra Control Service uses an Azure service principal credential that has contributor access to locate the AKS clusters to be managed. Astra Control Service installs Astra Trident and creates StorageClasses mapped to each tier of service when a cluster is added to Astra Control Service. Astra Trident creates Kubernetes PersistentVolumes (PVs) from application PersistentVolumeClaims (PVCs) using the automatically deployed StorageClass (SC) objects that map to the Azure NetApp Files capacity pools. The mapping takes into account the service level of the capacity pools.

  4. The user installs applications on the AKS clusters. Possible deployment methods include Helm charts, operators, and YAML manifests. The applications can be grouped by labels or namespaces. Astra Trident provisions persistent volumes based on the PersistentVolumeClaims using the StorageClass objects.

  5. Astra Control Service manages applications and their associated resources, such as pods, services, deployments, and PersistentVolumeClaim (PVC) objects. It also manages the PersistentVolume (PV) bound to the PVC. Users define applications by using one of these methods:

    • Confining them to a namespace
    • Using a custom Kubernetes label to group resources

Users can also group cluster-scoped objects, such as StorageClass objects, with one or more specific applications to manage them together.
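As an illustration of the two grouping methods, the following Python sketch filters a simplified resource list by namespace and by label. The Resource class, the helper functions, and the sample objects are hypothetical. They only model the idea and aren't part of any Astra Control Service or Kubernetes API.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Simplified stand-in for a Kubernetes object (hypothetical model)."""
    kind: str
    name: str
    namespace: str
    labels: dict = field(default_factory=dict)


def select_by_namespace(resources, namespace):
    """Treat everything in one namespace as a single application."""
    return [r for r in resources if r.namespace == namespace]


def select_by_label(resources, key, value):
    """Treat everything that carries the same label as a single application."""
    return [r for r in resources if r.labels.get(key) == value]


# Made-up sample objects spread across two namespaces.
resources = [
    Resource("Deployment", "wordpress", "blog", {"app": "wordpress"}),
    Resource("PersistentVolumeClaim", "wordpress-data", "blog", {"app": "wordpress"}),
    Resource("StatefulSet", "mysql", "shared-db", {"app": "wordpress"}),
    Resource("Deployment", "metrics", "shared-db", {"app": "monitoring"}),
]

by_namespace = select_by_namespace(resources, "blog")
by_label = select_by_label(resources, "app", "wordpress")
```

In this sample, the namespace selection returns only the two objects in the blog namespace, while the label selection also picks up the MySQL StatefulSet that runs in a different namespace. That difference is the main reason to choose a label when an application spans namespaces.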

Astra Control Service orchestrates point-in-time snapshots and backups, backup policies, and instant active clones to help protect application workloads. Astra Control Service achieves this protection by:

  • Creating Astra Control Service protection policies. A policy can cover snapshots, backups, or both, and it specifies a schedule and a backup target. These policies make it possible to automatically protect applications on a predetermined schedule.

  • Taking snapshots on demand for an individual application or a group of applications.

  • Making instantaneous backups or clones for an individual application or a group of applications.

When disasters or application failures occur, you can use backups and snapshots to restore application state. Users can clone and migrate apps across namespaces and AKS clusters. The clusters can be in the same or separate regions.
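To make the idea of a scheduled protection policy concrete, the following Python sketch models a policy as an interval plus a retention count. The ProtectionPolicy class and its fields are assumptions made for illustration. They don't reflect the actual Astra Control Service policy schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ProtectionPolicy:
    """Hypothetical model of a protection policy (not the Astra schema)."""
    kind: str            # "snapshot" or "backup"
    interval: timedelta  # how often a new copy should be taken
    keep: int            # how many of the newest copies to retain

    def is_due(self, last_run: datetime, now: datetime) -> bool:
        """Return True when at least one interval has passed since last_run."""
        return now - last_run >= self.interval

    def expired(self, taken_at: list) -> list:
        """Return the timestamps that fall outside the retention count."""
        newest_first = sorted(taken_at, reverse=True)
        return newest_first[self.keep:]


# Example: hourly snapshots, keeping the two most recent copies.
policy = ProtectionPolicy(kind="snapshot", interval=timedelta(hours=1), keep=2)
history = [datetime(2024, 1, 1, hour) for hour in (0, 1, 2, 3)]
to_delete = policy.expired(history)
```

With four hourly snapshots and a retention count of two, the two oldest timestamps are returned for deletion.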

Components

  • AKS is a fully managed Kubernetes service that makes it easy to deploy and manage containerized applications. AKS offers serverless Kubernetes technology, an integrated continuous integration and continuous delivery (CI/CD) experience, and enterprise-grade security and governance.
  • Azure NetApp Files is an Azure storage service. This service provides enterprise-grade network file system (NFS) and server message block (SMB) file shares. Azure NetApp Files makes it easy to migrate and run complex, file-based applications with no code changes. This service is well suited for users with persistent volumes in Kubernetes environments.
  • Azure Virtual Network is the fundamental building block for private networks in Azure. Through Virtual Network, Azure resources like virtual machines can securely communicate with each other, the internet, and on-premises networks.
  • Astra Control Service is a fully managed application-aware data management service. Astra Control Service helps you manage, protect, and move data-rich Kubernetes workloads in public clouds and on-premises environments. This service provides data protection, disaster recovery, and migration for Kubernetes workloads. Astra Control Service uses the industry-leading data management technology of Azure NetApp Files for snapshots, backups, cross-region replication, and cloning.

Alternatives

You can use a custom multi-pronged approach to separately back up or replicate persistent volumes, Kubernetes resources, and other configuration state resources that you need when you restore an application. But this approach can be:

  • Cumbersome.
  • Difficult to make compatible with all apps.
  • Difficult to scale across the multiple apps and environments that a typical enterprise has.

In certain environments, you can reduce costs by avoiding cross-peered virtual network traffic. To eliminate this traffic, place the AKS clusters and the subnet that you delegate for Azure NetApp Files in the same virtual network, as this diagram illustrates:

Architecture diagram that shows how to use AKS with Astra Control Service in a single virtual network.

Download a Visio file of this architecture.

Scenario details

With containerized applications, it can be challenging to protect application data. An application often consists of multiple microservices, which must be managed as one entity. When you deploy business-critical workloads on Kubernetes, application data management should be:

  • Simple. Establishing data protection policies and on-demand snapshots and backups should be intuitive. These policies shouldn't be dependent on the details of the underlying infrastructure.
  • Portable. To make cross-region mobility possible for applications, multiple Kubernetes clusters should be able to consume the backups.
  • Application-aware. Your solution should protect the entire application, including standard Kubernetes resources like secrets, ConfigMap objects, and persistent volumes. You also need to protect custom Kubernetes resources. When possible, procedures should quiesce the application prior to the snapshot and backup. This practice prevents the loss of in-flight data during backups.
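The quiesce-then-snapshot ordering described in the last item can be sketched in Python with a context manager. The function names and the event log are hypothetical. A real implementation would call application-specific hooks, which this article doesn't describe.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(app, log):
    """Pause writes for the duration of the block, then always resume."""
    log.append(f"quiesce {app}")
    try:
        yield
    finally:
        # Resume even if the snapshot fails, so the app isn't left paused.
        log.append(f"resume {app}")


def protect(app, take_snapshot, log):
    """Take a snapshot while the application is quiesced."""
    with quiesced(app, log):
        take_snapshot(app)


events = []
protect("mysql", lambda app: events.append(f"snapshot {app}"), events)
```

The finally clause is the important design choice here: the application resumes even when the snapshot step raises an error.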

NetApp Astra Control Service is a solution for performing stateful application data management that helps you meet these goals. Astra Control Service offers data protection, disaster recovery, and application mobility capabilities. It provides stateful AKS workloads with a rich set of storage and application-aware data management services. The data protection technology of Azure NetApp Files underlies these services.

Potential use cases

This solution applies to systems that run stateful applications:

  • Continuous integration (CI) systems such as Jenkins
  • Database workloads like MySQL, MongoDB, and PostgreSQL
  • AI and machine-learning components such as TensorFlow and PyTorch
  • Elasticsearch deployments
  • Kafka applications
  • Source code management platforms like GitLab

Considerations

These considerations implement the pillars of the Azure Well-Architected Framework, which is a set of guiding tenets that can be used to improve the quality of a workload. For more information, see Microsoft Azure Well-Architected Framework.

Reliability

Reliability ensures your application can meet the commitments you make to your customers. For more information, see Overview of the reliability pillar.

When you deploy an AKS cluster, you deploy it in a single region. To protect application workloads, it's best to deploy the workloads across multiple AKS clusters that span multiple regions. Factors that affect deployment include AKS region availability and Azure paired regions. When you deploy clusters across multiple availability zones, you distribute nodes across multiple zones within a single region. This distribution of AKS cluster resources improves cluster availability because the clusters are resilient to the failure of a specific zone.

Azure NetApp Files is highly available by design. It's built on a highly available bare-metal fleet of all-flash storage systems. For this service's availability guarantee, see SLA for Azure NetApp Files.

Azure NetApp Files supports cross-region replication for disaster recovery. You can replicate volumes between Azure region pairs continuously. For more information about cross-region replication, see these resources:

Cost optimization

Cost optimization is about looking at ways to reduce unnecessary expenses and improve operational efficiencies. For more information, see Overview of the cost optimization pillar.

Use the Azure Pricing calculator to estimate the cost of the following components:

  • AKS
  • Azure NetApp Files
  • Virtual Network

For Astra Control Service pricing plans, see Pricing. By adopting Astra Control Service, you can focus on your application instead of spending time and resources building custom solutions that don't scale. Astra Control Service is available on Azure Marketplace.

To run detailed bandwidth and pricing calculations, use the Azure NetApp Files Performance Calculator. Basic and advanced calculators are available.

Operational excellence

Operational excellence covers the operations processes that deploy an application and keep it running in production. For more information, see Overview of the operational excellence pillar.

When you work with the Kubernetes control plane, it's important to monitor your infrastructure and platform layer. Astra Control Service provides a unified control plane that you can use to define and manage application protection policies across multiple AKS clusters. A dashboard provides a way for you to continuously handle workloads across regions. Astra Trident also provides a rich set of Prometheus metrics that you can use to monitor provisioned storage.
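As a rough illustration of consuming Prometheus metrics, the following Python sketch parses a few lines of the Prometheus text format and totals one metric. The metric names and values are invented placeholders, not confirmed Astra Trident metric names. The parser is deliberately minimal and assumes no timestamps and no commas or spaces inside label values.

```python
def parse_samples(text):
    """Parse simple Prometheus text-format lines into (name, labels, value)."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        metric, _, value = line.rpartition(" ")
        name, _, label_part = metric.partition("{")
        labels = {}
        if label_part:
            for pair in label_part.rstrip("}").split(","):
                if pair:
                    key, _, val = pair.partition("=")
                    labels[key.strip()] = val.strip().strip('"')
        samples.append((name, labels, float(value)))
    return samples


# Placeholder exposition text; names and numbers are made up for this sketch.
SAMPLE = """
# HELP example_volume_allocated_bytes Illustrative metric name only
example_volume_allocated_bytes{storage_class="premium"} 2.0e+11
example_volume_allocated_bytes{storage_class="ultra"} 1.0e+11
example_volume_count 2
"""

samples = parse_samples(SAMPLE)
total_allocated = sum(
    value for name, _, value in samples
    if name == "example_volume_allocated_bytes"
)
```

In practice, a Prometheus server scrapes and aggregates these metrics for you. The sketch only shows the shape of the data that a monitoring pipeline works with.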

Performance efficiency

Performance efficiency is the ability of your workload to scale to meet the demands placed on it by users in an efficient manner. For more information, see Performance efficiency pillar overview.

AKS clusters can add extra worker nodes to increase scalability. To scale your solution, you can add node pools or scale existing node pools. These steps increase the number of nodes in your cluster, the total number of cores, and the memory that's available for your containerized applications.

In each virtual network, you can delegate only one subnet for Azure NetApp Files.

When you use a basic configuration for Azure NetApp Files network features, there's a limit of 1,000 IP addresses per virtual network. The standard network features configuration doesn't limit the number of IP addresses. For more information, see Configurable network features. For a complete list of resource limits for Azure NetApp Files, see Resource limits for Azure NetApp Files.

Azure NetApp Files offers multiple performance tiers. When you use Astra Control Service to discover AKS clusters, the onboarding process creates curated StorageClass objects that map to the Standard, Premium, and Ultra service tiers. When users deploy applications, they choose a storage tier that suits their requirements. Multiple capacity pools can coexist. Provisioned volumes have a performance guarantee that corresponds to the service tier. For a list of service levels that Azure NetApp Files supports, see Service levels for Azure NetApp Files.
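The following Python sketch shows one way to reason about tier selection: it computes the throughput limit for a volume and builds a PersistentVolumeClaim manifest that requests the matching StorageClass. It assumes capacity pools with automatic QoS and the published per-TiB throughput figures of 16, 64, and 128 MiB/s for Standard, Premium, and Ultra. Confirm these figures in Service levels for Azure NetApp Files. The StorageClass names are hypothetical placeholders. Use the names that the onboarding process creates in your cluster.

```python
# Throughput per provisioned TiB for each service level, in MiB/s
# (assumed from the published service-level figures; verify before use).
THROUGHPUT_MIBPS_PER_TIB = {"Standard": 16, "Premium": 64, "Ultra": 128}

# Hypothetical StorageClass names; replace with the names in your cluster.
STORAGE_CLASS_FOR_LEVEL = {
    "Standard": "anf-standard",
    "Premium": "anf-premium",
    "Ultra": "anf-ultra",
}


def throughput_limit(level, quota_gib):
    """Return the volume throughput limit in MiB/s for a quota in GiB."""
    return THROUGHPUT_MIBPS_PER_TIB[level] * quota_gib / 1024


def pick_service_level(quota_gib, required_mibps):
    """Return the lowest tier that meets the requirement, or None."""
    for level in ("Standard", "Premium", "Ultra"):
        if throughput_limit(level, quota_gib) >= required_mibps:
            return level
    return None


def build_pvc(name, namespace, quota_gib, level):
    """Build a PersistentVolumeClaim manifest as a plain dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": STORAGE_CLASS_FOR_LEVEL[level],
            "resources": {"requests": {"storage": f"{quota_gib}Gi"}},
        },
    }


level = pick_service_level(quota_gib=2048, required_mibps=100)
pvc = build_pvc("db-data", "databases", 2048, level)
```

For a 2-TiB volume that needs 100 MiB/s, Standard provides only 32 MiB/s, so the sketch selects Premium, which provides 128 MiB/s. When no tier meets the requirement at the requested size, the function returns None, which signals that a larger quota is needed.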

Deploy this scenario

To implement this solution, you need an Azure account. Create an account for free.

To deploy this scenario, follow these steps:

  1. Register the resource provider that makes it possible to use Azure NetApp Files.
  2. Review the Requirements for using Astra Control Service with AKS.
  3. Use the Azure portal to create a NetApp account.
  4. Set up capacity pools on the Azure NetApp Files account.
  5. Delegate a subnet for Azure NetApp Files.
  6. Create a service principal for Astra Control Service to use to discover AKS clusters and perform backup, restore, and data management operations.
  7. Register for Astra Control Service by creating a NetApp Cloud Central account.
  8. Add AKS clusters to Astra Control Service to start managing applications.
  9. Detect applications in Astra Control Service. The way you discover and manage applications depends on the way you deploy and identify them. Typical identification strategies include grouping application objects in a dedicated namespace, assigning labels to objects that make up an application, and using Helm charts. Astra Control Service supports all three strategies.
  10. Establish protection policies to back up and restore applications. Before you define protection policies, clearly identify your workloads. A prerequisite is that Astra Control Service can uniquely detect each application. For more information, see Start managing apps.
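Several of the early steps can also be scripted with the Azure CLI instead of the portal. The following Python sketch only assembles the command lines for steps 1, 3, 4, 5, and 6 and doesn't run them. All resource names, the pool size, and the service level are placeholder assumptions. Check the flags against the current Azure CLI reference before you use them.

```python
def deployment_commands(resource_group, location, account, pool, vnet,
                        subnet, subnet_prefix, subscription_id, sp_name):
    """Return Azure CLI argument lists for the scripted setup steps.

    Nothing is executed here; the caller decides how to run the commands.
    """
    return [
        # Step 1: register the Azure NetApp Files resource provider.
        ["az", "provider", "register", "--namespace", "Microsoft.NetApp"],
        # Step 3: create the NetApp account (CLI alternative to the portal).
        ["az", "netappfiles", "account", "create",
         "--resource-group", resource_group,
         "--name", account, "--location", location],
        # Step 4: create a capacity pool; size (TiB) and level are examples.
        ["az", "netappfiles", "pool", "create",
         "--resource-group", resource_group, "--account-name", account,
         "--name", pool, "--location", location,
         "--size", "4", "--service-level", "Premium"],
        # Step 5: delegate a subnet to Azure NetApp Files.
        ["az", "network", "vnet", "subnet", "create",
         "--resource-group", resource_group, "--vnet-name", vnet,
         "--name", subnet, "--address-prefixes", subnet_prefix,
         "--delegations", "Microsoft.Netapp/volumes"],
        # Step 6: create a service principal with Contributor access.
        ["az", "ad", "sp", "create-for-rbac", "--name", sp_name,
         "--role", "Contributor",
         "--scopes", f"/subscriptions/{subscription_id}"],
    ]


commands = deployment_commands(
    resource_group="example-rg", location="eastus", account="example-anf",
    pool="example-pool", vnet="example-vnet", subnet="anf-subnet",
    subnet_prefix="10.0.1.0/24",
    subscription_id="00000000-0000-0000-0000-000000000000",
    sp_name="example-astra-sp",
)
```

Returning argument lists rather than running the commands keeps the sketch free of side effects and makes it possible to review each command before you pass it to a tool such as subprocess.run.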

For steps that you can take to help protect applications, see Disaster Recovery of AKS Workloads with Astra Control Service and Azure NetApp Files.

For detailed information about Astra Control Service, see Astra Control Service documentation.

Contributors

This article is maintained by Microsoft. It was originally written by the following contributor.

Principal author:

Next steps