Quickstart: Create a Kubernetes cluster with Azure Kubernetes Service using Terraform

Azure Kubernetes Service (AKS) manages your hosted Kubernetes environment. AKS allows you to deploy and manage containerized applications without container orchestration expertise. AKS also enables you to do many common maintenance operations without taking your app offline. These operations include provisioning, upgrading, and scaling resources on demand.

This article shows how to create a Kubernetes cluster with Azure Kubernetes Service (AKS) using Terraform. The sample code is fully encapsulated such that it automatically creates a service principal and SSH key pair (using the AzAPI provider).

In this article, you learn how to:

Note

This article was partially created with the help of artificial intelligence. Before publishing, an author reviewed and revised the content as needed. See Our principles for using AI-generated content in Microsoft Learn.

Prerequisites

Implement the Terraform code

Note

The sample code for this article is located in the Azure Terraform GitHub repo. You can view the log file containing the test results from current and previous versions of Terraform.

See more articles and sample code showing how to use Terraform to manage Azure resources

  1. Create a directory in which to test the sample Terraform code and make it the current directory.

  2. Create a file named providers.tf and insert the following code:

    terraform {
      required_version = ">=1.0"
    
      required_providers {
        azapi = {
          source  = "azure/azapi"
          version = "~>1.5"
        }
        azurerm = {
          source  = "hashicorp/azurerm"
          version = "~>3.0"
        }
        random = {
          source  = "hashicorp/random"
          version = "~>3.0"
        }
        time = {
          source  = "hashicorp/time"
          version = "0.9.1"
        }
      }
    }
    
    provider "azurerm" {
      features {}
    }
    
  3. Create a file named ssh.tf and insert the following code:

    resource "random_pet" "ssh_key_name" {
      prefix    = "ssh"
      separator = ""
    }
    
    resource "azapi_resource" "ssh_public_key" {
      type      = "Microsoft.Compute/sshPublicKeys@2022-11-01"
      name      = random_pet.ssh_key_name.id
      location  = "westus3"
      parent_id = azurerm_resource_group.rg.id
    }
    
    resource "azapi_resource_action" "ssh_public_key_gen" {
      type        = "Microsoft.Compute/sshPublicKeys@2022-11-01"
      resource_id = azapi_resource.ssh_public_key.id
      action      = "generateKeyPair"
      method      = "POST"
    
      response_export_values = ["publicKey"]
    }
    
    output "key_data" {
      value     = azapi_resource.ssh_public_key.body
      sensitive = true
    }
    
  4. Create a file named main.tf and insert the following code:

    # Generate random resource group name
    resource "random_pet" "rg_name" {
      prefix = var.resource_group_name_prefix
    }
    
    resource "azurerm_resource_group" "rg" {
      location = var.resource_group_location
      name     = random_pet.rg_name.id
    }
    
    resource "random_pet" "azurerm_kubernetes_cluster_name" {
      prefix = "cluster"
    }
    
    resource "random_pet" "azurerm_kubernetes_cluster_dns_prefix" {
      prefix = "dns"
    }
    
    resource "azurerm_kubernetes_cluster" "k8s" {
      location            = azurerm_resource_group.rg.location
      name                = random_pet.azurerm_kubernetes_cluster_name.id
      resource_group_name = azurerm_resource_group.rg.name
      dns_prefix          = random_pet.azurerm_kubernetes_cluster_dns_prefix.id
    
      identity {
        type = "SystemAssigned"
      }
    
      default_node_pool {
        name       = "agentpool"
        vm_size    = "Standard_D2_v2"
        node_count = var.node_count
      }
      linux_profile {
        admin_username = "ubuntu"
    
        ssh_key {
          key_data = jsondecode(azapi_resource_action.ssh_public_key_gen.output).publicKey
        }
      }
      network_profile {
        network_plugin    = "kubenet"
        load_balancer_sku = "standard"
      }
    }
    
  5. Create a file named variables.tf and insert the following code:

    variable "resource_group_location" {
      type        = string
      default     = "eastus"
      description = "Location of the resource group."
    }
    
    variable "resource_group_name_prefix" {
      type        = string
      default     = "rg"
      description = "Prefix of the resource group name that's combined with a random ID so name is unique in your Azure subscription."
    }
    
    variable "node_count" {
      type        = number
      description = "The initial quantity of nodes for the node pool."
      default     = 3
    }
    
    variable "msi_id" {
      type        = string
      description = "The Managed Service Identity ID. Set this value if you're running this example using Managed Identity as the authentication method."
      default     = null
    }
    
  6. Create a file named outputs.tf and insert the following code:

    output "resource_group_name" {
      value = azurerm_resource_group.rg.name
    }
    
    output "kubernetes_cluster_name" {
      value = azurerm_kubernetes_cluster.k8s.name
    }
    
    output "client_certificate" {
      value     = azurerm_kubernetes_cluster.k8s.kube_config[0].client_certificate
      sensitive = true
    }
    
    output "client_key" {
      value     = azurerm_kubernetes_cluster.k8s.kube_config[0].client_key
      sensitive = true
    }
    
    output "cluster_ca_certificate" {
      value     = azurerm_kubernetes_cluster.k8s.kube_config[0].cluster_ca_certificate
      sensitive = true
    }
    
    output "cluster_password" {
      value     = azurerm_kubernetes_cluster.k8s.kube_config[0].password
      sensitive = true
    }
    
    output "cluster_username" {
      value     = azurerm_kubernetes_cluster.k8s.kube_config[0].username
      sensitive = true
    }
    
    output "host" {
      value     = azurerm_kubernetes_cluster.k8s.kube_config[0].host
      sensitive = true
    }
    
    output "kube_config" {
      value     = azurerm_kubernetes_cluster.k8s.kube_config_raw
      sensitive = true
    }
    

Initialize Terraform

Run terraform init to initialize the Terraform deployment. This command downloads the Azure provider required to manage your Azure resources.

terraform init -upgrade

Key points:

  • The -upgrade parameter upgrades the necessary provider plugins to the newest version that complies with the configuration's version constraints.

Create a Terraform execution plan

Run terraform plan to create an execution plan.

terraform plan -out main.tfplan

Key points:

  • The terraform plan command creates an execution plan, but doesn't execute it. Instead, it determines what actions are necessary to create the configuration specified in your configuration files. This pattern allows you to verify whether the execution plan matches your expectations before making any changes to actual resources.
  • The optional -out parameter allows you to specify an output file for the plan. Using the -out parameter ensures that the plan you reviewed is exactly what is applied.
  • To read more about persisting execution plans and security, see the security warning section.

Apply a Terraform execution plan

Run terraform apply to apply the execution plan to your cloud infrastructure.

terraform apply main.tfplan

Key points:

  • The example terraform apply command assumes you previously ran terraform plan -out main.tfplan.
  • If you specified a different filename for the -out parameter, use that same filename in the call to terraform apply.
  • If you didn't use the -out parameter, call terraform apply without any parameters.

Verify the results

  1. Get the Azure resource group name.

    resource_group_name=$(terraform output -raw resource_group_name)
    
  2. Run az aks list to display the name of the new Kubernetes cluster.

    az aks list \
      --resource-group $resource_group_name \
      --query "[].{\"K8s cluster name\":name}" \
      --output table
    
  3. Get the Kubernetes configuration from the Terraform state and store it in a file that kubectl can read.

    echo "$(terraform output kube_config)" > ./azurek8s
    
  4. Verify the previous command didn't add an ASCII EOT character.

    cat ./azurek8s
    

    Key points:

    • If you see << EOT at the beginning and EOT at the end, remove these characters from the file. Otherwise, you could receive the following error message: error: error loading config file "./azurek8s": yaml: line 2: mapping values are not allowed in this context
  5. Set an environment variable so that kubectl picks up the correct config.

    export KUBECONFIG=./azurek8s
    
  6. Verify the health of the cluster.

    kubectl get nodes
    

    The kubectl tool allows you to verify the health of your Kubernetes cluster

Key points:

  • When the AKS cluster was created, monitoring was enabled to capture health metrics for both the cluster nodes and pods. These health metrics are available in the Azure portal. For more information on container health monitoring, see Monitor Azure Kubernetes Service health.
  • Several key values were output when you applied the Terraform execution plan. For example, the host address, AKS cluster user name, and AKS cluster password are output.

Clean up resources

Delete AKS resources

When you no longer need the resources created via Terraform, do the following steps:

  1. Run terraform plan and specify the destroy flag.

    terraform plan -destroy -out main.destroy.tfplan
    

    Key points:

    • The terraform plan command creates an execution plan, but doesn't execute it. Instead, it determines what actions are necessary to create the configuration specified in your configuration files. This pattern allows you to verify whether the execution plan matches your expectations before making any changes to actual resources.
    • The optional -out parameter allows you to specify an output file for the plan. Using the -out parameter ensures that the plan you reviewed is exactly what is applied.
    • To read more about persisting execution plans and security, see the security warning section.
  2. Run terraform apply to apply the execution plan.

    terraform apply main.destroy.tfplan
    

Delete service principal

  1. Get the service principal ID.

    sp=$(terraform output -raw sp)
    
  2. Run az ad sp delete to delete the service principal.

    az ad sp delete --id $sp
    

Troubleshoot Terraform on Azure

Troubleshoot common problems when using Terraform on Azure

Next steps