Operate Azure Arc-enabled data services with least privileges

Operating Arc-enabled data services with least privileges is a security best practice. Grant users and service accounts only the specific permissions they need to perform their tasks. Both Azure and Kubernetes provide a role-based access control model that can be used to grant these specific permissions. This article describes common scenarios in which the principle of least privilege should be applied.

Note

This article uses the namespace name arc. If you choose a different name, use it consistently throughout. The examples use the kubectl CLI utility, but any tool or system that uses the Kubernetes API will work.

Deploy the Azure Arc data controller

Deploying the Azure Arc data controller requires some permissions that can be considered high privilege, such as creating a Kubernetes namespace or creating a cluster role. The following steps separate the deployment of the data controller into multiple steps, each of which can be performed by a user or a service account that has the required permissions. This separation of duties ensures that each user or service account in the process has just the permissions required and nothing more.

Deploy a namespace in which the data controller will be created

This step will create a new, dedicated Kubernetes namespace into which the Arc data controller will be deployed. It is essential to perform this step first, because the following steps will use this new namespace as a scope for the permissions that are being granted.

Permissions required to perform this action:

  • Namespace
    • Create
    • Edit (if required for OpenShift clusters)

Run a command similar to the following to create a new, dedicated namespace in which the data controller will be created.

kubectl create namespace arc

If you are using OpenShift, you will need to edit the openshift.io/sa.scc.supplemental-groups and openshift.io/sa.scc.uid-range annotations on the namespace using kubectl edit namespace <name of namespace>. Change these existing annotations to match these specific UID and fsGroup IDs/ranges.

openshift.io/sa.scc.supplemental-groups: 1000700001/10000
openshift.io/sa.scc.uid-range: 1000700001/10000
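If you prefer not to open an editor, the same two annotations can be set non-interactively with kubectl annotate. The following is a minimal sketch rather than the documented procedure: the function name annotate_arc_namespace is hypothetical, and the range defaults to the value shown above.

```shell
# Hypothetical helper: set both OpenShift SCC annotations on a namespace.
# --overwrite replaces the values that OpenShift assigned at creation time.
annotate_arc_namespace() {
  namespace=$1
  range=${2:-1000700001/10000}   # default taken from the values shown above
  kubectl annotate namespace "$namespace" --overwrite \
    "openshift.io/sa.scc.supplemental-groups=$range" \
    "openshift.io/sa.scc.uid-range=$range"
}

# Usage: annotate_arc_namespace arc
```

The caller still needs permission to edit the namespace, as listed in the permissions for this step.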

Assign permissions to the deploying service account and users/groups

This step creates a service account and assigns it the roles and cluster roles that a job needs to deploy the Arc data controller with the least privileges required.

Permissions required to perform this action:

  • Service account
    • Create
  • Role
    • Create
  • Role binding
    • Create
  • Cluster role
    • Create
  • Cluster role binding
    • Create
  • All the permissions being granted to the service account (see the arcdata-deployer.yaml below for details)
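Before applying the file, the person performing this step may want to confirm that they hold the create permissions listed above. One possible way is kubectl auth can-i, sketched below. The function name check_deployer_setup_access is hypothetical, and the sketch checks only the five create permissions. It does not verify the last item in the list (holding every permission that is being granted to the service account).

```shell
# Hypothetical helper: report which of the listed create permissions the
# current user is missing. Returns 0 if none are missing, 1 otherwise.
check_deployer_setup_access() {
  namespace=$1
  missing=0
  # Namespaced resources are checked within the target namespace.
  for resource in serviceaccounts roles rolebindings; do
    if ! kubectl auth can-i create "$resource" --namespace "$namespace" --quiet; then
      echo "missing: create $resource in $namespace"
      missing=1
    fi
  done
  # Cluster-scoped resources are checked without a namespace.
  for resource in clusterroles clusterrolebindings; do
    if ! kubectl auth can-i create "$resource" --quiet; then
      echo "missing: create $resource (cluster scope)"
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check_deployer_setup_access arc
```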

Save a copy of arcdata-deployer.yaml, and replace the placeholder {{NAMESPACE}} in the file with the namespace created in the previous step, for example: arc. Run the following command to create the deployer service account with the edited file.

kubectl apply --namespace arc -f arcdata-deployer.yaml
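The placeholder replacement can be done in any text editor. As an illustration only, the sketch below does it with sed. The function name fill_placeholder and the output file name are hypothetical, and the sketch assumes the replacement value contains no |, &, or backslash characters.

```shell
# Hypothetical helper: replace every {{NAME}} token in a template file.
# Arguments: placeholder name, replacement value, input file, output file.
fill_placeholder() {
  name=$1
  value=$2
  input=$3
  output=$4
  # '|' is used as the sed delimiter so that values may contain '/'.
  sed "s|{{${name}}}|${value}|g" "$input" > "$output"
}

# Usage (writes an edited copy and leaves the downloaded file untouched):
#   fill_placeholder NAMESPACE arc arcdata-deployer.yaml arcdata-deployer-arc.yaml
```

If you write the result to a separate file as in the usage comment, pass that file name to kubectl apply instead.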

Grant permissions to users to create the bootstrapper job and data controller

Permissions required to perform this action:

  • Role
    • Create
  • Role binding
    • Create

Save a copy of arcdata-installer.yaml, and replace the placeholder {{INSTALLER_USERNAME}} in the file with the name of the user to grant the permissions to, for example: john@contoso.com. Add additional role binding subjects such as other users or groups as needed. Run the following command to create the installer permissions with the edited file.

kubectl apply --namespace arc -f arcdata-installer.yaml

Deploy the bootstrapper job

Permissions required to perform this action:

  • User that is assigned to the arcdata-installer-role role in the previous step

Run the following command to create the bootstrapper job that will run preparatory steps to deploy the data controller.

kubectl apply --namespace arc -f https://raw.githubusercontent.com/microsoft/azure_arc/main/arc_data_services/deploy/yaml/bootstrapper.yaml

Create the Arc data controller

Now you are ready to create the data controller itself.

First, create a copy of the template file locally on your computer so that you can modify some of the settings.

Create the metrics and logs dashboards user names and passwords

At the top of the file, you can specify a user name and password that is used to authenticate to the metrics and logs dashboards as an administrator. Choose a secure password and share it only with those who need these privileges.

A Kubernetes secret stores each value as a base64-encoded string. Encode the user name and the password separately, and use the following command to encode each one.

echo -n '<your string to encode here>' | base64
# echo -n 'example' | base64
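The sketch below wraps the same encoding in a small function so that the user name and password are handled identically. The function name encode_secret_value is hypothetical. It uses printf instead of echo -n because not every shell's echo honors -n, and it strips any line wrapping that some base64 implementations add to long output.

```shell
# Hypothetical helper: base64-encode one value for the data: section of a Secret.
# printf '%s' emits the value without a trailing newline.
encode_secret_value() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

encode_secret_value 'example'; echo   # prints ZXhhbXBsZQ==
```

A trailing newline changes the result: encoding 'example' followed by a newline yields ZXhhbXBsZQo=, which decodes to a different credential than the one you intended.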

Optionally, you can create SSL/TLS certificates for the logs and metrics dashboards. Follow the instructions at Specify SSL/TLS certificates during Kubernetes native tools deployment.

Edit the data controller configuration

Edit the data controller configuration as needed:

Required

  • location: Change this to be the Azure location where the metadata about the data controller will be stored. Review the list of available regions.
  • logsui-certificate-secret: The name of the secret created on the Kubernetes cluster for the logs UI certificate.
  • metricsui-certificate-secret: The name of the secret created on the Kubernetes cluster for the metrics UI certificate.

Review these values, and update for your deployment:

  • storage.data.className and storage.logs.className: The storage class to use for the data controller data and log files. If you are unsure of the available storage classes in your Kubernetes cluster, run kubectl get storageclass. The default value is default, which assumes that a storage class literally named default exists; it does not select whichever storage class the cluster marks as its default. Note: Set both className settings, one for data and one for logs, to the desired storage class.

  • serviceType: Change the service type to NodePort if you are not using a LoadBalancer.

  • security: For Azure Red Hat OpenShift or Red Hat OpenShift Container Platform, replace the security: settings in the data controller YAML file with the following values.

    security:
      allowDumps: false
      allowNodeMetricsCollection: false
      allowPodMetricsCollection: false
    

Optional

The following settings are optional.

  • name: The default name of the data controller is arc, but you can change it if you want.
  • displayName: Set this to the same value as the name attribute at the top of the file.
  • registry: The Microsoft Container Registry is the default. If you are pulling the images from the Microsoft Container Registry and pushing them to a private container registry, enter the IP address or DNS name of your registry here.
  • dockerRegistry: The secret to use to pull the images from a private container registry if required.
  • repository: The default repository on the Microsoft Container Registry is arcdata. If you are using a private container registry, enter the path to the folder/repository containing the Azure Arc-enabled data services container images.
  • imageTag: The template defaults to the current latest version tag, but you can change it if you want to use an older version.
  • logsui-certificate-secret: The name of the secret created on the Kubernetes cluster for the logs UI certificate.
  • metricsui-certificate-secret: The name of the secret created on the Kubernetes cluster for the metrics UI certificate.

The following example shows a completed data controller yaml.

apiVersion: v1
data:
  password: <your base64 encoded password>
  username: <your base64 encoded username>
kind: Secret
metadata:
  name: metricsui-admin-secret
type: Opaque

---

apiVersion: v1
data:
  password: <your base64 encoded password>
  username: <your base64 encoded username>
kind: Secret
metadata:
  name: logsui-admin-secret
type: Opaque

---

apiVersion: arcdata.microsoft.com/v5
kind: DataController
metadata:
  name: arc-dc
spec:
  credentials:
    dockerRegistry: arc-private-registry # Create a registry secret named 'arc-private-registry' if you are going to pull from a private registry instead of MCR.
    serviceAccount: sa-arc-controller
  docker:
    imagePullPolicy: Always
    imageTag: v1.29.0_2024-04-09
    registry: mcr.microsoft.com
    repository: arcdata
  infrastructure: other # Must be a value in the array [alibaba, aws, azure, gcp, onpremises, other]
  security:
    allowDumps: true # Set this to false if deploying on OpenShift
    allowNodeMetricsCollection: true # Set this to false if deploying on OpenShift
    allowPodMetricsCollection: true # Set this to false if deploying on OpenShift
  services:
  - name: controller
    port: 30080
    serviceType: LoadBalancer # Modify serviceType based on your Kubernetes environment
  settings:
    ElasticSearch:
      vm.max_map_count: "-1"
    azure:
      connectionMode: indirect # Only indirect is supported for Kubernetes-native deployment for now.
      location: eastus # Choose a different Azure location if you want
      resourceGroup: <your resource group>
      subscription: <your subscription GUID>
    controller:
      displayName: arc-dc
      enableBilling: true
      logs.rotation.days: "7"
      logs.rotation.size: "5000"
  storage:
    data:
      accessMode: ReadWriteOnce
      className: default # Use default configured storage class or modify storage class based on your Kubernetes environment
      size: 15Gi
    logs:
      accessMode: ReadWriteOnce
      className: default # Use default configured storage class or modify storage class based on your Kubernetes environment
      size: 10Gi

Save the edited file on your local computer and run the following command to create the data controller:

kubectl create --namespace arc -f <path to your data controller file>

#Example
kubectl create --namespace arc -f data-controller.yaml

Monitoring the creation status

Creating the controller will take a few minutes to complete. You can monitor the progress in another terminal window with the following commands:

kubectl get datacontroller --namespace arc
kubectl get pods --namespace arc
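Instead of rerunning these commands by hand, you could poll until the data controller reports that it is ready. The following is a minimal sketch under two assumptions that this article does not confirm: that the DataController resource exposes its state at .status.state, and that the finished state is reported as Ready. The function name wait_for_datacontroller is hypothetical, and arc-dc is the name used in the example YAML above.

```shell
# Hypothetical helper: poll a data controller until its state is Ready.
# Arguments: name, namespace, max attempts (default 60), delay in seconds (default 10).
# Assumption: the resource reports its state in .status.state as "Ready".
wait_for_datacontroller() {
  name=$1
  namespace=$2
  attempts=${3:-60}
  delay=${4:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    state=$(kubectl get datacontroller "$name" --namespace "$namespace" \
      -o jsonpath='{.status.state}' 2>/dev/null)
    echo "state: ${state:-unknown}"
    if [ "$state" = "Ready" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1   # gave up before the controller became ready
}

# Usage: wait_for_datacontroller arc-dc arc
```

If the function gives up, inspect the individual pods to see which one is blocking the deployment.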

You can also check the creation status or logs of any particular pod by running commands like the following. This is especially useful for troubleshooting.

kubectl describe pod/<pod name> --namespace arc
kubectl logs <pod name> --namespace arc

#Example:
#kubectl describe pod/control-2g7bl --namespace arc
#kubectl logs control-2g7bl --namespace arc

You have several additional options for creating the Azure Arc data controller:

Just want to try things out? Get started quickly with Azure Arc Jumpstart on AKS, Amazon EKS, or GKE, or in an Azure VM.