Overview of enterprise security in Azure HDInsight on AKS

Note

We will retire Azure HDInsight on AKS on January 31, 2025. Before January 31, 2025, you will need to migrate your workloads to Microsoft Fabric or an equivalent Azure product to avoid abrupt termination of your workloads. The remaining clusters on your subscription will be stopped and removed from the host.

Only basic support will be available until the retirement date.

Important

This feature is currently in preview. The Supplemental Terms of Use for Microsoft Azure Previews include more legal terms that apply to Azure features that are in beta, in preview, or otherwise not yet released into general availability. For information about this specific preview, see Azure HDInsight on AKS preview information. For questions or feature suggestions, please submit a request on AskHDInsight with the details and follow us for more updates on Azure HDInsight Community.

Azure HDInsight on AKS offers security by default, and there are several methods to address your enterprise security needs.

This article covers the overall security architecture and describes the available security solutions by dividing them into four traditional security pillars: perimeter security, authentication, authorization, and encryption.

Security architecture

Enterprise readiness for any software requires stringent security checks to prevent and address threats that may arise. HDInsight on AKS provides a multi-layered security model to protect your workloads. The security architecture uses managed identities (MSI) for authorization. All storage access is through MSI, and database access is through a username and password. The password is stored in an Azure Key Vault that the customer defines. This design makes the setup robust and secure by default.

The following diagram illustrates a high-level technical architecture of security in HDInsight on AKS.

Screenshot showing the security flow of authenticating a cluster.

Enterprise security pillars

One way of looking at enterprise security is to divide security solutions into four main groups based on the type of control. These groups are also called security pillars and are of the following types: perimeter security, authentication, authorization, and encryption.

Perimeter security

Perimeter security in HDInsight on AKS is achieved through virtual networks. An enterprise admin can create a cluster inside a virtual network (VNET) and use network security groups (NSG) to restrict access to the virtual network.
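
As a rough illustration of how NSG rules can restrict access, the following sketch models priority-ordered rule evaluation: rules are checked from the lowest priority number upward, the first match decides, and unmatched inbound traffic is denied. The `Rule` type, rule names, and address ranges are hypothetical, and the model is deliberately simplified. Real NSG rules also match on protocol, direction, and destination.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    """Simplified, hypothetical stand-in for one inbound NSG rule."""
    name: str
    priority: int   # lower number = evaluated first
    source: str     # CIDR range the rule applies to
    port: int
    allow: bool


def is_allowed(rules, source_ip, port):
    """Return True if the first matching rule (by priority) allows the traffic."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if ip_address(source_ip) in ip_network(rule.source) and port == rule.port:
            return rule.allow
    return False  # nothing matched: deny by default


# Hypothetical rules: only the corporate range may reach the cluster over HTTPS.
RULES = [
    Rule("allow-corp-https", 100, "10.1.0.0/16", 443, True),
    Rule("deny-all-https", 200, "0.0.0.0/0", 443, False),
]

print(is_allowed(RULES, "10.1.4.7", 443))     # True
print(is_allowed(RULES, "203.0.113.9", 443))  # False
```

Ordering the allow rule ahead of the broad deny rule is what lets the corporate range through while everything else is blocked.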

Authentication

HDInsight on AKS provides Microsoft Entra ID-based authentication for cluster login and uses managed identities (MSI) to secure cluster access to files in Azure Data Lake Storage Gen2. Managed identity is a feature of Microsoft Entra ID that provides Azure services with a set of automatically managed credentials. With this setup, enterprise employees can sign in to the cluster nodes by using their domain credentials. A managed identity from Microsoft Entra ID allows your app to easily access other Microsoft Entra protected resources such as Azure Key Vault, Storage, SQL Server, and Database. The identity is managed by the Azure platform and doesn't require you to provision or rotate any secrets. This solution is key to securing access to your HDInsight on AKS cluster and other dependent resources. Managed identities make your app more secure by eliminating secrets from your app, such as credentials in connection strings.

As part of the cluster creation process, you create a user-assigned managed identity, a standalone Azure resource that manages access to your dependent resources.
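
To make the "no secrets in the app" point concrete, here is a minimal sketch of reading from Azure Data Lake Storage Gen2 with a user-assigned managed identity. It assumes the `azure-identity` and `azure-storage-file-datalake` packages are installed and that the code runs in an environment where the identity is available. The account name, container name, and client ID are placeholders, and the helper function names are hypothetical.

```python
def adls_account_url(account_name: str) -> str:
    """Build the Data Lake Storage Gen2 (dfs) endpoint for a storage account."""
    return f"https://{account_name}.dfs.core.windows.net"


def list_top_level_paths(account_name: str, container: str, identity_client_id: str):
    """List top-level paths in a container, authenticating with a managed identity."""
    # Imported lazily so the URL helper above works without the Azure SDK installed.
    from azure.identity import ManagedIdentityCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    # No password or connection string: the platform issues tokens for the identity.
    credential = ManagedIdentityCredential(client_id=identity_client_id)
    service = DataLakeServiceClient(adls_account_url(account_name), credential=credential)
    file_system = service.get_file_system_client(container)
    return [path.name for path in file_system.get_paths(recursive=False)]


print(adls_account_url("contosolake"))  # https://contosolake.dfs.core.windows.net
```

The call succeeds only if the identity has been granted a suitable role or ACL entry on the storage account, so access stays under the control of the resource owner.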

Authorization

A best practice most enterprises follow is making sure that not every employee has full access to all enterprise resources. Likewise, the admin can define role-based access control policies for the cluster resources.

The resource owners can configure role-based access control (RBAC). Configuring RBAC policies allows you to associate permissions with a role in the organization. This layer of abstraction makes it easier to ensure people have only the permissions needed to perform their work responsibilities. Authorization for cluster management (control plane) is managed by ARM roles, and authorization for cluster data access (data plane) is managed by cluster access management.

Cluster management roles (Control Plane / ARM Roles)

Action | HDInsight on AKS Cluster Pool Admin | HDInsight on AKS Cluster Admin
Create / Delete cluster pool
Assign permission and roles on the cluster pool
Create/delete cluster
Manage Cluster
Configuration Management
Script actions
Library Management
Monitoring
Scaling actions

The above roles are from the ARM operations perspective. For more information, see Grant a user access to Azure resources using the Azure portal - Azure RBAC.

Cluster access (Data Plane)

You can allow users, service principals, and managed identities to access the cluster through the portal or by using ARM.

This access enables you to:

  • View clusters and manage jobs.
  • Perform all the monitoring and management operations.
  • Perform auto scale operations and update the node count.

This access doesn't provide:

  • Cluster deletion
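
The two lists above can be read as an allow set and a deny set. The following sketch models that check. The action names are hypothetical labels chosen to mirror the lists, not real API identifiers, and the real service enforces this on its own side.

```python
# Hypothetical action labels mirroring the lists above; not real API identifiers.
DATA_PLANE_ALLOWED = frozenset({
    "view_cluster",
    "manage_jobs",
    "monitor",
    "manage",
    "autoscale",
    "update_node_count",
})
DATA_PLANE_DENIED = frozenset({"delete_cluster"})


def can_perform(action: str) -> bool:
    """Return True if cluster (data plane) access covers the given action."""
    if action in DATA_PLANE_DENIED:
        return False
    # Anything not explicitly listed is treated as not granted.
    return action in DATA_PLANE_ALLOWED


print(can_perform("manage_jobs"))     # True
print(can_perform("delete_cluster"))  # False
```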

Screenshot showing the cluster data access.

Important

Any newly added user requires the additional role of “Azure Kubernetes Service RBAC Reader” to view the service health.

Auditing

Auditing cluster resource access is necessary to track unauthorized or unintentional access to the resources. It's as important as protecting the cluster resources from unauthorized access.

The resource group admin can view and report all access to the HDInsight on AKS cluster resources and data by using the activity log. The admin can also view and report changes to the access control policies.
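
As one way to report on access control changes, the sketch below filters a list of activity log records for role assignment operations and counts them per caller. The record layout (`caller`, `operation`, `status`) is a simplified assumption for illustration. Real activity log entries carry many more fields and may name them differently depending on how the log is exported.

```python
from collections import Counter

# Hypothetical, simplified activity log records.
SAMPLE_EVENTS = [
    {"caller": "alice@contoso.com",
     "operation": "Microsoft.Authorization/roleAssignments/write",
     "status": "Succeeded"},
    {"caller": "bob@contoso.com",
     "operation": "Microsoft.Resources/deployments/write",
     "status": "Succeeded"},
    {"caller": "alice@contoso.com",
     "operation": "Microsoft.Authorization/roleAssignments/delete",
     "status": "Succeeded"},
    {"caller": "carol@contoso.com",
     "operation": "Microsoft.Authorization/roleAssignments/write",
     "status": "Failed"},
]


def access_policy_changes(events):
    """Return the events that create or delete role assignments."""
    prefix = "microsoft.authorization/roleassignments/"
    return [e for e in events if e["operation"].lower().startswith(prefix)]


def changes_per_caller(events):
    """Count role assignment changes (attempted or successful) per caller."""
    return Counter(e["caller"] for e in access_policy_changes(events))


print(changes_per_caller(SAMPLE_EVENTS))
```

The operation name is compared case-insensitively because some export formats upper-case it.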

Encryption

Protecting data is important for meeting organizational security and compliance requirements. Along with restricting unauthorized employees' access to data, you should encrypt it. The storage and the disks (OS disk and persistent data disk) used by the cluster nodes and containers are encrypted. Data in Azure Storage is encrypted and decrypted transparently using 256-bit AES encryption, one of the strongest block ciphers available, and is FIPS 140-2 compliant. Azure Storage encryption is enabled for all storage accounts, which makes data secure by default. You don't need to modify your code or applications to take advantage of Azure Storage encryption. Encryption of data in transit is handled with TLS 1.2.
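
Encryption in transit is handled by the platform, but a client can also refuse anything older than TLS 1.2 on its side. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and nothing here is specific to HDInsight on AKS.

```python
import ssl


def tls12_client_context() -> ssl.SSLContext:
    """Create a client TLS context that rejects protocol versions below TLS 1.2."""
    # The default context already verifies certificates and host names.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = tls12_client_context()
print(context.minimum_version.name)  # TLSv1_2
```

A context like this can be passed to standard library HTTP clients, for example through the `context` argument of `urllib.request.urlopen`.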

Compliance

Azure compliance offerings are based on various types of assurances, including formal certifications, attestations, validations, and authorizations. They also include assessments produced by independent third-party auditing firms, as well as contractual amendments, self-assessments, and customer guidance documents produced by Microsoft. For HDInsight on AKS compliance information, see the Microsoft Trust Center and the Overview of Microsoft Azure compliance.

Shared responsibility model

The following image summarizes the major system security areas and the security solutions that are available to you. It also highlights which security areas are your responsibility as a customer and which are the responsibility of HDInsight on AKS as the service provider.

Screenshot showing the shared responsibility model.

The following table provides links to resources for each type of security solution.

Security area | Solutions available | Responsible party
Data access security | Configure access control lists (ACLs) for Azure Data Lake Storage Gen2 | Customer
 | Enable the Secure transfer required property on storage | Customer
 | Configure Azure Storage firewalls and virtual networks | Customer
Operating system security | Create clusters with most recent HDInsight on AKS versions | Customer
Network security | Configure a virtual network |
 | Configure Traffic using Firewall rules | Customer
 | Configure Outbound traffic required | Customer