Manage cluster configuration
Note
We will retire Azure HDInsight on AKS on January 31, 2025. Before January 31, 2025, you will need to migrate your workloads to Microsoft Fabric or an equivalent Azure product to avoid abrupt termination of your workloads. The remaining clusters on your subscription will be stopped and removed from the host.
Only basic support will be available until the retirement date.
Important
This feature is currently in preview. The Supplemental Terms of Use for Microsoft Azure Previews include more legal terms that apply to Azure features that are in beta, in preview, or otherwise not yet released into general availability. For information about this specific preview, see Azure HDInsight on AKS preview information. For questions or feature suggestions, please submit a request on AskHDInsight with the details and follow us for more updates on Azure HDInsight Community.
HDInsight on AKS allows you to tweak the configuration properties to improve performance of your cluster with certain settings. For example, usage or memory settings. You can do the following actions:
- Update the existing configurations or add new configurations.
- Export the configurations using REST API.
Customize configurations
You can customize configurations using following options:
Using Azure portal
Sign in to Azure portal.
In the Azure portal search bar, type "HDInsight on AKS cluster" and select "Azure HDInsight on AKS clusters" from the drop-down list.
Select your cluster name from the list page.
Go to "Configuration management" blade in the left menu.
Depending on the cluster type, configurations files are listed. For more information, see Trino, Flink, and Spark configurations.
Add new or update the existing key-value pair for the configurations you want to modify.
Select OK and then click Save.
Note
Some configuration change may need service restart to reflect the changes.
Using ARM template
Prerequisites
- ARM template for your cluster.
- Familiarity with ARM template authoring and deployment.
In the ARM template, you can edit serviceConfigsProfiles and specify the OSS configuration file name with the value you would like to overwrite.
If the OSS configuration file is in JSON/XML/YAML format, you can provide the OSS configuration file name via fileName
. Provide the key value pairs that you want to overwrite in “values.”
Here are some samples for each workload:
Flink configuration example:
"serviceConfigsProfiles": [
{
"serviceName": "flink-operator",
"configs": [
{
"component": "flink-configs",
"files": [
{
"fileName": "flink-conf.yaml",
"values": {
"taskmanager.memory.process.size": "4096mb",
"classloader.check-leaked-classloader": "false",
"jobmanager.memory.process.size": "4096mb",
"classloader.parent-first-patterns.additional": "org.apache.parquet"
}
}
]
}
]
}
]
Spark configuration example:
"serviceConfigsProfiles": [
{
"serviceName": "spark-service",
"configs": [
{
"component": "livy-config",
"files": [
{
"fileName": "livy-client.conf",
"values": {
"livy.client.http.connection.timeout": "11s"
}
}
]
},
{
"component": "spark-config",
"files": [
{
"fileName": "spark-env.sh",
"content": "# - SPARK_HISTORY_OPTS, to set config properties only for the history server (e.g. \"-Dx=y\")\nexport HDP_VERSION=3.3.3.5.2-83515052\n"
}
]
}
]
}
]
Trino configuration example:
"serviceConfigsProfiles": [
{
"serviceName": "trino",
"configs": [
{
"component": "coordinator",
"files": [
{
"fileName": "config.properties",
"values": {
"query.cache.enabled": "true",
"query.cache.ttl": "1h",
"query.enable-multi-statement-set-session": "true",
"query.max-memory": "301GB"
}
},
{
"fileName": "log.properties",
"values": {
"io.trino": "INFO"
}
}
]
}
]
For more information about Trino configuration options, see the sample ARM templates.
Export the configurations using REST API
You can also export cluster configurations to check the default and updated values. At this time, you can only export configurations via the REST API.
Get cluster configurations:
GET Request: /subscriptions/{{USER_SUB}}/resourceGroups/{{USER_RG}}/providers/Microsoft.HDInsight/clusterpools/{{CLUSTERPOOL}}/clusters/{{CLUSTERNAME}}/serviceConfigs?api-version={{HDINSIGHTONAKS_API_VERSION}}
If you aren't familiar with how to send a REST API call, the following steps can help you.
Click the following button at the top right in the Azure portal to launch Azure Cloud Shell.
Make sure Cloud Shell is set to PowerShell on the top left. Run the following command to get token and set HTTP request headers.
$azContext = Get-AzContext $azProfile = [Microsoft.Azure.Commands.Common.Authentication.Abstractions.AzureRmProfileProvider]::Instance.Profile $profileClient = New-Object -TypeName Microsoft.Azure.Commands.ResourceManager.Common.RMProfileClient -ArgumentList ($azProfile) $token = $profileClient.AcquireAccessToken($azContext.Subscription.TenantId) $authHeader = @{ 'Content-Type'='application/json' 'Authorization'='Bearer ' + $token.AccessToken }
Set the $restUri variable to the Get request URL.
$restUri = 'https://management.azure.com/subscriptions/{{USER_SUB}}/resourceGroups/{{USER_RG}}/providers/Microsoft.HDInsight/clusterpools/{{CLUSTERPOOL}}/clusters/{{CLUSTERNAME}}/serviceConfigs?api-version={{HDINSIGHTONAKS_API_VERSION}}'
For example:
$restUri = 'https://management.azure.com/subscriptions/xxx-yyyy-zzz-00000/resourceGroups/contosoRG/providers/Microsoft.HDInsight/clusterpools/contosopool/clusters/contosocluster/serviceConfigs?api-version=2021-09-15-preview
Note
You can get the resource id and up-to-date api-version from the "JSON View" of your cluster in the Azure portal.
Send the GET request by executing following command.
Invoke-RestMethod -Uri $restUri -Method Get -Headers $authHeader | ConvertTo-Json -Depth 10