Use Azure Monitor logs to monitor HDInsight clusters

Learn how to enable Azure Monitor logs to monitor Hadoop cluster operations in HDInsight. And how to add a HDInsight monitoring solution.

Azure Monitor logs is an Azure Monitor service that monitors your cloud and on-premises environments. The monitoring is to maintain their availability and performance. It collects data generated by resources in your cloud, on-premises environments and from other monitoring tools. The data is used to provide analysis across multiple sources.

Note

This article was recently updated to use the term Azure Monitor logs instead of Log Analytics. Log data is still stored in a Log Analytics workspace and is still collected and analyzed by the same Log Analytics service. We are updating the terminology to better reflect the role of logs in Azure Monitor. See Azure Monitor terminology changes for details.

If you don't have an Azure subscription, create a free account before you begin.

Important

Azure Monitor experience (preview) in HDInsight is retiring by February 1, 2025. For more information, see Retirement: Azure Monitor experience (preview) in HDInsight is retiring by February 1, 2025.

Prerequisites

  • A Log Analytics workspace. You can think of this workspace as a unique Azure Monitor logs environment with its own data repository, data sources, and solutions. For the instructions, see Create a Log Analytics workspace.

  • An Azure HDInsight cluster. Currently, you can use Azure Monitor logs with the following HDInsight cluster types:

    • Hadoop
    • HBase
    • Interactive Query
    • Kafka
    • Spark

    For the instructions on how to create a HDInsight cluster, see Get started with Azure HDInsight.

  • If using PowerShell, you need the Az Module. Ensure you have the latest version. If necessary, run Update-Module -Name Az.

  • If wanting to use Azure CLI and you haven't yet installed it, see Install the Azure CLI.

Note

New Azure Monitor experience is only available in all the regions as a preview feature. It is recommended to place both the HDInsight cluster and the Log Analytics workspace in the same region for better performance.

Enable Azure Monitor using the portal

In this section, you configure an existing HDInsight Hadoop cluster to use an Azure Log Analytics workspace to monitor jobs, debug logs, and so on.

  1. From the Azure portal, select your cluster. The cluster is opened in a new portal page.

  2. From the left, under Monitoring, select Monitor Integration.

  3. From the main view, under Azure Monitor for HDInsight Clusters Integration, select Enable.

  4. From the Select a workspace drop-down list, select an existing Log Analytics workspace.

  5. Select Save. It takes a few moments to save the setting.

    Enable monitoring for HDInsight clusters.

If you want to disable Azure Monitor, you can do the same in this portal.

Enable Azure Monitor using Azure PowerShell

You can enable Azure Monitor logs using the Azure PowerShell Az module Enable-AzHDInsightAzureMonitor cmdlet.

# Enter user information
$resourceGroup = "<your-resource-group>"
$cluster = "<your-cluster>"
$LAW = "<your-Log-Analytics-workspace>"
# End of user input

# obtain workspace id for defined Log Analytics workspace
$WorkspaceId = (Get-AzOperationalInsightsWorkspace `
                    -ResourceGroupName $resourceGroup `
                    -Name $LAW).CustomerId

# obtain primary key for defined Log Analytics workspace
$PrimaryKey = (Get-AzOperationalInsightsWorkspace `
                    -ResourceGroupName $resourceGroup `
                    -Name $LAW | Get-AzOperationalInsightsWorkspaceSharedKeys).PrimarySharedKey

# Enables monitoring and relevant logs will be sent to the specified workspace.
Enable-AzHDInsightAzureMonitor `
    -ResourceGroupName $resourceGroup `
    -ClusterName $cluster `
    -WorkspaceId $WorkspaceId `
    -PrimaryKey $PrimaryKey

# Gets the status of monitoring installation on the cluster.
Get-AzHDInsightAzureMonitor `
    -ResourceGroupName $resourceGroup `
    -ClusterName $cluster

To disable, the use the Disable-AzHDInsightAzureMonitor cmdlet:

Disable-AzHDInsightAzureMonitor -ResourceGroupName $resourceGroup `
-ClusterName $cluster

Enable Azure Monitor using Azure CLI

You can enable Azure Monitor logs using the Azure CLI az hdinsight azure-monitor enable command.

# set variables
export resourceGroup=RESOURCEGROUPNAME
export cluster=CLUSTERNAME
export LAW=LOGANALYTICSWORKSPACENAME

# Enable the Azure Monitor logs integration on an HDInsight cluster.
az hdinsight azure-monitor enable --name $cluster --resource-group $resourceGroup --workspace $LAW

# Get the status of Azure Monitor logs integration on an HDInsight cluster.
az hdinsight azure-monitor show --name $cluster --resource-group $resourceGroup

To disable, the use the az hdinsight monitor disable command.

az hdinsight azure-monitor disable --name $cluster --resource-group $resourceGroup

Use HDInsight out-of-box Insights to monitor a single cluster

HDInsight provides workload-specific workbook to help you quickly get insights. This workbook collects important performance metrics from your HDInsight cluster and provides the visualizations and dashboards for most common scenarios. The out-of-box insights give a complete view of a single HDInsight cluster including resource utilization and application status.

Available HDInsight workbooks:

  • HDInsight Spark Workbook
  • HDInsight Kafka Workbook
  • HDInsight HBase Workbook
  • HDInsight Hive/LLAP Workbook

Screenshot of Spark Workbook Spark workbook screenshot.

Configuring performance counters

Azure monitor supports collecting and analyzing performance metrics for the nodes in your cluster. For more information, see Linux performance data sources in Azure Monitor.

Cluster auditing

HDInsight support cluster auditing with Azure Monitor logs, by importing the following types of logs:

  • log_gateway_audit_CL - this table provides audit logs from cluster gateway nodes that show successful and failed sign-in attempts.
  • log_auth_CL - this table provides SSH logs with successful and failed sign-in attempts.
  • log_ambari_audit_CL - this table provides audit logs from Ambari.
  • ranger_audit_logs_CL - this table provides audit logs from Apache Ranger on ESP clusters.

For the log table mappings from the classic Azure Monitor integration to the new one, see Log table mapping.

Update the Log Analytics (OMS) Agent used by HDInsight Azure Monitor Integration

When Azure Monitor integration is enabled on a cluster, the Log Analytics agent, or Operations Management Suite (OMS) Agent, is installed on the cluster and is not updated unless you disable and re-enable Azure Monitor Integration. Complete the following steps if you need to update the OMS Agent on the cluster. If you are behind a firewall you may need to complete the Prerequisites for clusters behind a firewall before completing these steps.

  1. From the Azure portal, select your cluster. The cluster is opened in a new portal page.
  2. From the left, under Monitoring, select Azure Monitor.
  3. Note the name of your current Log Analytics workspace.
  4. From the main view, under Azure Monitor Integration, disable the toggle, and then select Save.
  5. After the setting saves, re-enable the Azure Monitor Integration toggle, and ensure the same Log Analytics workspace is selected, and then select Save.

If you have Azure Monitor Integration enabled on a cluster, updating the OMS agent will also update the Open Management Infrastructure (OMI) version. You can check the OMI version on the cluster by running the following command:

 sudo /opt/omi/bin/omiserver –version

Next steps