Manage Apache Hadoop clusters in HDInsight by using the Azure portal
You can use the Azure portal to manage Apache Hadoop clusters in Azure HDInsight.
| Item | Description |
| --- | --- |
| Tags | Lets you set key/value pairs to define a custom taxonomy of your cloud services. For example, you might create a key named project, and then use a common value for all services associated with a specific project. |
| Diagnose and solve problems | Displays troubleshooting information. |
| Quickstart | Displays information that helps you get started using HDInsight. |
| Tools | Displays help information for HDInsight-related tools. |
Settings menu
| Item | Description |
| --- | --- |
| Cluster size | Check, increase, or decrease the number of cluster worker nodes. See Scale clusters. |
| Quota limits | Displays the used and available cores for your subscription. |
| SSH + Cluster login | Shows instructions for connecting to the cluster over Secure Shell (SSH). For more information, see Use SSH with HDInsight. |
Select Move to another resource group or Move to another subscription.
Follow the instructions from the new page.
Delete clusters
Deleting a cluster doesn't delete the default storage account or any linked storage accounts. You can re-create the cluster by using the same storage accounts and the same metastores. We recommend using a new default Blob container when you re-create the cluster.
You can add more Azure Storage accounts and Azure Data Lake Storage accounts after a cluster is created. For more information, see Add additional storage accounts to HDInsight.
Scale clusters
The cluster scaling feature allows you to change the number of worker nodes used by an Azure HDInsight cluster, without having to re-create the cluster.
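The same resize can also be scripted outside the portal. The sketch below only assembles an Azure CLI `az hdinsight resize` command as an argument list and does not run it. It assumes the Azure CLI is installed and signed in; the function name, the resource group name, and the cluster name are illustrative placeholders, not values from this article.

```python
import shlex


def build_resize_command(resource_group: str, cluster_name: str, worker_nodes: int) -> list[str]:
    """Assemble an `az hdinsight resize` invocation without executing it."""
    # Sanity check only; the real minimum depends on the cluster type.
    if worker_nodes < 1:
        raise ValueError("worker_nodes must be a positive integer")
    return [
        "az", "hdinsight", "resize",
        "--resource-group", resource_group,
        "--name", cluster_name,
        "--workernode-count", str(worker_nodes),
    ]


if __name__ == "__main__":
    # "my-resource-group" and "my-cluster" are hypothetical names.
    command = build_resize_command("my-resource-group", "my-cluster", 5)
    print(shlex.join(command))
    # To execute it for real: subprocess.run(command, check=True)
```

Returning a list instead of a single string keeps the command safe to pass to `subprocess.run` without shell quoting issues.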
Most Hadoop jobs are batch jobs that run only occasionally. Most Hadoop clusters therefore sit idle for long periods. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it isn't in use.
You're charged for an HDInsight cluster even when it isn't in use. Because the charges for the cluster are many times higher than the charges for storage, it makes economic sense to delete clusters when they aren't in use.
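As a rough illustration of that trade-off, the sketch below compares a cluster that runs all month with one that is created only for the hours it is needed. Every rate and hour count here is a made-up placeholder, not an actual Azure price; substitute figures from your own bill.

```python
HOURS_PER_MONTH = 24 * 30  # simplifying assumption: a 30-day month


def monthly_cost(cluster_rate_per_hour: float, hours_running: float,
                 storage_cost_per_month: float) -> float:
    """Cluster compute charges for the hours it exists, plus storage for the whole month."""
    return cluster_rate_per_hour * hours_running + storage_cost_per_month


def savings_from_deleting_when_idle(cluster_rate_per_hour: float, busy_hours: float,
                                    storage_cost_per_month: float) -> float:
    """Difference between an always-on cluster and one deleted whenever it is idle."""
    always_on = monthly_cost(cluster_rate_per_hour, HOURS_PER_MONTH, storage_cost_per_month)
    on_demand = monthly_cost(cluster_rate_per_hour, busy_hours, storage_cost_per_month)
    return always_on - on_demand


if __name__ == "__main__":
    # Hypothetical figures: 2.00 per cluster hour, 40 busy hours, 20.00 of storage per month.
    print(savings_from_deleting_when_idle(2.00, 40, 20.00))
```

Storage is paid in both scenarios, so it cancels out of the difference; the saving comes entirely from the idle cluster hours.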
Ambari provides an intuitive, easy-to-use Hadoop management web UI backed by its RESTful APIs. Ambari enables system administrators to manage and monitor Hadoop clusters.
An HDInsight cluster can have two user accounts: the HDInsight cluster user account (HTTP user account) and the SSH user account. Both are created when the cluster is created. You can use the portal to change the cluster user account password, and script actions to change the SSH user account credentials.
Change the cluster user password
Note
Changing the cluster user (admin) password may cause script actions run against this cluster to fail. If you have any persisted script actions that target worker nodes, these scripts may fail when you add nodes to the cluster through resize operations. For more information on script actions, see Customize HDInsight clusters using script actions.
Upload the file to a storage location that HDInsight can reach over an HTTP or HTTPS address, for example a public file store such as OneDrive or Azure Blob storage. Save the URI (HTTP or HTTPS address) of the file, because you need it in the next step.
From the Submit script action page, enter the following information:
Note
SSH passwords cannot contain the following characters:
" ' ` / \ < % ~ | $ & ! #
| Field | Value |
| --- | --- |
| Script type | Select Custom from the drop-down list. |
| Name | "Change ssh credentials" |
| Bash script URI | The URI to the changecredentials.sh file |
| Node type(s) (Head, Worker, Nimbus, Supervisor, or Zookeeper) | ✓ for all node types listed |
| Parameters | Enter the SSH user name and then the new password, separated by a single space. |
| Persist this script action ... | Leave this field unchecked. |
Select Create to apply the script. Once the script finishes, you're able to connect to the cluster using SSH with the new credentials.
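Before submitting the script action, it can help to check the new password against the character restrictions noted above and to build the Parameters value in the expected "user name, one space, password" form. The following is a minimal local sketch; the function names are illustrative and are not part of HDInsight or of the changecredentials.sh script. Rejecting whitespace is an added assumption, made because the two values are separated by a space.

```python
# Characters the note above says SSH passwords cannot contain.
FORBIDDEN_SSH_PASSWORD_CHARS = set("\"'`/\\<%~|$&!#")


def invalid_password_chars(password: str) -> list[str]:
    """Return the forbidden characters found in the password, sorted."""
    return sorted(set(password) & FORBIDDEN_SSH_PASSWORD_CHARS)


def build_parameters(user_name: str, new_password: str) -> str:
    """Build the Parameters field value: user name, one space, new password."""
    bad = invalid_password_chars(new_password)
    if bad:
        raise ValueError(f"password contains forbidden characters: {' '.join(bad)}")
    # Assumption: whitespace inside either value would break the space-separated format.
    if any(ch.isspace() for ch in user_name + new_password):
        raise ValueError("user name and password must not contain whitespace")
    if not user_name or not new_password:
        raise ValueError("user name and password must both be non-empty")
    return f"{user_name} {new_password}"


if __name__ == "__main__":
    # "sshuser" and the password are hypothetical values.
    print(build_parameters("sshuser", "N3w-Passw0rd"))
```

This check covers only the restrictions stated in this article; the cluster may enforce additional password rules.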
Find the subscription ID
Each cluster is tied to an Azure subscription. The Azure subscription ID is visible from the cluster home page.
Find the resource group
In Azure Resource Manager mode, each HDInsight cluster is created in an Azure resource group. The resource group is visible from the cluster home page.
Find the storage accounts
HDInsight clusters use either an Azure Storage account or Azure Data Lake Storage to store data. Each HDInsight cluster can have one default storage account and a number of linked storage accounts. To list the storage accounts, from the cluster home page under Settings, select Storage accounts.
The Cluster size tile from the cluster home page displays the number of cores allocated to this cluster and how they're allocated for the nodes within this cluster.
Important
To monitor the services provided by the HDInsight cluster, you must use Ambari Web or the Ambari REST API. For more information on using Ambari, see Manage HDInsight clusters using Apache Ambari.
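As a sketch of what a call to the Ambari REST API might look like, the following builds an authenticated request that lists the services on a cluster. It assumes the cluster's Ambari endpoint is exposed at `https://CLUSTERNAME.azurehdinsight.net/api/v1/...` and that the cluster user (HTTP) account is used with HTTP Basic authentication; the cluster name and credentials are placeholders. The request is constructed but not sent.

```python
import base64
import urllib.request


def ambari_services_request(cluster_name: str, user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the cluster's service list."""
    # Assumed URL pattern for the HDInsight-hosted Ambari REST endpoint.
    url = (
        f"https://{cluster_name}.azurehdinsight.net"
        f"/api/v1/clusters/{cluster_name}/services"
    )
    # HTTP Basic authentication with the cluster user (HTTP) account.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


if __name__ == "__main__":
    # "mycluster", "admin", and the password are hypothetical values.
    request = ambari_services_request("mycluster", "admin", "example-password")
    print(request.full_url)
    # To send it: urllib.request.urlopen(request) and parse the JSON body.
```

Keeping request construction separate from sending makes the URL and header easy to inspect before any credentials leave your machine.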