HDInsight SDK for Python

07/03/2019

Overview

The HDInsight SDK for Python provides classes and methods for managing your HDInsight clusters. It includes operations to create, delete, update, list, resize, run script actions on, monitor, and get the properties of HDInsight clusters, among others.

Prerequisites

An Azure account. If you don't have one, get a free trial.
Python
pip

SDK Installation

The HDInsight SDK for Python is available from the Python Package Index and can be installed by running:

pip install azure-mgmt-hdinsight

Authentication

The SDK first needs to be authenticated with your Azure subscription. Follow the example below to create a service principal and use it to authenticate. Once this is done, you will have an instance of HDInsightManagementClient, which contains many methods (outlined in the sections below) for performing management operations.

Note

There are other ways to authenticate besides the example below, and one of them may suit your needs better. All methods are outlined here: Authenticate with the Azure Management Libraries for Python

Authentication Example Using a Service Principal

First, log in to Azure Cloud Shell. Verify that you are currently using the subscription in which you want the service principal to be created.

az account show

Your subscription information is displayed as JSON.

{
  "environmentName": "AzureCloud",
  "id": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
  "isDefault": true,
  "name": "XXXXXXX",
  "state": "Enabled",
  "tenantId": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
  "user": {
    "cloudShellID": true,
    "name": "XXX@XXX.XXX",
    "type": "user"
  }
}

If you're not logged in to the correct subscription, select the correct one by running:

az account set -s <name or ID of subscription>

Important

If you have not already registered the HDInsight resource provider by another method (for example, by creating an HDInsight cluster through the Azure portal), you need to do this once before you can authenticate. This can be done from Azure Cloud Shell by running the following command:

az provider register --namespace Microsoft.HDInsight

Next, choose a name for your service principal and create it with the following command:

az ad sp create-for-rbac --name <Service Principal Name> --sdk-auth

The service principal information is displayed as JSON.

{
  "clientId": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
  "clientSecret": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
  "subscriptionId": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
  "tenantId": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
  "activeDirectoryEndpointUrl": "https://login.microsoftonline.com",
  "resourceManagerEndpointUrl": "https://management.azure.com/",
  "activeDirectoryGraphResourceId": "https://graph.windows.net/",
  "sqlManagementEndpointUrl": "https://management.core.windows.net:8443/",
  "galleryEndpointUrl": "https://gallery.azure.com/",
  "managementEndpointUrl": "https://management.core.windows.net/"
}

Copy the Python snippet below and fill in TENANT_ID, CLIENT_ID, CLIENT_SECRET, and SUBSCRIPTION_ID with the corresponding strings from the JSON returned by the command that created the service principal.

from azure.mgmt.hdinsight import HDInsightManagementClient
from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.hdinsight.models import *

# Tenant ID for your Azure Subscription
TENANT_ID = ''
# Your Service Principal App Client ID
CLIENT_ID = ''
# Your Service Principal Client Secret
CLIENT_SECRET = ''
# Your Azure Subscription ID
SUBSCRIPTION_ID = ''

credentials = ServicePrincipalCredentials(
    client_id=CLIENT_ID,
    secret=CLIENT_SECRET,
    tenant=TENANT_ID
)

client = HDInsightManagementClient(credentials, SUBSCRIPTION_ID)
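
Rather than pasting the four values by hand, the JSON printed by az ad sp create-for-rbac --sdk-auth can be parsed directly. The sketch below is one possible approach and is not part of the SDK: the helper name credential_kwargs_from_sdk_auth is hypothetical, and it relies only on the clientId, clientSecret, tenantId, and subscriptionId fields shown in the sample output above.

```python
import json


def credential_kwargs_from_sdk_auth(sdk_auth_json):
    """Split the --sdk-auth JSON into credential keyword arguments and a subscription ID.

    Hypothetical helper for illustration; it is not provided by the SDK.
    """
    data = json.loads(sdk_auth_json)
    required = ("clientId", "clientSecret", "tenantId", "subscriptionId")
    missing = [key for key in required if not data.get(key)]
    if missing:
        raise ValueError("missing fields in service principal JSON: " + ", ".join(missing))
    kwargs = {
        "client_id": data["clientId"],
        "secret": data["clientSecret"],
        "tenant": data["tenantId"],
    }
    return kwargs, data["subscriptionId"]


# Placeholder values only; use the JSON returned for your own service principal.
sample = (
    '{"clientId": "my-client-id", "clientSecret": "my-secret", '
    '"subscriptionId": "my-subscription-id", "tenantId": "my-tenant-id"}'
)
kwargs, subscription_id = credential_kwargs_from_sdk_auth(sample)

# With the SDK installed, the results plug into the calls shown above:
#   credentials = ServicePrincipalCredentials(**kwargs)
#   client = HDInsightManagementClient(credentials, subscription_id)
```

Keeping the JSON in a file or secret store outside the script avoids committing the client secret to source control.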

Cluster Management

Note

This section assumes you have already authenticated and constructed an HDInsightManagementClient instance and stored it in a variable called client. Instructions for authenticating and obtaining an HDInsightManagementClient can be found in the Authentication section above.

Create a Cluster

A new cluster can be created by calling client.clusters.create().

Samples

Code samples for creating several common types of HDInsight clusters are available: HDInsight Python Samples.

Example

This example demonstrates how to create a Spark cluster with 2 head nodes and 1 worker node.

Note

You first need to create a resource group and a storage account, as explained below. If you have already created these, you can skip these steps.

Creating a Resource Group

You can create a resource group using Azure Cloud Shell by running:

az group create -l <Region Name (e.g. eastus)> -n <Resource Group Name>

Creating a Storage Account

You can create a storage account using Azure Cloud Shell by running:

az storage account create -n <Storage Account Name> -g <Existing Resource Group Name> -l <Region Name (e.g. eastus)> --sku <SKU (e.g. Standard_LRS)>

Now run the following command to get the key for your storage account (you will need this to create a cluster):

az storage account keys list -n <Storage Account Name>
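
If you capture the command's output in a script, the key can be pulled out of the JSON programmatically. The sketch below assumes the output is a JSON array of objects that each carry a "value" field holding the key; that shape is an assumption about the CLI output, not something this article specifies, so check it against what your CLI version prints. The helper name first_storage_key is hypothetical.

```python
import json


def first_storage_key(keys_json):
    """Return the 'value' of the first key in the CLI's JSON output.

    Assumes a JSON array of objects with a 'value' field (an assumption
    about the CLI output format; verify against your own output).
    """
    keys = json.loads(keys_json)
    if not keys:
        raise ValueError("no storage keys found in the output")
    return keys[0]["value"]


# Placeholder output for illustration; real keys are long base64 strings.
sample_output = '[{"keyName": "key1", "value": "placeholder-key-1"}, {"keyName": "key2", "value": "placeholder-key-2"}]'
storage_key = first_storage_key(sample_output)
```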

The Python snippet below creates a Spark cluster with 2 head nodes and 1 worker node. Fill in the blank variables as explained in the comments, and change other parameters as needed.

# The name for the cluster you are creating
cluster_name = ""
# The name of your existing Resource Group
resource_group_name = ""
# Choose a username
username = ""
# Choose a password
password = ""
# Replace <> with the name of your storage account
storage_account = "<>.blob.core.windows.net"
# Storage account key you obtained above
storage_account_key = ""
# Choose a region
location = ""
container = "default"

params = ClusterCreateProperties(
    cluster_version="3.6",
    os_type=OSType.linux,
    tier=Tier.standard,
    cluster_definition=ClusterDefinition(
        kind="spark",
        configurations={
            "gateway": {
                "restAuthCredential.enabled_credential": "True",
                "restAuthCredential.username": username,
                "restAuthCredential.password": password
            }
        }
    ),
    compute_profile=ComputeProfile(
        roles=[
            Role(
                name="headnode",
                target_instance_count=2,
                hardware_profile=HardwareProfile(vm_size="Large"),
                os_profile=OsProfile(
                    linux_operating_system_profile=LinuxOperatingSystemProfile(
                        username=username,
                        password=password
                    )
                )
            ),
            Role(
                name="workernode",
                target_instance_count=1,
                hardware_profile=HardwareProfile(vm_size="Large"),
                os_profile=OsProfile(
                    linux_operating_system_profile=LinuxOperatingSystemProfile(
                        username=username,
                        password=password
                    )
                )
            )
        ]
    ),
    storage_profile=StorageProfile(
        storageaccounts=[StorageAccount(
            name=storage_account,
            key=storage_account_key,
            container=container,
            is_default=True
        )]
    )
)

client.clusters.create(
    cluster_name=cluster_name,
    resource_group_name=resource_group_name,
    parameters=ClusterCreateParametersExtended(
        location=location,
        tags={},
        properties=params
    ))
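
Because every blank variable in the snippet above must be filled in before the create call can succeed, a small pre-flight check can catch omissions before any request is sent. This is a minimal sketch, not part of the SDK: find_blank_settings and blob_endpoint are hypothetical helpers, and the check only looks for empty strings and the unreplaced <> placeholder.

```python
def blob_endpoint(storage_account_name):
    """Build the blob endpoint string used for the storage_account variable."""
    return storage_account_name + ".blob.core.windows.net"


def find_blank_settings(settings):
    """Return the sorted names of settings that are empty or still contain '<>'."""
    return sorted(
        name
        for name, value in settings.items()
        if not str(value).strip() or "<>" in str(value)
    )


# Placeholder values for illustration only.
settings = {
    "cluster_name": "my-spark-cluster",
    "resource_group_name": "",
    "username": "admin",
    "storage_account": "<>.blob.core.windows.net",
    "location": "eastus",
}
blank = find_blank_settings(settings)
endpoint = blob_endpoint("mystorageaccount")
```

Calling find_blank_settings just before client.clusters.create() and stopping when the list is non-empty gives a clearer error than a failed deployment.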

Get Cluster Details

To get properties for a given cluster:

client.clusters.get("<Resource Group Name>", "<Cluster Name>")

Example

You can use get to confirm that your cluster was created successfully.

my_cluster = client.clusters.get("<Resource Group Name>", "<Cluster Name>")
print(my_cluster)

The output should look like:

{'additional_properties': {}, 'id': '/subscriptions/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/resourceGroups/<Resource Group Name>/providers/Microsoft.HDInsight/clusters/<Cluster Name>', 'name': '<Cluster Name>', 'type': 'Microsoft.HDInsight/clusters', 'location': '<Location>', 'tags': {}, 'etag': 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX', 'properties': <azure.mgmt.hdinsight.models.cluster_get_properties_py3.ClusterGetProperties object at 0x0000013766D68048>}
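
Since get returns the cluster's current properties, it can also be polled until the cluster reaches a desired state. The sketch below is hedged: it assumes the returned object exposes the state as properties.cluster_state and that "Running" is the value for a ready cluster, neither of which is stated in this article. The helper wait_for_cluster_state is hypothetical, and stand-in objects are used so the example runs without Azure access.

```python
import time
from types import SimpleNamespace


def wait_for_cluster_state(get_cluster, target_state="Running", attempts=30,
                           delay_seconds=60, sleep=time.sleep):
    """Call get_cluster() repeatedly until properties.cluster_state matches.

    Returns True once the target state is seen, False if attempts run out.
    The attribute name and state value are assumptions, not confirmed here.
    """
    for attempt in range(attempts):
        cluster = get_cluster()
        if cluster.properties.cluster_state == target_state:
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False


# Stand-in for client.clusters.get(); the state names are illustrative.
_states = iter(["Accepted", "HDInsightConfiguration", "Running"])


def _fake_get():
    return SimpleNamespace(properties=SimpleNamespace(cluster_state=next(_states)))


reached = wait_for_cluster_state(_fake_get, attempts=5, delay_seconds=0,
                                 sleep=lambda seconds: None)

# Against a real client, the callable would wrap the call shown above:
#   wait_for_cluster_state(
#       lambda: client.clusters.get("<Resource Group Name>", "<Cluster Name>"))
```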

List Clusters

List Clusters Under the Subscription

client.clusters.list()

List Clusters by Resource Group

client.clusters.list_by_resource_group("<Resource Group Name>")

Note

Both list() and list_by_resource_group() return a ClusterPaged object. Calling advance_page() returns a list of the clusters on that page and advances the ClusterPaged object to the next page. This can be repeated until a StopIteration exception is raised, indicating that there are no more pages.

Example

The following example prints the properties of all clusters in the current subscription:

clusters_paged = client.clusters.list()
while True:
  try:
    for cluster in clusters_paged.advance_page():
      print(cluster)
  except StopIteration: 
    break
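
The advance_page() loop above can be wrapped in a generator so that callers iterate over items rather than pages. This is a minimal sketch under the behavior described in the note above (advance_page() returns a list and raises StopIteration when no pages remain); iter_paged is a hypothetical helper, and FakePaged is a stand-in so the example runs without the SDK.

```python
def iter_paged(paged):
    """Yield every item from an object that exposes advance_page()."""
    while True:
        try:
            page = paged.advance_page()
        except StopIteration:
            # Must be caught here: a StopIteration escaping a generator
            # body is turned into RuntimeError by Python (PEP 479).
            return
        for item in page:
            yield item


class FakePaged:
    """Stand-in that mimics the paging behavior described above."""

    def __init__(self, pages):
        self._pages = list(pages)

    def advance_page(self):
        if not self._pages:
            raise StopIteration("End of paging")
        return self._pages.pop(0)


names = list(iter_paged(FakePaged([["cluster-a", "cluster-b"], ["cluster-c"]])))

# With a real client:
#   for cluster in iter_paged(client.clusters.list()):
#       print(cluster)
```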

Delete a Cluster

To delete a cluster:

client.clusters.delete("<Resource Group Name>", "<Cluster Name>")

Update Cluster Tags

You can update the tags of a given cluster like so:

client.clusters.update("<Resource Group Name>", "<Cluster Name>", tags={<Dictionary of Tags>})

Example

client.clusters.update("<Resource Group Name>", "<Cluster Name>", tags={"tag1Name" : "tag1Value", "tag2Name" : "tag2Value"})
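
When only some tags should change, it can help to build the complete tag dictionary explicitly before calling update. The sketch below makes no assumption about how the service treats tags omitted from the call; it simply merges the cluster's current tags (available as the tags field returned by get) with the changes. The helper merge_tags is hypothetical, and treating None as "remove this tag" is a convention of this sketch only.

```python
def merge_tags(existing, changes):
    """Return a new dict of existing tags with changes applied.

    A value of None in changes removes that tag (a convention of this
    sketch, not of the SDK).
    """
    merged = dict(existing or {})
    for name, value in changes.items():
        if value is None:
            merged.pop(name, None)
        else:
            merged[name] = value
    return merged


current = {"tag1Name": "tag1Value", "owner": "team-a"}
new_tags = merge_tags(current, {"tag2Name": "tag2Value", "owner": None})

# With a real client:
#   current = client.clusters.get("<Resource Group Name>", "<Cluster Name>").tags
#   client.clusters.update("<Resource Group Name>", "<Cluster Name>",
#                          tags=merge_tags(current, {"tag2Name": "tag2Value"}))
```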

Resize Cluster

You can change the number of worker nodes in a given cluster by specifying a new size like so:

client.clusters.resize("<Resource Group Name>", "<Cluster Name>", target_instance_count=<Num of Worker Nodes>)

Cluster Monitoring

The HDInsight Management SDK can also be used to manage monitoring on your clusters via the Operations Management Suite (OMS).

Enable OMS Monitoring

Note

To enable OMS monitoring, you must have an existing Log Analytics workspace. If you have not already created one, you can learn how to do that here: Create a Log Analytics workspace in the Azure portal.

To enable OMS monitoring on your cluster:

client.extension.enable_monitoring("<Resource Group Name>", "<Cluster Name>", workspace_id="<Workspace Id>")

View Status of OMS Monitoring

To get the status of OMS monitoring on your cluster:

client.extension.get_monitoring_status("<Resource Group Name>", "<Cluster Name>")

Disable OMS Monitoring

To disable OMS monitoring on your cluster:

client.extension.disable_monitoring("<Resource Group Name>", "<Cluster Name>")

Script Actions

HDInsight provides a configuration method called script actions, which invokes custom scripts to customize the cluster.

Note

More information on how to use script actions can be found here: Customize Linux-based HDInsight clusters using script actions

Execute Script Actions

To execute script actions on a given cluster:

script_action1 = RuntimeScriptAction(name="<Script Name>", uri="<URL To Script>", roles=[<List of Roles>]) #valid roles are "headnode", "workernode", "zookeepernode", and "edgenode"

client.clusters.execute_script_actions("<Resource Group Name>", "<Cluster Name>", <persist_on_success (bool)>, script_actions=[script_action1]) #add more RuntimeScriptActions to the list to execute multiple scripts
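
Because the roles list accepts only the four values named in the comment above, validating it before constructing a RuntimeScriptAction can surface typos early. This is a minimal sketch; check_roles is a hypothetical helper, not part of the SDK.

```python
# Role names taken from the comment in the snippet above.
VALID_ROLES = ("headnode", "workernode", "zookeepernode", "edgenode")


def check_roles(roles):
    """Return roles as a list, raising ValueError for empty or unknown entries."""
    roles = list(roles)
    if not roles:
        raise ValueError("at least one role is required")
    unknown = [role for role in roles if role not in VALID_ROLES]
    if unknown:
        raise ValueError("unknown roles: " + ", ".join(unknown))
    return roles


roles = check_roles(["headnode", "workernode"])

# With the SDK models imported:
#   script_action1 = RuntimeScriptAction(
#       name="<Script Name>", uri="<URL To Script>", roles=roles)
```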

Delete Script Action

To delete a specified persisted script action on a given cluster:

client.script_actions.delete("<Resource Group Name>", "<Cluster Name>", "<Script Name>")

List Persisted Script Actions

Note

list() and list_persisted_scripts() return a RuntimeScriptActionDetailPaged object. Calling advance_page() returns a list of the RuntimeScriptActionDetail objects on that page and advances the RuntimeScriptActionDetailPaged object to the next page. This can be repeated until a StopIteration exception is raised, indicating that there are no more pages. See the example below.

To list all persisted script actions for the specified cluster:

client.script_actions.list_persisted_scripts("<Resource Group Name>", "<Cluster Name>")

Example

scripts_paged = client.script_actions.list_persisted_scripts(resource_group_name, cluster_name)
while True:
  try:
    for script in scripts_paged.advance_page():
      print(script)
  except StopIteration:
    break

List All Scripts' Execution History

To list the execution history of all scripts for the specified cluster:

client.script_execution_history.list("<Resource Group Name>", "<Cluster Name>")

Example

This example prints all the details for all past script executions.

script_executions_paged = client.script_execution_history.list("<Resource Group Name>", "<Cluster Name>")
while True:
  try:
    for script in script_executions_paged.advance_page():
      print(script)
  except StopIteration:
    break
