New-AzureHDInsightMapReduceJobDefinition
Defines a new MapReduce job.
Note
The cmdlets referenced in this documentation are for managing legacy Azure resources that use Azure Service Manager (ASM) APIs. This legacy PowerShell module isn't recommended when creating new resources since ASM is scheduled for retirement. For more information, see Azure Service Manager retirement.
The Az PowerShell module is the recommended PowerShell module for managing Azure Resource Manager (ARM) resources with PowerShell.
Syntax
New-AzureHDInsightMapReduceJobDefinition
[-Arguments <String[]>]
-ClassName <String>
[-Defines <Hashtable>]
[-Files <String[]>]
-JarFile <String>
[-JobName <String>]
[-LibJars <String[]>]
[-StatusFolder <String>]
[-Profile <AzureSMProfile>]
[<CommonParameters>]
Description
This version of Azure PowerShell HDInsight is deprecated. These cmdlets will be removed by January 1, 2017. Please use the newer version of Azure PowerShell HDInsight.
For information about how to use the new HDInsight to create a cluster, see Create Linux-based clusters in HDInsight using Azure PowerShell (https://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-create-linux-clusters-azure-powershell/). For information about how to submit jobs by using Azure PowerShell and other approaches, see Submit Hadoop jobs in HDInsight (https://azure.microsoft.com/en-us/documentation/articles/hdinsight-submit-hadoop-jobs-programmatically/). For reference information about Azure PowerShell HDInsight, see Azure HDInsight Cmdlets.
The New-AzureHDInsightMapReduceJobDefinition cmdlet defines a new MapReduce job to run on an Azure HDInsight cluster.
Examples
Example 1: Define a MapReduce job, run the job, and get the output
PS C:\> $SubId = (Get-AzureSubscription -Current).SubscriptionId
PS C:\> $ClusterName = "MyCluster"
PS C:\> $WordCountJob = New-AzureHDInsightMapReduceJobDefinition -JarFile "/Example/Apps/Hadoop-examples.jar" -ClassName "WordCount" -Defines @{ "mapred.map.tasks" = "3" } -Arguments "/Example/Data/Gutenberg/Davinci.txt", "/Example/Output/WordCount"
PS C:\> $WordCountJob | Start-AzureHDInsightJob -Cluster $ClusterName |
    Wait-AzureHDInsightJob -Subscription $SubId -WaitTimeoutInSeconds 3600 |
    Get-AzureHDInsightJobOutput -Cluster $ClusterName -Subscription $SubId -StandardError
The first command gets the ID of the current subscription, and then stores it in the $SubId variable.
The second command assigns the name MyCluster to the $ClusterName variable.
The third command uses the New-AzureHDInsightMapReduceJobDefinition cmdlet to create a MapReduce job definition, and then stores it in the $WordCountJob variable.
The fourth command performs a sequence of operations by using these cmdlets:
- Start-AzureHDInsightJob to start the job on $ClusterName.
- Wait-AzureHDInsightJob to wait for the job to finish and to display the progress toward completion.
- Get-AzureHDInsightJobOutput to get the standard error output of the job.
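The same job definition can also be built with splatting, which keeps a long parameter list readable. The following is a minimal sketch, not part of the original example: it assumes the legacy Azure PowerShell module is loaded, and the JobName and StatusFolder values are hypothetical.

```powershell
# Sketch only: requires the legacy Azure (ASM) PowerShell module.
# The JobName and StatusFolder values are hypothetical placeholders.
$JobParams = @{
    JarFile      = "/Example/Apps/Hadoop-examples.jar"
    ClassName    = "WordCount"
    JobName      = "WordCountDavinci"             # hypothetical job name
    StatusFolder = "/Example/Status/WordCount"    # hypothetical folder
    Defines      = @{ "mapred.map.tasks" = "3" }
    Arguments    = @("/Example/Data/Gutenberg/Davinci.txt", "/Example/Output/WordCount")
}

# Splat the hashtable so that each key is bound to the parameter of the same name.
$WordCountJob = New-AzureHDInsightMapReduceJobDefinition @JobParams
```

The resulting $WordCountJob variable can be piped to Start-AzureHDInsightJob in the same way as in Example 1.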
Parameters
-Arguments
Specifies an array of arguments for a Hadoop job. The arguments are passed as command-line arguments to each task.
Type: | String[] |
Aliases: | Args |
Position: | Named |
Default value: | None |
Required: | False |
Accept pipeline input: | False |
Accept wildcard characters: | False |
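As an illustration, the argument array can be assembled in a variable before it is passed to the cmdlet. This sketch reuses the paths from Example 1; the variable names are assumptions, and the input-then-output order follows that example rather than any documented contract.

```powershell
# Paths taken from Example 1; the variable names are illustrative.
$InputPath  = "/Example/Data/Gutenberg/Davinci.txt"
$OutputPath = "/Example/Output/WordCount"

# @() guarantees an array (String[]) even when only one argument is supplied.
$JobArgs = @($InputPath, $OutputPath)

# Requires the legacy Azure (ASM) PowerShell module.
$WordCountJob = New-AzureHDInsightMapReduceJobDefinition -JarFile "/Example/Apps/Hadoop-examples.jar" -ClassName "WordCount" -Arguments $JobArgs
```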
-ClassName
Specifies the name of the job class in the Java Archive (JAR) file.
Type: | String |
Aliases: | Class |
Position: | Named |
Default value: | None |
Required: | True |
Accept pipeline input: | False |
Accept wildcard characters: | False |
-Defines
Specifies Hadoop configuration values to set when the job runs.
Type: | Hashtable |
Aliases: | Params |
Position: | Named |
Default value: | None |
Required: | False |
Accept pipeline input: | False |
Accept wildcard characters: | False |
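For instance, several configuration values can be collected in one hashtable and passed together. In this sketch the first key comes from Example 1; the second key and its value are assumptions added for illustration, and the variable name is hypothetical.

```powershell
# Each key is a Hadoop configuration property name; each value is its setting.
$HadoopSettings = @{
    "mapred.map.tasks"    = "3"   # from Example 1
    "mapred.reduce.tasks" = "1"   # assumed value, for illustration only
}

# Requires the legacy Azure (ASM) PowerShell module.
$WordCountJob = New-AzureHDInsightMapReduceJobDefinition -JarFile "/Example/Apps/Hadoop-examples.jar" -ClassName "WordCount" -Defines $HadoopSettings
```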
-Files
Specifies an array of Windows Azure Storage Blob (WASB) files that the job requires.
Type: | String[] |
Position: | Named |
Default value: | None |
Required: | False |
Accept pipeline input: | False |
Accept wildcard characters: | False |
-JarFile
Specifies the fully qualified name of a JAR file that contains the code and dependencies of a MapReduce job.
Type: | String |
Aliases: | Jar |
Position: | Named |
Default value: | None |
Required: | True |
Accept pipeline input: | False |
Accept wildcard characters: | False |
-JobName
Specifies the name of the MapReduce job. If you do not specify this parameter, the value of the ClassName parameter is used.
Type: | String |
Aliases: | Name |
Position: | Named |
Default value: | None |
Required: | False |
Accept pipeline input: | False |
Accept wildcard characters: | False |
-LibJars
Specifies an array of LibJar references for the job.
Type: | String[] |
Position: | Named |
Default value: | None |
Required: | False |
Accept pipeline input: | False |
Accept wildcard characters: | False |
-Profile
Specifies the Azure profile from which this cmdlet reads. If you do not specify a profile, this cmdlet reads from the local default profile.
Type: | AzureSMProfile |
Position: | Named |
Default value: | None |
Required: | False |
Accept pipeline input: | False |
Accept wildcard characters: | False |
-StatusFolder
Specifies the location of the folder that contains standard outputs and error outputs for a job, including its exit code and task logs.
Type: | String |
Position: | Named |
Default value: | None |
Required: | False |
Accept pipeline input: | False |
Accept wildcard characters: | False |