Job - Create
Submits a job to the specified Data Lake Analytics account.
PUT https://{accountName}.{adlaJobDnsSuffix}/Jobs/{jobIdentity}?api-version=2016-11-01
URI Parameters
Name | In | Required | Type | Description |
---|---|---|---|---|
accountName | path | True | string | The Azure Data Lake Analytics account to execute job operations on. |
adlaJobDnsSuffix | path | True | string | Gets the DNS suffix used as the base for all Azure Data Lake Analytics Job service requests. |
jobIdentity | path | True | string (uuid) | Job identifier. Uniquely identifies the job across all jobs submitted to the service. |
api-version | query | True | string | Client Api Version. |
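As a rough illustration, the request URI can be assembled from these path and query parameters. The account name and DNS suffix below are placeholders (not values from this reference), and the job GUID is generated by the client.

```python
import uuid

# Placeholder values -- substitute your own account and job service DNS suffix.
account_name = "account123"
adla_job_dns_suffix = "azuredatalakeanalytics.net"  # assumed suffix, not confirmed by this page
job_identity = str(uuid.uuid4())                    # the client supplies the job GUID for this PUT
api_version = "2016-11-01"

url = (
    f"https://{account_name}.{adla_job_dns_suffix}"
    f"/Jobs/{job_identity}?api-version={api_version}"
)
print(url)
```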
Request Body
Name | Required | Type | Description |
---|---|---|---|
name | True | string | the friendly name of the job to submit. |
properties | True | CreateJobProperties: | the job specific properties. |
type | True | JobType | the job type of the current job (Hive or USql). |
degreeOfParallelism | | integer (int32) | the degree of parallelism used for this job. At most one of degreeOfParallelism and degreeOfParallelismPercent should be specified. If none, a default value of 1 will be used. |
degreeOfParallelismPercent | | number (double) | the degree of parallelism in percentage used for this job. At most one of degreeOfParallelism and degreeOfParallelismPercent should be specified. If none, a default value of 1 will be used for degreeOfParallelism. |
logFilePatterns | | string[] | the list of log file name patterns to find in the logFolder. '*' is the only matching character allowed. Example format: jobExecution*.log or *mylog*.txt |
priority | | integer (int32) | the priority value to use for the current job. Lower numbers have a higher priority. By default, a job has a priority of 1000. This must be greater than 0. |
related | | JobRelationshipProperties | the recurring job relationship information properties. |
Responses
Name | Type | Description |
---|---|---|
200 OK | JobInformation | Successfully submitted the job. |
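Below is a minimal, hedged sketch of issuing this PUT with Python's `requests` library and checking for the 200 response. The account name, DNS suffix, bearer token, and U-SQL script are placeholders; acquiring an Azure AD token is outside the scope of this reference.

```python
import uuid
import requests  # third-party HTTP client, assumed available

# Placeholder values -- not taken from this reference.
account_name = "account123"
adla_job_dns_suffix = "azuredatalakeanalytics.net"  # assumed suffix
access_token = "<Azure AD bearer token>"            # acquire via your own auth flow
job_id = str(uuid.uuid4())

body = {
    "name": "test_name",
    "type": "USql",
    "degreeOfParallelism": 1,
    "priority": 1000,
    "properties": {
        "type": "USql",
        "script": "DROP DATABASE IF EXISTS demo_db;",  # hypothetical script
    },
}

resp = requests.put(
    f"https://{account_name}.{adla_job_dns_suffix}/Jobs/{job_id}",
    params={"api-version": "2016-11-01"},
    headers={"Authorization": f"Bearer {access_token}"},
    json=body,
)

if resp.status_code == 200:
    job = resp.json()  # a JobInformation payload
    print(job["jobId"], job["state"])
else:
    print("Submission failed:", resp.status_code, resp.text)
```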
Examples
Submits a job to the specified Data Lake Analytics account
Sample request
PUT https://account123.contosopipelineservice.com/Jobs/076713da-9018-41ae-a3bd-9eab14e54d09?api-version=2016-11-01
{
  "type": "USql",
  "properties": {
    "runtimeVersion": "test_runtime_version",
    "script": "test_script",
    "type": "USql"
  },
  "name": "test_name",
  "degreeOfParallelism": 1,
  "priority": 1,
  "logFilePatterns": [
    "test_log_file_pattern_1",
    "test_log_file_pattern_2"
  ],
  "related": {
    "pipelineId": "076713da-9018-41ae-a3bd-9eab14e54d09",
    "pipelineName": "test_pipeline_name",
    "pipelineUri": "https://account123.contosopipelineservice.com/076713da-9018-41ae-a3bd-9eab14e54d09",
    "runId": "67034c12-b250-468e-992d-39fb978bde2c",
    "recurrenceId": "67034c12-b250-468e-992d-39fb978bde2d",
    "recurrenceName": "test_recurrence_name"
  }
}
Sample response
{
  "jobId": "076713da-9018-41ae-a3bd-9eab14e54d09",
  "name": "test_name",
  "type": "USql",
  "submitter": "test_submitter",
  "degreeOfParallelism": 1,
  "priority": 1,
  "submitTime": "2017-04-18T11:16:49.0748958-07:00",
  "startTime": "2017-04-18T11:16:49.0748958-07:00",
  "endTime": "2017-04-18T11:16:49.0748958-07:00",
  "state": "Accepted",
  "result": "Succeeded",
  "logFolder": "adl://contosoadla.azuredatalakestore.net/system/jobservice/jobs/Usql/2016/03/13/17/18/5fe51957-93bc-4de0-8ddc-c5a4753b068b/logs/",
  "logFilePatterns": [
    "test_log_file_pattern_1",
    "test_log_file_pattern_2"
  ],
  "related": {
    "pipelineId": "076713da-9018-41ae-a3bd-9eab14e54d09",
    "pipelineName": "test_pipeline_name",
    "pipelineUri": "https://account123.contosopipelineservice.com/076713da-9018-41ae-a3bd-9eab14e54d09",
    "runId": "67034c12-b250-468e-992d-39fb978bde2c",
    "recurrenceId": "67034c12-b250-468e-992d-39fb978bde2d",
    "recurrenceName": "test_recurrence_name"
  },
  "errorMessage": [
    {
      "description": "test_description",
      "details": "test_details",
      "endOffset": 1,
      "errorId": "test_error_id",
      "filePath": "adl://contosoadla.azuredatalakestore.net/system/jobservice/jobs/Usql/2016/03/13/17/18/5fe51957-93bc-4de0-8ddc-c5a4753b068b/test_file.txt",
      "helpLink": "https://azure.microsoft.com/en-us/blog/introducing-azure-data-lake/",
      "internalDiagnostics": "test_internal_diagnostics",
      "lineNumber": 1,
      "message": "test_message",
      "resolution": "test_resolution",
      "innerError": {
        "diagnosticCode": 1,
        "severity": "Warning",
        "details": "test_details",
        "component": "test_component",
        "errorId": "test_error_id",
        "helpLink": "https://azure.microsoft.com/en-us/blog/introducing-azure-data-lake/",
        "internalDiagnostics": "test_internal_diagnostics",
        "message": "test_message",
        "resolution": "test_resolution",
        "source": "SYSTEM",
        "description": "test_description"
      },
      "severity": "Warning",
      "source": "SYSTEM",
      "startOffset": 1
    }
  ],
  "stateAuditRecords": [
    {
      "newState": "test_new_state",
      "timeStamp": "2017-04-18T11:16:49.0748958-07:00",
      "requestedByUser": "test_requested_by_user",
      "details": "test_details"
    }
  ],
  "properties": {
    "runtimeVersion": "test_runtime_version",
    "script": "test_script",
    "type": "USql"
  }
}
Definitions
Name | Description |
---|---|
CompileMode | the specific compilation mode for the job used during execution. If this is not specified during submission, the server will determine the optimal compilation mode. |
CreateJobParameters | The parameters used to submit a new Data Lake Analytics job. |
CreateUSqlJobProperties | U-SQL job properties used when submitting U-SQL jobs. |
Diagnostics | Error diagnostic information for failed jobs. |
HiveJobProperties | Hive job properties used when retrieving Hive jobs. |
JobDataPath | A Data Lake Analytics job data path item. |
JobErrorDetails | The Data Lake Analytics job error details. |
JobInformation | The extended Data Lake Analytics job information properties returned when retrieving a specific job. |
JobInnerError | The Data Lake Analytics job inner error details. |
JobRelationshipProperties | Job relationship information properties including pipeline information, correlation information, etc. |
JobResource | The Data Lake Analytics job resources. |
JobResourceType | the job resource type. |
JobResult | the result of job execution or the current result of the running job. |
JobState | the job state. When the job is in the Ended state, refer to Result and ErrorMessage for details. |
JobStateAuditRecord | The Data Lake Analytics job state audit records for tracking the lifecycle of a job. |
JobStatistics | The Data Lake Analytics job execution statistics. |
JobStatisticsVertexStage | The Data Lake Analytics job statistics vertex stage information. |
JobType | the job type of the current job (Hive or USql). |
SeverityTypes | the severity of the error. |
USqlJobProperties | U-SQL job properties used when retrieving U-SQL jobs. |
CompileMode
the specific compilation mode for the job used during execution. If this is not specified during submission, the server will determine the optimal compilation mode.
Value | Description |
---|---|
Semantic | |
Full | |
SingleBox |
CreateJobParameters
The parameters used to submit a new Data Lake Analytics job.
Name | Type | Default value | Description |
---|---|---|---|
degreeOfParallelism | integer (int32) | 1 | the degree of parallelism used for this job. At most one of degreeOfParallelism and degreeOfParallelismPercent should be specified. If none, a default value of 1 will be used. |
degreeOfParallelismPercent | number (double) | | the degree of parallelism in percentage used for this job. At most one of degreeOfParallelism and degreeOfParallelismPercent should be specified. If none, a default value of 1 will be used for degreeOfParallelism. |
logFilePatterns | string[] | | the list of log file name patterns to find in the logFolder. '*' is the only matching character allowed. Example format: jobExecution*.log or *mylog*.txt |
name | string | | the friendly name of the job to submit. |
priority | integer (int32) | | the priority value to use for the current job. Lower numbers have a higher priority. By default, a job has a priority of 1000. This must be greater than 0. |
properties | CreateJobProperties: | | the job specific properties. |
related | JobRelationshipProperties | | the recurring job relationship information properties. |
type | JobType | | the job type of the current job (Hive or USql). |
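For orientation, a minimal CreateJobParameters payload sets only the required fields (name, type, and properties) and lets everything else fall back to its defaults. The sketch below is a hypothetical example expressed as a Python dict; the name and U-SQL script are placeholders.

```python
# Minimal CreateJobParameters payload: only required fields are set, so
# degreeOfParallelism defaults to 1 and priority defaults to 1000.
# If you do override parallelism, set at most one of
# degreeOfParallelism / degreeOfParallelismPercent.
minimal_job = {
    "name": "nightly_aggregation",  # hypothetical friendly name
    "type": "USql",
    "properties": {
        "type": "USql",
        "script": "DROP DATABASE IF EXISTS demo_db; CREATE DATABASE demo_db;",
    },
}
```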
CreateUSqlJobProperties
U-SQL job properties used when submitting U-SQL jobs.
Name | Type | Description |
---|---|---|
compileMode | CompileMode | the specific compilation mode for the job used during execution. If this is not specified during submission, the server will determine the optimal compilation mode. |
runtimeVersion | string | the runtime version of the Data Lake Analytics engine to use for the specific type of job being run. |
script | string | the script to run. Please note that the maximum script size is 3 MB. |
type | string: USql | the job type of the current job (i.e. USql). |
Diagnostics
Error diagnostic information for failed jobs.
Name | Type | Description |
---|---|---|
columnNumber | integer (int32) | the column where the error occurred. |
end | integer (int32) | the ending index of the error. |
lineNumber | integer (int32) | the line number the error occurred on. |
message | string | the error message. |
severity | SeverityTypes | the severity of the error. |
start | integer (int32) | the starting index of the error. |
HiveJobProperties
Hive job properties used when retrieving Hive jobs.
Name | Type | Description |
---|---|---|
executedStatementCount | integer (int32) | the number of statements that have been run based on the script |
logsLocation | string | the Hive logs location |
outputLocation | string | the location of Hive job output files (both execution output and results) |
runtimeVersion | string | the runtime version of the Data Lake Analytics engine to use for the specific type of job being run. |
script | string | the script to run. Please note that the maximum script size is 3 MB. |
statementCount | integer (int32) | the number of statements that will be run based on the script |
type | string: Hive | the job type of the current job (i.e. Hive or USql). |
JobDataPath
A Data Lake Analytics job data path item.
Name | Type | Description |
---|---|---|
command | string | the command that this job data relates to. |
jobId | string (uuid) | the id of the job this data is for. |
paths | string[] | the list of paths to all of the job data. |
JobErrorDetails
The Data Lake Analytics job error details.
Name | Type | Description |
---|---|---|
description | string | the error message description |
details | string | the details of the error message. |
endOffset | integer (int32) | the end offset in the job where the error was found. |
errorId | string | the specific identifier for the type of error encountered in the job. |
filePath | string | the path to any supplemental error files, if any. |
helpLink | string | the link to MSDN or Azure help for this type of error, if any. |
innerError | JobInnerError | the inner error of this specific job error message, if any. |
internalDiagnostics | string | the internal diagnostic stack trace, returned only if the user requesting the job error details has sufficient permissions; otherwise it will be empty. |
lineNumber | integer (int32) | the specific line number in the job where the error occurred. |
message | string | the user friendly error message for the failure. |
resolution | string | the recommended resolution for the failure, if any. |
severity | SeverityTypes | the severity level of the failure. |
source | string | the ultimate source of the failure (usually either SYSTEM or USER). |
startOffset | integer (int32) | the start offset in the job where the error was found |
JobInformation
The extended Data Lake Analytics job information properties returned when retrieving a specific job.
Name | Type | Default value | Description |
---|---|---|---|
degreeOfParallelism | integer (int32) | 1 | the degree of parallelism used for this job. |
degreeOfParallelismPercent | number (double) | | the degree of parallelism in percentage used for this job. |
endTime | string (date-time) | | the completion time of the job. |
errorMessage | JobErrorDetails[] | | the error message details for the job, if the job failed. |
hierarchyQueueNode | string | | the name of hierarchy queue node this job is assigned to, null if job has not been assigned yet or the account doesn't have hierarchy queue. |
jobId | string (uuid) | | the job's unique identifier (a GUID). |
logFilePatterns | string[] | | the list of log file name patterns to find in the logFolder. '*' is the only matching character allowed. Example format: jobExecution*.log or *mylog*.txt |
logFolder | string | | the log folder path to use in the following format: adl://<accountName>.azuredatalakestore.net/system/jobservice/jobs/Usql/2016/03/13/17/18/5fe51957-93bc-4de0-8ddc-c5a4753b068b/logs/. |
name | string | | the friendly name of the job. |
priority | integer (int32) | | the priority value for the current job. Lower numbers have a higher priority. By default, a job has a priority of 1000. This must be greater than 0. |
properties | JobProperties: | | the job specific properties. |
related | JobRelationshipProperties | | the recurring job relationship information properties. |
result | JobResult | | the result of job execution or the current result of the running job. |
startTime | string (date-time) | | the start time of the job. |
state | JobState | | the job state. When the job is in the Ended state, refer to Result and ErrorMessage for details. |
stateAuditRecords | JobStateAuditRecord[] | | the job state audit records, indicating when various operations have been performed on this job. |
submitTime | string (date-time) | | the time the job was submitted to the service. |
submitter | string | | the user or account that submitted the job. |
type | JobType | | the job type of the current job (Hive or USql). |
JobInnerError
The Data Lake Analytics job error details.
Name | Type | Description |
---|---|---|
component | string | the component that failed. |
description | string | the error message description |
details | string | the details of the error message. |
diagnosticCode | integer (int32) | the diagnostic error code. |
errorId | string | the specific identifier for the type of error encountered in the job. |
helpLink | string | the link to MSDN or Azure help for this type of error, if any. |
innerError | JobInnerError | the inner error of this specific job error message, if any. |
internalDiagnostics | string | the internal diagnostic stack trace, returned only if the user requesting the job error details has sufficient permissions; otherwise it will be empty. |
message | string | the user friendly error message for the failure. |
resolution | string | the recommended resolution for the failure, if any. |
severity | SeverityTypes | the severity level of the failure. |
source | string | the ultimate source of the failure (usually either SYSTEM or USER). |
JobRelationshipProperties
Job relationship information properties including pipeline information, correlation information, etc.
Name | Type | Description |
---|---|---|
pipelineId | string (uuid) | the job relationship pipeline identifier (a GUID). |
pipelineName | string maxLength: 260 | the friendly name of the job relationship pipeline, which does not need to be unique. |
pipelineUri | string | the pipeline uri, unique, links to the originating service for this pipeline. |
recurrenceId | string (uuid) | the recurrence identifier (a GUID), unique per activity/script, regardless of iterations. This is something to link different occurrences of the same job together. |
recurrenceName | string maxLength: 260 | the recurrence name, user friendly name for the correlation between jobs. |
runId | string (uuid) | the run identifier (a GUID), unique identifier of the iteration of this pipeline. |
JobResource
The Data Lake Analytics job resources.
Name | Type | Description |
---|---|---|
name | string | the name of the resource. |
resourcePath | string | the path to the resource. |
type | JobResourceType | the job resource type. |
JobResourceType
the job resource type.
Value | Description |
---|---|
VertexResource | |
JobManagerResource | |
StatisticsResource | |
VertexResourceInUserFolder | |
JobManagerResourceInUserFolder | |
StatisticsResourceInUserFolder |
JobResult
the result of job execution or the current result of the running job.
Value | Description |
---|---|
None | |
Succeeded | |
Cancelled | |
Failed |
JobState
the job state. When the job is in the Ended state, refer to Result and ErrorMessage for details.
Value | Description |
---|---|
Accepted | |
Compiling | |
Ended | |
New | |
Queued | |
Running | |
Scheduling | |
Starting | |
Paused | |
WaitingForCapacity |
JobStateAuditRecord
The Data Lake Analytics job state audit records for tracking the lifecycle of a job.
Name | Type | Description |
---|---|---|
details | string | the details of the audit log. |
newState | string | the new state the job is in. |
requestedByUser | string | the user who requested the change. |
timeStamp | string (date-time) | the time stamp that the state change took place. |
JobStatistics
The Data Lake Analytics job execution statistics.
Name | Type | Description |
---|---|---|
finalizingTimeUtc | string (date-time) | the job finalizing start time. |
lastUpdateTimeUtc | string (date-time) | the last update time for the statistics. |
stages | JobStatisticsVertexStage[] | the list of stages for the job. |
JobStatisticsVertexStage
The Data Lake Analytics job statistics vertex stage information.
Name | Type | Description |
---|---|---|
dataRead | integer (int64) | the amount of data read, in bytes. |
dataReadCrossPod | integer (int64) | the amount of data read across multiple pods, in bytes. |
dataReadIntraPod | integer (int64) | the amount of data read in one pod, in bytes. |
dataToRead | integer (int64) | the amount of data remaining to be read, in bytes. |
dataWritten | integer (int64) | the amount of data written, in bytes. |
duplicateDiscardCount | integer (int32) | the number of duplicates that were discarded. |
failedCount | integer (int32) | the number of failures that occurred in this stage. |
maxVertexDataRead | integer (int64) | the maximum amount of data read in a single vertex, in bytes. |
minVertexDataRead | integer (int64) | the minimum amount of data read in a single vertex, in bytes. |
readFailureCount | integer (int32) | the number of read failures in this stage. |
revocationCount | integer (int32) | the number of vertices that were revoked during this stage. |
runningCount | integer (int32) | the number of currently running vertices in this stage. |
scheduledCount | integer (int32) | the number of currently scheduled vertices in this stage |
stageName | string | the name of this stage in job execution. |
succeededCount | integer (int32) | the number of vertices that succeeded in this stage. |
tempDataWritten | integer (int64) | the amount of temporary data written, in bytes. |
totalCount | integer (int32) | the total vertex count for this stage. |
totalFailedTime | string (duration) | the amount of time that failed vertices took up in this stage. |
totalProgress | integer (int32) | the current progress of this stage, as a percentage. |
totalSucceededTime | string (duration) | the amount of time all successful vertices took in this stage. |
JobType
the job type of the current job (Hive or USql).
Value | Description |
---|---|
USql | |
Hive |
SeverityTypes
the severity of the error.
Value | Description |
---|---|
Warning | |
Error | |
Info | |
SevereWarning | |
Deprecated | |
UserWarning |
USqlJobProperties
U-SQL job properties used when retrieving U-SQL jobs.
Name | Type | Description |
---|---|---|
algebraFilePath | string | the algebra file path after the job has completed |
compileMode | CompileMode | the specific compilation mode for the job used during execution. If this is not specified during submission, the server will determine the optimal compilation mode. |
debugData | JobDataPath | the job specific debug data locations. |
diagnostics | Diagnostics[] | the diagnostics for the job. |
resources | JobResource[] | the list of resources that are required by the job |
rootProcessNodeId | string | the ID used to identify the job manager coordinating job execution. This value should not be set by the user and will be ignored if it is. |
runtimeVersion | string | the runtime version of the Data Lake Analytics engine to use for the specific type of job being run. |
script | string | the script to run. Please note that the maximum script size is 3 MB. |
statistics | JobStatistics | the job specific statistics. |
totalCompilationTime | string (duration) | the total time this job spent compiling. This value should not be set by the user and will be ignored if it is. |
totalPauseTime | string (duration) | the total time this job spent paused. This value should not be set by the user and will be ignored if it is. |
totalQueuedTime | string (duration) | the total time this job spent queued. This value should not be set by the user and will be ignored if it is. |
totalRunningTime | string (duration) | the total time this job spent executing. This value should not be set by the user and will be ignored if it is. |
type | string: USql | the job type of the current job (i.e. Hive or USql). |
yarnApplicationId | string | the ID used to identify the yarn application executing the job. This value should not be set by the user and will be ignored if it is. |
yarnApplicationTimeStamp | integer (int64) | the timestamp (in ticks) for the yarn application executing the job. This value should not be set by the user and will be ignored if it is. |