Job - Create
Submits a job to the specified Data Lake Analytics account.
PUT https://{accountName}.{adlaJobDnsSuffix}/Jobs/{jobIdentity}?api-version=2016-11-01
URI Parameters
Name | In | Required | Type | Description |
---|---|---|---|---|
accountName | path | True | string | The Azure Data Lake Analytics account to execute job operations on. |
adlaJobDnsSuffix | path | True | string | Gets the DNS suffix used as the base for all Azure Data Lake Analytics Job service requests. |
jobIdentity | path | True | string (uuid) | Job identifier. Uniquely identifies the job across all jobs submitted to the service. |
api-version | query | True | string | Client Api Version. |
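As a rough illustration, the request URI can be assembled from these path and query parameters. The account name and DNS suffix below are placeholders (not values from this reference), and the job GUID is generated by the client.

```python
import uuid

# Placeholder values -- substitute your own account and job service DNS suffix.
account_name = "account123"
adla_job_dns_suffix = "azuredatalakeanalytics.net"  # assumed suffix, not confirmed by this page
job_identity = str(uuid.uuid4())                    # the client supplies the job GUID for this PUT
api_version = "2016-11-01"

url = (
    f"https://{account_name}.{adla_job_dns_suffix}"
    f"/Jobs/{job_identity}?api-version={api_version}"
)
print(url)
```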
Request Body
Name | Required | Type | Description |
---|---|---|---|
name | True | string | the friendly name of the job to submit. |
properties | True | CreateJobProperties: | the job specific properties. |
type | True | JobType | the job type of the current job (Hive or USql). |
degreeOfParallelism | | integer (int32) | the degree of parallelism used for this job. At most one of degreeOfParallelism and degreeOfParallelismPercent should be specified. If none, a default value of 1 will be used. |
degreeOfParallelismPercent | | number (double) | the degree of parallelism in percentage used for this job. At most one of degreeOfParallelism and degreeOfParallelismPercent should be specified. If none, a default value of 1 will be used for degreeOfParallelism. |
logFilePatterns | | string[] | the list of log file name patterns to find in the logFolder. '*' is the only matching character allowed. Example format: jobExecution*.log or *mylog*.txt |
priority | | integer (int32) | the priority value to use for the current job. Lower numbers have a higher priority. By default, a job has a priority of 1000. This must be greater than 0. |
related | | JobRelationshipProperties | the recurring job relationship information properties. |
Responses
Name | Type | Description |
---|---|---|
200 OK | JobInformation | Successfully submitted the job. |
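Below is a minimal, hedged sketch of issuing this PUT with Python's `requests` library and checking for the 200 response. The account name, DNS suffix, bearer token, and U-SQL script are placeholders; acquiring an Azure AD token is outside the scope of this reference.

```python
import uuid
import requests  # third-party HTTP client, assumed available

# Placeholder values -- not taken from this reference.
account_name = "account123"
adla_job_dns_suffix = "azuredatalakeanalytics.net"  # assumed suffix
access_token = "<Azure AD bearer token>"            # acquire via your own auth flow
job_id = str(uuid.uuid4())

body = {
    "name": "test_name",
    "type": "USql",
    "degreeOfParallelism": 1,
    "priority": 1000,
    "properties": {
        "type": "USql",
        "script": "DROP DATABASE IF EXISTS demo_db;",  # hypothetical script
    },
}

resp = requests.put(
    f"https://{account_name}.{adla_job_dns_suffix}/Jobs/{job_id}",
    params={"api-version": "2016-11-01"},
    headers={"Authorization": f"Bearer {access_token}"},
    json=body,
)

if resp.status_code == 200:
    job = resp.json()  # a JobInformation payload
    print(job["jobId"], job["state"])
else:
    print("Submission failed:", resp.status_code, resp.text)
```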
Examples
Submits a job to the specified Data Lake Analytics account
Sample request
PUT https://account123.contosopipelineservice.com/Jobs/076713da-9018-41ae-a3bd-9eab14e54d09?api-version=2016-11-01
{
  "type": "USql",
  "properties": {
    "runtimeVersion": "test_runtime_version",
    "script": "test_script",
    "type": "USql"
  },
  "name": "test_name",
  "degreeOfParallelism": 1,
  "priority": 1,
  "logFilePatterns": [
    "test_log_file_pattern_1",
    "test_log_file_pattern_2"
  ],
  "related": {
    "pipelineId": "076713da-9018-41ae-a3bd-9eab14e54d09",
    "pipelineName": "test_pipeline_name",
    "pipelineUri": "https://account123.contosopipelineservice.com/076713da-9018-41ae-a3bd-9eab14e54d09",
    "runId": "67034c12-b250-468e-992d-39fb978bde2c",
    "recurrenceId": "67034c12-b250-468e-992d-39fb978bde2d",
    "recurrenceName": "test_recurrence_name"
  }
}
Sample response
{
  "jobId": "076713da-9018-41ae-a3bd-9eab14e54d09",
  "name": "test_name",
  "type": "USql",
  "submitter": "test_submitter",
  "degreeOfParallelism": 1,
  "priority": 1,
  "submitTime": "2017-04-18T11:16:49.0748958-07:00",
  "startTime": "2017-04-18T11:16:49.0748958-07:00",
  "endTime": "2017-04-18T11:16:49.0748958-07:00",
  "state": "Accepted",
  "result": "Succeeded",
  "logFolder": "adl://contosoadla.azuredatalakestore.net/system/jobservice/jobs/Usql/2016/03/13/17/18/5fe51957-93bc-4de0-8ddc-c5a4753b068b/logs/",
  "logFilePatterns": [
    "test_log_file_pattern_1",
    "test_log_file_pattern_2"
  ],
  "related": {
    "pipelineId": "076713da-9018-41ae-a3bd-9eab14e54d09",
    "pipelineName": "test_pipeline_name",
    "pipelineUri": "https://account123.contosopipelineservice.com/076713da-9018-41ae-a3bd-9eab14e54d09",
    "runId": "67034c12-b250-468e-992d-39fb978bde2c",
    "recurrenceId": "67034c12-b250-468e-992d-39fb978bde2d",
    "recurrenceName": "test_recurrence_name"
  },
  "errorMessage": [
    {
      "description": "test_description",
      "details": "test_details",
      "endOffset": 1,
      "errorId": "test_error_id",
      "filePath": "adl://contosoadla.azuredatalakestore.net/system/jobservice/jobs/Usql/2016/03/13/17/18/5fe51957-93bc-4de0-8ddc-c5a4753b068b/test_file.txt",
      "helpLink": "https://azure.microsoft.com/en-us/blog/introducing-azure-data-lake/",
      "internalDiagnostics": "test_internal_diagnostics",
      "lineNumber": 1,
      "message": "test_message",
      "resolution": "test_resolution",
      "innerError": {
        "diagnosticCode": 1,
        "severity": "Warning",
        "details": "test_details",
        "component": "test_component",
        "errorId": "test_error_id",
        "helpLink": "https://azure.microsoft.com/en-us/blog/introducing-azure-data-lake/",
        "internalDiagnostics": "test_internal_diagnostics",
        "message": "test_message",
        "resolution": "test_resolution",
        "source": "SYSTEM",
        "description": "test_description"
      },
      "severity": "Warning",
      "source": "SYSTEM",
      "startOffset": 1
    }
  ],
  "stateAuditRecords": [
    {
      "newState": "test_new_state",
      "timeStamp": "2017-04-18T11:16:49.0748958-07:00",
      "requestedByUser": "test_requested_by_user",
      "details": "test_details"
    }
  ],
  "properties": {
    "runtimeVersion": "test_runtime_version",
    "script": "test_script",
    "type": "USql"
  }
}
Definitions
Name | Description |
---|---|
CompileMode | the specific compilation mode for the job used during execution. If this is not specified during submission, the server will determine the optimal compilation mode. |
CreateJobParameters | The parameters used to submit a new Data Lake Analytics job. |
CreateUSqlJobProperties | U-SQL job properties used when submitting U-SQL jobs. |
Diagnostics | Error diagnostic information for failed jobs. |
HiveJobProperties | Hive job properties used when retrieving Hive jobs. |
JobDataPath | A Data Lake Analytics job data path item. |
JobErrorDetails | The Data Lake Analytics job error details. |
JobInformation | The extended Data Lake Analytics job information properties returned when retrieving a specific job. |
JobInnerError | The Data Lake Analytics job inner error details. |
JobRelationshipProperties | Job relationship information properties including pipeline information, correlation information, etc. |
JobResource | The Data Lake Analytics job resources. |
JobResourceType | the job resource type. |
JobResult | the result of job execution or the current result of the running job. |
JobState | the job state. When the job is in the Ended state, refer to Result and ErrorMessage for details. |
JobStateAuditRecord | The Data Lake Analytics job state audit records for tracking the lifecycle of a job. |
JobStatistics | The Data Lake Analytics job execution statistics. |
JobStatisticsVertexStage | The Data Lake Analytics job statistics vertex stage information. |
JobType | the job type of the current job (Hive or USql). |
SeverityTypes | the severity of the error. |
USqlJobProperties | U-SQL job properties used when retrieving U-SQL jobs. |
CompileMode
the specific compilation mode for the job used during execution. If this is not specified during submission, the server will determine the optimal compilation mode.
Value | Description |
---|---|
Semantic | |
Full | |
SingleBox |
CreateJobParameters
The parameters used to submit a new Data Lake Analytics job.
Name | Type | Default value | Description |
---|---|---|---|
degreeOfParallelism | integer (int32) | 1 | the degree of parallelism used for this job. At most one of degreeOfParallelism and degreeOfParallelismPercent should be specified. If none, a default value of 1 will be used. |
degreeOfParallelismPercent | number (double) | | the degree of parallelism in percentage used for this job. At most one of degreeOfParallelism and degreeOfParallelismPercent should be specified. If none, a default value of 1 will be used for degreeOfParallelism. |
logFilePatterns | string[] | | the list of log file name patterns to find in the logFolder. '*' is the only matching character allowed. Example format: jobExecution*.log or *mylog*.txt |
name | string | | the friendly name of the job to submit. |
priority | integer (int32) | | the priority value to use for the current job. Lower numbers have a higher priority. By default, a job has a priority of 1000. This must be greater than 0. |
properties | CreateJobProperties: | | the job specific properties. |
related | JobRelationshipProperties | | the recurring job relationship information properties. |
type | JobType | | the job type of the current job (Hive or USql). |
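For orientation, a minimal CreateJobParameters payload sets only the required fields (name, type, and properties) and lets everything else fall back to its defaults. The sketch below is a hypothetical example expressed as a Python dict; the name and U-SQL script are placeholders.

```python
# Minimal CreateJobParameters payload: only required fields are set, so
# degreeOfParallelism defaults to 1 and priority defaults to 1000.
# If you do override parallelism, set at most one of
# degreeOfParallelism / degreeOfParallelismPercent.
minimal_job = {
    "name": "nightly_aggregation",  # hypothetical friendly name
    "type": "USql",
    "properties": {
        "type": "USql",
        "script": "DROP DATABASE IF EXISTS demo_db; CREATE DATABASE demo_db;",
    },
}
```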
CreateUSqlJobProperties
U-SQL job properties used when submitting U-SQL jobs.
Name | Type | Description |
---|---|---|
compileMode | CompileMode | the specific compilation mode for the job used during execution. If this is not specified during submission, the server will determine the optimal compilation mode. |
runtimeVersion | string | the runtime version of the Data Lake Analytics engine to use for the specific type of job being run. |
script | string | the script to run. Please note that the maximum script size is 3 MB. |
type | string: USql | the job type of the current job (i.e. USql). |
Diagnostics
Error diagnostic information for failed jobs.
Name | Type | Description |
---|---|---|
columnNumber | integer (int32) | the column where the error occurred. |
end | integer (int32) | the ending index of the error. |
lineNumber | integer (int32) | the line number the error occurred on. |
message | string | the error message. |
severity | SeverityTypes | the severity of the error. |
start | integer (int32) | the starting index of the error. |
HiveJobProperties
Hive job properties used when retrieving Hive jobs.
Name | Type | Description |
---|---|---|
executedStatementCount | integer (int32) | the number of statements that have been run based on the script |
logsLocation | string | the Hive logs location |
outputLocation | string | the location of Hive job output files (both execution output and results) |
runtimeVersion | string | the runtime version of the Data Lake Analytics engine to use for the specific type of job being run. |
script | string | the script to run. Please note that the maximum script size is 3 MB. |
statementCount | integer (int32) | the number of statements that will be run based on the script |
type | string: Hive | the job type of the current job (i.e. Hive or USql). |
JobDataPath
A Data Lake Analytics job data path item.
Name | Type | Description |
---|---|---|
command | string | the command that this job data relates to. |
jobId | string (uuid) | the id of the job this data is for. |
paths | string[] | the list of paths to all of the job data. |
JobErrorDetails
The Data Lake Analytics job error details.
Name | Type | Description |
---|---|---|
description | string | the error message description |
details | string | the details of the error message. |
endOffset | integer (int32) | the end offset in the job where the error was found. |
errorId | string | the specific identifier for the type of error encountered in the job. |
filePath | string | the path to any supplemental error files, if any. |
helpLink | string | the link to MSDN or Azure help for this type of error, if any. |
innerError | JobInnerError | the inner error of this specific job error message, if any. |
internalDiagnostics | string | the internal diagnostic stack trace, returned only if the user requesting the job error details has sufficient permissions; otherwise it will be empty. |
lineNumber | integer (int32) | the specific line number in the job where the error occurred. |
message | string | the user friendly error message for the failure. |
resolution | string | the recommended resolution for the failure, if any. |
severity | SeverityTypes | the severity level of the failure. |
source | string | the ultimate source of the failure (usually either SYSTEM or USER). |
startOffset | integer (int32) | the start offset in the job where the error was found |
JobInformation
The extended Data Lake Analytics job information properties returned when retrieving a specific job.
Name | Type | Default value | Description |
---|---|---|---|
degreeOfParallelism | integer (int32) | 1 | the degree of parallelism used for this job. |
degreeOfParallelismPercent | number (double) | | the degree of parallelism in percentage used for this job. |
endTime | string (date-time) | | the completion time of the job. |
errorMessage | JobErrorDetails[] | | the error message details for the job, if the job failed. |
hierarchyQueueNode | string | | the name of hierarchy queue node this job is assigned to, null if job has not been assigned yet or the account doesn't have hierarchy queue. |
jobId | string (uuid) | | the job's unique identifier (a GUID). |
logFilePatterns | string[] | | the list of log file name patterns to find in the logFolder. '*' is the only matching character allowed. Example format: jobExecution*.log or *mylog*.txt |
logFolder | string | | the log folder path to use in the following format: adl://<accountName>.azuredatalakestore.net/system/jobservice/jobs/Usql/2016/03/13/17/18/5fe51957-93bc-4de0-8ddc-c5a4753b068b/logs/. |
name | string | | the friendly name of the job. |
priority | integer (int32) | | the priority value for the current job. Lower numbers have a higher priority. By default, a job has a priority of 1000. This must be greater than 0. |
properties | JobProperties: | | the job specific properties. |
related | JobRelationshipProperties | | the recurring job relationship information properties. |
result | JobResult | | the result of job execution or the current result of the running job. |
startTime | string (date-time) | | the start time of the job. |
state | JobState | | the job state. When the job is in the Ended state, refer to Result and ErrorMessage for details. |
stateAuditRecords | JobStateAuditRecord[] | | the job state audit records, indicating when various operations have been performed on this job. |
submitTime | string (date-time) | | the time the job was submitted to the service. |
submitter | string | | the user or account that submitted the job. |
type | JobType | | the job type of the current job (Hive or USql). |
JobInnerError
The Data Lake Analytics job error details.
Name | Type | Description |
---|---|---|
component | string | the component that failed. |
description | string | the error message description |
details | string | the details of the error message. |
diagnosticCode | integer (int32) | the diagnostic error code. |
errorId | string | the specific identifier for the type of error encountered in the job. |
helpLink | string | the link to MSDN or Azure help for this type of error, if any. |
innerError | JobInnerError | the inner error of this specific job error message, if any. |
internalDiagnostics | string | the internal diagnostic stack trace, returned only if the user requesting the job error details has sufficient permissions; otherwise it will be empty. |
message | string | the user friendly error message for the failure. |
resolution | string | the recommended resolution for the failure, if any. |
severity | SeverityTypes | the severity level of the failure. |
source | string | the ultimate source of the failure (usually either SYSTEM or USER). |
JobRelationshipProperties
Job relationship information properties including pipeline information, correlation information, etc.
Name | Type | Description |
---|---|---|
pipelineId | string (uuid) | the job relationship pipeline identifier (a GUID). |
pipelineName | string maxLength: 260 | the friendly name of the job relationship pipeline, which does not need to be unique. |
pipelineUri | string | the pipeline uri, unique, links to the originating service for this pipeline. |
recurrenceId | string (uuid) | the recurrence identifier (a GUID), unique per activity/script, regardless of iterations. This is something to link different occurrences of the same job together. |
recurrenceName | string maxLength: 260 | the recurrence name, user friendly name for the correlation between jobs. |
runId | string (uuid) | the run identifier (a GUID), unique identifier of the iteration of this pipeline. |
JobResource
The Data Lake Analytics job resources.
Name | Type | Description |
---|---|---|
name | string | the name of the resource. |
resourcePath | string | the path to the resource. |
type | JobResourceType | the job resource type. |
JobResourceType
the job resource type.
Value | Description |
---|---|
VertexResource | |
JobManagerResource | |
StatisticsResource | |
VertexResourceInUserFolder | |
JobManagerResourceInUserFolder | |
StatisticsResourceInUserFolder |
JobResult
the result of job execution or the current result of the running job.
Value | Description |
---|---|
None | |
Succeeded | |
Cancelled | |
Failed |
JobState
the job state. When the job is in the Ended state, refer to Result and ErrorMessage for details.
Value | Description |
---|---|
Accepted | |
Compiling | |
Ended | |
New | |
Queued | |
Running | |
Scheduling | |
Starting | |
Paused | |
WaitingForCapacity |
JobStateAuditRecord
The Data Lake Analytics job state audit records for tracking the lifecycle of a job.
Name | Type | Description |
---|---|---|
details | string | the details of the audit log. |
newState | string | the new state the job is in. |
requestedByUser | string | the user who requested the change. |
timeStamp | string (date-time) | the time stamp that the state change took place. |
JobStatistics
The Data Lake Analytics job execution statistics.
Name | Type | Description |
---|---|---|
finalizingTimeUtc | string (date-time) | the job finalizing start time. |
lastUpdateTimeUtc | string (date-time) | the last update time for the statistics. |
stages | JobStatisticsVertexStage[] | the list of stages for the job. |
JobStatisticsVertexStage
The Data Lake Analytics job statistics vertex stage information.
Name | Type | Description |
---|---|---|
dataRead | integer (int64) | the amount of data read, in bytes. |
dataReadCrossPod | integer (int64) | the amount of data read across multiple pods, in bytes. |
dataReadIntraPod | integer (int64) | the amount of data read in one pod, in bytes. |
dataToRead | integer (int64) | the amount of data remaining to be read, in bytes. |
dataWritten | integer (int64) | the amount of data written, in bytes. |
duplicateDiscardCount | integer (int32) | the number of duplicates that were discarded. |
failedCount | integer (int32) | the number of failures that occurred in this stage. |
maxVertexDataRead | integer (int64) | the maximum amount of data read in a single vertex, in bytes. |
minVertexDataRead | integer (int64) | the minimum amount of data read in a single vertex, in bytes. |
readFailureCount | integer (int32) | the number of read failures in this stage. |
revocationCount | integer (int32) | the number of vertices that were revoked during this stage. |
runningCount | integer (int32) | the number of currently running vertices in this stage. |
scheduledCount | integer (int32) | the number of currently scheduled vertices in this stage |
stageName | string | the name of this stage in job execution. |
succeededCount | integer (int32) | the number of vertices that succeeded in this stage. |
tempDataWritten | integer (int64) | the amount of temporary data written, in bytes. |
totalCount | integer (int32) | the total vertex count for this stage. |
totalFailedTime | string (duration) | the amount of time that failed vertices took up in this stage. |
totalProgress | integer (int32) | the current progress of this stage, as a percentage. |
totalSucceededTime | string (duration) | the amount of time all successful vertices took in this stage. |
JobType
the job type of the current job (Hive or USql).
Value | Description |
---|---|
USql | |
Hive |
SeverityTypes
the severity of the error.
Value | Description |
---|---|
Warning | |
Error | |
Info | |
SevereWarning | |
Deprecated | |
UserWarning |
USqlJobProperties
U-SQL job properties used when retrieving U-SQL jobs.
Name | Type | Description |
---|---|---|
algebraFilePath | string | the algebra file path after the job has completed |
compileMode | CompileMode | the specific compilation mode for the job used during execution. If this is not specified during submission, the server will determine the optimal compilation mode. |
debugData | JobDataPath | the job specific debug data locations. |
diagnostics | Diagnostics[] | the diagnostics for the job. |
resources | JobResource[] | the list of resources that are required by the job |
rootProcessNodeId | string | the ID used to identify the job manager coordinating job execution. This value should not be set by the user and will be ignored if it is. |
runtimeVersion | string | the runtime version of the Data Lake Analytics engine to use for the specific type of job being run. |
script | string | the script to run. Please note that the maximum script size is 3 MB. |
statistics | JobStatistics | the job specific statistics. |
totalCompilationTime | string (duration) | the total time this job spent compiling. This value should not be set by the user and will be ignored if it is. |
totalPauseTime | string (duration) | the total time this job spent paused. This value should not be set by the user and will be ignored if it is. |
totalQueuedTime | string (duration) | the total time this job spent queued. This value should not be set by the user and will be ignored if it is. |
totalRunningTime | string (duration) | the total time this job spent executing. This value should not be set by the user and will be ignored if it is. |
type | string: USql | the job type of the current job (i.e. Hive or USql). |
yarnApplicationId | string | the ID used to identify the yarn application executing the job. This value should not be set by the user and will be ignored if it is. |
yarnApplicationTimeStamp | integer (int64) | the timestamp (in ticks) for the yarn application executing the job. This value should not be set by the user and will be ignored if it is. |