Hello @Steve Homer ,
Currently, the product team is working on public documentation and a tutorial covering how to parameterize Spark jobs.
For now, you can use the job definition JSON file to parameterize the Spark job. Attached is a sample file:
{
    "targetBigDataPool": {
        "referenceName": "yifso-1019",
        "type": "SparkComputeReference"
    },
    "requiredSparkVersion": "2.4",
    "jobProperties": {
        "name": "job definition sample",
        "file": "wasbs://ContainerName@StorageName.blob.core.windows.net/SparkSubmission/artifact/default_artifact.jar",
        "className": "sample.LogQuery",
        "args": [],
        "jars": [],
        "pyFiles": [],
        "archives": [],
        "files": [],
        "conf": {
            "spark.hadoop.fs.azure.account.key.StorageName.blob.core.windows.net": "StorageAccessKey"
        },
        "numExecutors": 2,
        "executorCores": 4,
        "executorMemory": "14g",
        "driverCores": 4,
        "driverMemory": "14g"
    }
}
The job definition JSON can be modified, imported, and run directly.
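For example, the "args" array under "jobProperties" carries the command-line arguments passed to the job. A minimal sketch of generating a copy of the definition with different arguments (file names and argument values here are purely illustrative):

# Minimal sketch: load the sample job definition shown above and write out
# a copy with different arguments. File names and args are examples only.
import json

with open("job_definition.json") as f:
    definition = json.load(f)

# "args" holds the command-line arguments passed to the main class.
definition["jobProperties"]["args"] = ["2021-01-01", "--mode", "incremental"]

with open("job_definition_20210101.json", "w") as f:
    json.dump(definition, f, indent=2)

The modified file can then be imported into the workspace and run as described above.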
Hope this helps. Do let us know if you have any further queries.
Hello @Steve Homer ,
We are still waiting for a response from the product team and will let you know as soon as we hear back.
Stay Tuned!
In our case, we are trying to parameterize the arguments to the Spark job when it is executed from the REST API:
(sparkJobDefinitions/myjob001/execute)
Any help would be much appreciated. The only alternative seems to be creating many job definitions that are identical apart from their arguments.
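For illustration, a minimal sketch of the execute call referenced above, assuming a Synapse workspace dev endpoint, api-version, and bearer token (all placeholders); as described above, this call appears to run whatever arguments are already stored in the job definition rather than accepting them per request:

# Sketch only: trigger a run of an existing Spark job definition over REST.
# Endpoint, api-version, and token below are placeholder assumptions.
import requests

WORKSPACE = "https://<workspace-name>.dev.azuresynapse.net"  # placeholder
TOKEN = "<bearer token for the Synapse workspace>"           # placeholder

response = requests.post(
    f"{WORKSPACE}/sparkJobDefinitions/myjob001/execute",
    params={"api-version": "2020-12-01"},                    # assumed api-version
    headers={"Authorization": f"Bearer {TOKEN}"},
)
response.raise_for_status()
print(response.json())  # run details returned by the service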