Hello @Steve Homer,
Currently, the product team is working on public documentation and a tutorial on how to parameterize Spark jobs.
For now, you can use the job definition JSON file to parameterize the Spark job. Attached is one sample file:
{
    "targetBigDataPool": {
        "referenceName": "yifso-1019",
        "type": "SparkComputeReference"
    },
    "requiredSparkVersion": "2.4",
    "jobProperties": {
        "name": "job definition sample",
        "file": "wasbs://ContainerName@StorageName.blob.core.windows.net/SparkSubmission/artifact/default_artifact.jar",
        "className": "sample.LogQuery",
        "args": [],
        "jars": [],
        "pyFiles": [],
        "archives": [],
        "files": [],
        "conf": {
            "spark.hadoop.fs.azure.account.key.StorageName.blob.core.windows.net": "StorageAccessKey"
        },
        "numExecutors": 2,
        "executorCores": 4,
        "executorMemory": "14g",
        "driverCores": 4,
        "driverMemory": "14g"
    }
}
The job definition JSON can be modified, imported, and run directly.
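If you want to parameterize the JSON programmatically (for example, to substitute the main file path, class name, arguments, or executor count per run), a small script can template the values before you import the file. Below is a minimal Python sketch; the file names (spark_job_template.json, spark_job.json) and the sample parameter values are illustrative assumptions only, not part of the product.

import json

# Assumed file names for illustration: a template JSON like the sample above,
# and the parameterized copy that will be imported into the workspace.
TEMPLATE_PATH = "spark_job_template.json"
OUTPUT_PATH = "spark_job.json"

def parameterize_job(main_file, class_name, args, executors):
    """Load the job definition template, substitute per-run values, and write a new file."""
    with open(TEMPLATE_PATH) as f:
        job = json.load(f)

    props = job["jobProperties"]
    props["file"] = main_file          # main definition file (the .jar/.py in WASB/ABFS)
    props["className"] = class_name    # main class to run
    props["args"] = args               # command-line arguments passed to the job
    props["numExecutors"] = executors  # scale the executor count per run

    with open(OUTPUT_PATH, "w") as f:
        json.dump(job, f, indent=4)

if __name__ == "__main__":
    # Hypothetical values for illustration only.
    parameterize_job(
        main_file="wasbs://ContainerName@StorageName.blob.core.windows.net/SparkSubmission/artifact/default_artifact.jar",
        class_name="sample.LogQuery",
        args=["--date", "2020-10-19"],
        executors=4,
    )

The generated spark_job.json can then be imported and run as described above.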
Hope this helps. Do let us know if you have any further queries.
------------
- Please accept an answer if correct. Original posters help the community find answers faster by identifying the correct answer. Here is how.
- Want a reminder to come back and check responses? Here is how to subscribe to a notification.