I've created an Azure Synapse workspace (and related resources such as ADLS Gen2 storage and an Apache Spark pool). I uploaded the Spark Pi example JAR to the linked ADLS Gen2 storage and created a Spark job definition to run that same example. However, I am seeing an error when submitting the Spark job.
On the Spark Job Definition page within Azure Synapse Studio, I am seeing these messages:
4:02:37 PM Submit Apache Spark job start.
Submitting job "Spark job definition 1"...
4:09:00 PM Failed to submit the Spark job
Spark monitoring URL: ...
On the Spark Application monitor page, these are the logs from Livy:
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
21/04/14 21:04:07 INFO ShutdownHookManager: Shutdown hook called
21/04/14 21:04:07 INFO ShutdownHookManager: Deleting directory /tmp/spark-9d17628a-ed29-44a1-b7f2-0e895c53c519
21/04/14 21:04:07 INFO MetricsSystemImpl: Stopping azure-file-system metrics system...
21/04/14 21:04:07 INFO MetricsSystemImpl: azure-file-system metrics system stopped.
21/04/14 21:04:07 INFO MetricsSystemImpl: azure-file-system metrics system shutdown complete.
stderr:
YARN Diagnostics:
No YARN application is found with tag livy-batch-0-bpyt81ry in 300 seconds. This may be because 1) spark-submit fail to submit application to YARN; or 2) YARN cluster doesn't have enough resources to start the application in time. Please check Livy log and YARN log to know the details.
The Spark Application monitor page also says:
This application failed due to the total number of errors: 1. View error details
The error details page contains this error:
{"StatusCode":500,"Message":"Authorization property is not specified in http request header, and/or incoming traffic is not from private link.","ExceptionDetail":"System.Exception: Authorization property is not specified in http request header, and/or incoming traffic is not from private link.\r\n at Microsoft.Analytics.Clusters.Common.Web.PubSubAuthorizationMiddleWare.AuthorizeAsync(HttpContext context) in C:\\source\\Shared\\Web\\PubSubAuthorizationMiddleWare.cs:line 175\r\n at Microsoft.Analytics.Clusters.Common.Web.PubSubAuthorizationMiddleWare.InvokeAsync(HttpContext context) in C:\\source\\Shared\\Web\\PubSubAuthorizationMiddleWare.cs:line 73\r\n at Microsoft.Analytics.Clusters.Common.Web.ExceptionMiddleware.InvokeAsync(HttpContext httpContext) in C:\\source\\Shared\\Web\\ExceptionMiddleware.cs:line 54","ErrorType":"None","ErrorNumber":0,"ErrorOn":"2021-04-14T21:09:48.9457637+00:00"}
The job definition is very simple and I believe I've set up all the access control correctly.
Could you please explain to me why I am getting this error?
Just one thing I would like to mention here, though: I had the overall Contributor role (inherited from the subscription), which let me upload and delete files (all operations) on the linked ADLS Gen2 storage. So this is rather misleading. The error message could also have stated the problem more clearly.
We've been working with the product team on security and RBAC roles; it's a common area of confusion. I'll include this error in the areas where we can improve.
I had the same issue.
After adding the "Storage Blob Data Contributor" role, the issue was resolved and the job succeeded.
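For anyone who would rather script the role assignment than click through the portal, here is a minimal sketch in Python that only builds the `az role assignment create` command for the "Storage Blob Data Contributor" role, scoped to the storage account. The subscription ID, resource group, storage account name and assignee below are placeholders I made up for illustration, and the thread doesn't say which identity needs the role, so treat the assignee as an assumption to check against your own setup. The helper function names are hypothetical too.

```python
import shlex

# Role named in this thread as the fix.
ROLE = "Storage Blob Data Contributor"


def storage_account_scope(subscription_id, resource_group, account_name):
    """Build the ARM resource ID of a storage account, used as the role scope."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account_name}"
    )


def role_assignment_command(assignee, scope, role=ROLE):
    """Return the Azure CLI argument list; this does not execute anything."""
    return [
        "az", "role", "assignment", "create",
        "--assignee", assignee,
        "--role", role,
        "--scope", scope,
    ]


if __name__ == "__main__":
    # Placeholder values (assumptions); replace them with your own.
    scope = storage_account_scope(
        "<subscription-id>", "<resource-group>", "<storage-account>"
    )
    command = role_assignment_command("<user-or-principal-id>", scope)
    # Print a copy-pastable shell command instead of running it.
    print(shlex.join(command))
```

Printing the command rather than running it lets you review the scope and assignee before anything changes. Role assignments can also take a few minutes to take effect, so it may be worth waiting a little before resubmitting the job.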