Is there a way to collect Synapse's Spark UI logs through an API?

Question

Is there a way to automate sending a GET request to obtain a bearer token for this API:

https://{synapse_workspace_name}.dev.azuresynapse.net/sparkhistory/api/v1/sparkpools/{spark_pool_name}/livyid/{livy_id}/applications/{application_id}/1/executors

This is a Spark API for Synapse that returns executor-level metrics for a Spark job in Synapse, for example as in the attached screenshot:
Screenshot 2024-07-05 131330

I am building a pipeline that extracts information such as {spark_pool_name}, {livy_id}, and {application_id} for each Spark job and then retrieves the metrics for each application ID.

Answer

Hello S,

Welcome to the Microsoft Q&A and thank you for posting your questions here.

Problem

I understand that you would like to collect Synapse's Spark UI logs through an API and automate the process of obtaining a bearer token for the Synapse Spark API.

Solution

To address this, you have a number of options.

Collecting Synapse's Spark UI logs through an API

You can do the following:

You can enable the Synapse Studio connector built into Log Analytics. This allows you to collect and send Apache Spark application metrics and logs to your Log Analytics workspace.
1. Create a Log Analytics workspace. You can do this via the Azure portal, Azure CLI, or PowerShell.
2. Prepare an Apache Spark configuration file with the necessary parameters:
```
spark.synapse.logAnalytics.enabled true
spark.synapse.logAnalytics.workspaceId <LOG_ANALYTICS_WORKSPACE_ID>
spark.synapse.logAnalytics.secret <LOG_ANALYTICS_WORKSPACE_KEY>
```
  Replace the `<...>` placeholders with your actual workspace ID and key.
3. Configure the workspace information in Synapse Studio.
4. Submit your Apache Spark application, and the logs and metrics will be sent to your Log Analytics workspace.
5. Visualize the metrics and logs using an Azure Monitor workbook. See: https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-azure-log-analytics

Alternatively, you can download the log of a completed application with a curl command. For the driver log, send a direct API request from bash as shown below:

curl "https://{workspace}.dev.azuresynapse.net/sparkhistory/api/v1/sparkpools/{sparkPool}/livyid/{livyId}/applications/{appId}/driverlog/stderr/?isDownload=true" -H "authorization:Bearer {AccessToken}"

You would have to replace `{workspace}`, `{sparkPool}`, `{livyId}`, `{appId}`, and `{AccessToken}` with your actual values. Also, remember to choose the approach that best fits your requirements and workflow. A similar answer on Q&A: https://learn.microsoft.com/en-us/answers/questions/253744/synapse-spark-logs
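If your pipeline is script-based rather than shell-based, the curl command above could be mirrored roughly as follows. This is only a sketch using the Python standard library; the function names `build_driver_log_request` and `download_driver_log` are hypothetical helpers, and the URL layout is copied from the curl example rather than from a separate API reference.

```python
import urllib.request
from urllib.parse import quote


def build_driver_log_request(workspace, spark_pool, livy_id, app_id, access_token):
    """Build the same GET request as the curl command (nothing is sent yet)."""
    url = (
        f"https://{quote(workspace, safe='')}.dev.azuresynapse.net/sparkhistory/api/v1"
        f"/sparkpools/{quote(spark_pool, safe='')}"
        f"/livyid/{quote(str(livy_id), safe='')}"
        f"/applications/{quote(app_id, safe='')}"
        "/driverlog/stderr/?isDownload=true"
    )
    # Same header that curl passes with -H "authorization:Bearer {AccessToken}".
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )


def download_driver_log(request, out_path):
    """Send the request and write the response body to a local file."""
    with urllib.request.urlopen(request) as resp, open(out_path, "wb") as out:
        out.write(resp.read())
```

You would call `build_driver_log_request(...)` with the values your pipeline already extracts and pass the result to `download_driver_log(request, "driver_stderr.log")`.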

Automate the process of obtaining a bearer token for the Synapse Spark API

You'll need to obtain a bearer token to authenticate your requests. The token is a lightweight security token that grants access to protected resources. You can use PowerShell to get the bearer token as shown below:
1. Azure Management Endpoint (Workspace): `$token = (Get-AzAccessToken -Resource "https://management.azure.com").Token`
2. Synapse DEV Endpoint (Workspace Resources): `$token = (Get-AzAccessToken -Resource "https://dev.azuresynapse.net").Token`

Make sure you're using the correct endpoint for your use case. Link: https://techcommunity.microsoft.com/t5/azure-synapse-analytics-blog/calling-synapse-rest-api-to-automate-tasks-using-powershell/ba-p/2202814
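Since you asked about automating this in a pipeline, one possible way to get the token without an interactive PowerShell session is the OAuth 2.0 client-credentials flow against Microsoft Entra ID. The sketch below assumes you have a service principal (the `tenant_id`, `client_id`, and `client_secret` parameters are placeholders for its values) and that it has been granted access to the Synapse workspace; neither assumption comes from your question. The scope corresponds to the Synapse DEV endpoint listed above.

```python
import json
import urllib.parse
import urllib.request

# Scope for the Synapse DEV endpoint (the resource URL above plus "/.default").
SYNAPSE_SCOPE = "https://dev.azuresynapse.net/.default"


def build_token_request(tenant_id, client_id, client_secret, scope=SYNAPSE_SCOPE):
    """Build the client-credentials token request (nothing is sent yet)."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": scope,
        }
    ).encode("utf-8")
    return urllib.request.Request(url, data=form, method="POST")


def fetch_access_token(tenant_id, client_id, client_secret):
    """Send the token request and return the bearer token string."""
    request = build_token_request(tenant_id, client_id, client_secret)
    with urllib.request.urlopen(request) as resp:
        payload = json.load(resp)
    return payload["access_token"]
```

In a real pipeline you would read the secret from a secure store such as a key vault rather than hard-coding it.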
Now that you have the bearer token, you can construct your GET request to the Spark API by replacing {synapse_workspace_name}, {spark_pool_name}, {livy_id}, and {application_id} with the actual values from your pipeline, as in the example API URL from your question:
```
   https://{synapse_workspace_name}.dev.azuresynapse.net/sparkhistory/api/v1/sparkpools/{spark_pool_name}/livyid/{livy_id}/applications/{application_id}/1/executors
```
Lastly, use the constructed URL to retrieve executor-level metrics for your Spark job in Synapse.
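To tie this back to your pipeline, here is a rough sketch of how the URL could be built and called for each job. The helper names and the shape of the `jobs` list (dictionaries keyed by `spark_pool_name`, `livy_id`, and `application_id`) are assumptions for illustration. The `1` segment before `/executors` is taken from your URL and is treated here as an attempt number, which is also an assumption.

```python
import json
import urllib.request


def executors_url(workspace, spark_pool, livy_id, app_id, attempt=1):
    """Fill the placeholders of the executors URL from the question."""
    return (
        f"https://{workspace}.dev.azuresynapse.net/sparkhistory/api/v1"
        f"/sparkpools/{spark_pool}/livyid/{livy_id}"
        f"/applications/{app_id}/{attempt}/executors"
    )


def get_executor_metrics(url, token):
    """Send the GET request with the bearer token and parse the JSON body."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


def collect_metrics(workspace, jobs, token, fetch=get_executor_metrics):
    """Return a dict mapping each application ID to its executor metrics."""
    results = {}
    for job in jobs:
        url = executors_url(
            workspace, job["spark_pool_name"], job["livy_id"], job["application_id"]
        )
        results[job["application_id"]] = fetch(url, token)
    return results
```

The `fetch` parameter is only there so the loop can be exercised with a stub instead of a live workspace. Keep in mind that bearer tokens expire, so a long-running pipeline would need to request a fresh one periodically.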


I hope this is helpful! Do not hesitate to let me know if you have any other questions.

**Please don't forget to close the thread by upvoting and accepting this as an answer if it is helpful** so that others in the community facing similar issues can easily find the solution.

Best Regards,

Sina Salam
