How to monitor Synapse Spark pools in real time

Jiacheng Zhang 20 Reputation points
2025-01-17T05:26:34.89+00:00

Hi Team,

We are wondering how to monitor Spark pool performance metrics in real time (e.g., CPU, memory, errors).

In the Apache Spark pool section, I found a chart that shows the number of Spark applications and the vCore and memory allocation.

User's image

But this chart seems to cover only resource allocation. For example, if a streaming job runs a Spark application on 3 small nodes, which is 12 vCores, the chart shows 13 vCores allocated and 96 GB allocated in total. That is total resource allocation, not actual memory consumption. How can we monitor real-time Spark job performance with our existing tools, such as Synapse, Spark code in a notebook, or Grafana?

Also, in the Monitor section of the Synapse workspace portal, I found more metrics, including a group called Streaming Job with metrics such as Resource % utilization and Runtime errors, which seem really useful. But I'm not sure whether these are meant for monitoring Spark jobs, since nothing is shown when I select them. Thanks so much!

User's image

User's image

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

1 answer

  1. Anonymous
    2025-01-17T05:51:18.9366667+00:00

    @Jiacheng Zhang

    Thanks for reaching out to Microsoft Q&A.

    To monitor Spark Pool performance metrics in real time within Azure Synapse, you can use several tools and methods:

    Synapse Studio:

    • Monitor Apache Spark Applications: In Synapse Studio, navigate to the Monitor section and select Apache Spark applications. Here, you can view the status, issues, and progress of your Spark applications. You can also access detailed logs and diagnostics.

    Screenshot of Apache Spark applications.

    • Monitor Apache Spark Pools: In the same Monitor section, select Apache Spark pools to see the status of your pools, including vCore usage and other metrics.


    For a detailed explanation, please refer to:

    https://learn.microsoft.com/en-us/azure/synapse-analytics/monitoring/apache-spark-applications

    https://learn.microsoft.com/en-us/azure/synapse-analytics/monitoring/how-to-monitor-spark-pools

    Azure Log Analytics:

    • Integration with Synapse: You can configure Azure Synapse to send telemetry data to Azure Log Analytics. This allows you to query Spark logs, set up alerts, and monitor performance metrics such as CPU, memory, and errors in near real time (Log Analytics ingestion adds a short delay).

    https://dustinvannoy.com/2022/05/12/monitor-synapse-spark-with-log-analytics/
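    Once the pool is sending data to Log Analytics, the metrics can be charted with a Kusto query. The sketch below only builds the query text. It assumes the metrics land in a custom table named SparkMetrics_CL with columns name_s, value_d, executorId_s and clusterName_s, and that a JVM memory metric ends in "jvm.total.used". Those names follow Microsoft's sample queries as best I recall them, so please check them against the schema in your own workspace. The function name and the pool name are made up for illustration.

```python
def build_executor_metric_query(metric_suffix, bin_size="30s", pool_name=None):
    """Build a Kusto query that charts one Spark metric per executor over time.

    Assumption: table and column names (SparkMetrics_CL, name_s, value_d,
    executorId_s, clusterName_s) match what your workspace actually contains.
    """
    lines = ["SparkMetrics_CL"]
    if pool_name:
        # Restrict the result to a single Spark pool.
        lines.append(f'| where clusterName_s == "{pool_name}"')
    lines.append(f'| where name_s endswith "{metric_suffix}"')
    # One data point per executor per time bin.
    lines.append(
        f"| summarize max(value_d) by bin(TimeGenerated, {bin_size}), executorId_s"
    )
    lines.append("| order by TimeGenerated asc")
    return "\n".join(lines)


# "examplepool" is a placeholder, not a real pool name.
query = build_executor_metric_query("jvm.total.used", pool_name="examplepool")
print(query)
```

    The printed text can be pasted into the Logs blade of the Log Analytics workspace, or used as the query behind a Grafana panel or an alert rule.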

    Grafana:

    • Using Azure Monitor: You can integrate Azure Monitor with Grafana to visualize real-time metrics. Azure Monitor collects data from your Synapse Spark jobs, which can then be displayed in Grafana dashboards.

    Notebook Spark Code:

    • Custom Metrics: You can write custom Spark code in your notebooks to collect and display performance metrics. This can include tracking memory usage, CPU utilization, and error rates within your Spark jobs.
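    As a rough sketch of the custom-metrics idea: Spark exposes a monitoring REST API on the driver, and its executors endpoint returns one record per executor with fields such as memoryUsed, maxMemory, activeTasks and failedTasks. The helper names below (summarize_executors, fetch_executors) are hypothetical, and the sample records are illustrative values, not measurements. Note that memoryUsed reports storage memory used for cached data, not total JVM heap or CPU. Whether the driver UI address is reachable from a Synapse notebook is an assumption you would need to verify.

```python
import json
from urllib.request import urlopen


def summarize_executors(executors):
    """Aggregate executor records from Spark's REST API into a few totals."""
    active = [e for e in executors if e.get("isActive", True)]
    memory_used = sum(e.get("memoryUsed", 0) for e in active)
    max_memory = sum(e.get("maxMemory", 0) for e in active)
    pct = round(100.0 * memory_used / max_memory, 1) if max_memory else 0.0
    return {
        "executors": len(active),
        "total_cores": sum(e.get("totalCores", 0) for e in active),
        "active_tasks": sum(e.get("activeTasks", 0) for e in active),
        # Failed tasks are counted across all executors, including dead ones.
        "failed_tasks": sum(e.get("failedTasks", 0) for e in executors),
        "storage_memory_used_pct": pct,
    }


def fetch_executors(ui_url, app_id):
    """Read executor records from the driver's monitoring REST API.

    Assumption: ui_url (for example spark.sparkContext.uiWebUrl) is
    reachable from where this code runs.
    """
    url = f"{ui_url}/api/v1/applications/{app_id}/executors"
    with urlopen(url) as resp:
        return json.load(resp)


# Illustrative records only, shaped like the REST API response.
sample = [
    {"id": "driver", "isActive": True, "memoryUsed": 100, "maxMemory": 1000,
     "totalCores": 0, "activeTasks": 0, "failedTasks": 0},
    {"id": "1", "isActive": True, "memoryUsed": 300, "maxMemory": 1000,
     "totalCores": 4, "activeTasks": 2, "failedTasks": 1},
    {"id": "2", "isActive": False, "memoryUsed": 0, "maxMemory": 1000,
     "totalCores": 4, "activeTasks": 0, "failedTasks": 3},
]
summary = summarize_executors(sample)
print(summary)
```

    In a notebook, something like summarize_executors(fetch_executors(spark.sparkContext.uiWebUrl, spark.sparkContext.applicationId)) could be called on a timer and the result logged or displayed, provided the UI address is reachable.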

    It seems like the Streaming Job metrics you found in the Synapse workspace portal might not be directly related to Spark jobs. Instead, focus on the Apache Spark applications and Apache Spark pools sections for more relevant metrics.

    Hope this helps. Please do let us know if you have any further queries.
