Synapse Serverless CETAS fails with error "Fatal exception occurred: bad allocation".
Hello,
I am trying to create an external table (CETAS) from a large number of fairly small JSON files, so that they can be queried more efficiently. The JSON files are stored on ADLS. Previously this worked fine when I let the query run for 1-1.5 hours, but now I am repeatedly getting the error message
Request to perform an external distributed computation has failed with error "Fatal exception occurred: bad allocation"
Does anyone have experience with this 'bad allocation' error? It sounds like a memory issue to me, but I am unsure if there's something I can do about it.
Any help would be much appreciated!
Azure Data Lake Storage
Azure Synapse Analytics
Transact-SQL
-
PRADEEPCHEEKATLA 90,226 Reputation points
2024-04-09T05:07:13.1966667+00:00 @Finn Schmidt - Thanks for the question and for using the MS Q&A platform.
It seems like you are encountering an error while creating an external table (CETAS) from a large number of small JSON files. The error message "Fatal exception occurred: bad allocation" indicates that there might be a memory issue.
One possible reason for this error could be that the amount of data you are trying to query is too large for the memory available. You can try to optimize your query by filtering out unnecessary data or by breaking down the query into smaller chunks.
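One way to apply the "smaller chunks" suggestion is to generate a separate, smaller CETAS statement per source folder instead of one statement over every file. A minimal sketch in Python that builds such statements; the data source, file format, folder layout, table names, and columns are all hypothetical placeholders, not taken from the original question:

```python
# Sketch only: produce one smaller CETAS statement per source folder, so a
# single huge CETAS over all JSON files can be replaced by several smaller
# ones. Every name below (data source, file format, paths, columns) is a
# hypothetical placeholder.

CETAS_TEMPLATE = """\
CREATE EXTERNAL TABLE [dbo].[events_{partition}]
WITH (
    LOCATION = 'curated/events/{partition}/',
    DATA_SOURCE = [my_adls],           -- hypothetical external data source
    FILE_FORMAT = [ParquetFormat]      -- hypothetical parquet file format
)
AS
SELECT j.id, j.payload
FROM OPENROWSET(
        BULK 'raw/events/{partition}/*.json',
        DATA_SOURCE = 'my_adls',
        FORMAT = 'CSV',                -- documented trick for reading raw JSON
        FIELDTERMINATOR = '0x0b',
        FIELDQUOTE = '0x0b'
     ) WITH (doc NVARCHAR(MAX)) AS r
CROSS APPLY OPENJSON(r.doc)
     WITH (id INT '$.id', payload NVARCHAR(4000) '$.payload') AS j;
"""

def build_partitioned_cetas(partitions):
    """Return one CETAS statement per partition folder name."""
    return [CETAS_TEMPLATE.format(partition=p) for p in partitions]
```

Each generated statement would then be submitted to the serverless pool separately (for example from a pipeline or a client library). Since each CETAS now reads fewer files, the per-query memory pressure should be lower, though whether that avoids this particular error depends on the workload.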
Another possible reason could be that the memory allocated to your Synapse workspace is not sufficient. You can try increasing the memory allocated to your workspace and see if that resolves the issue.
You can also try to monitor the memory usage of your workspace while running the query to identify any potential memory leaks or bottlenecks.
For more details, refer to Monitor resource utilization and query activity in Azure Synapse Analytics.
If none of these solutions work, you can try opening a support ticket for further assistance.
Hope this helps. Do let us know if you have any further queries.
-
Finn Schmidt 86 Reputation points
2024-04-09T07:39:58.25+00:00 Hello @PRADEEPCHEEKATLA-MSFT
Thank you for your response.
It seems to me like you are suggesting methods that relate to the dedicated Synapse SQL pool; I am running the CETAS on the serverless pool, though. Therefore I'd expect the capacity to autoscale to handle the request (since there is no dedicated capacity that I can scale up or down myself).
After several more attempts I have gotten the CETAS to run successfully. What I find strange is that there is a large variation in performance. While it initially took 1-1.5 hours to create the external tables, it only took about 20 minutes on the last attempts (and of course many of the attempts in between didn't complete at all due to the error described above).
Is it possible that there is some issue in the autoscaling of the serverless pool, so that it doesn't always provision sufficient resources to handle the cetas request? How would other concurrent queries running on the same synapse workspace affect this?
-
PRADEEPCHEEKATLA 90,226 Reputation points
2024-04-10T04:04:17.8933333+00:00 @Finn Schmidt - I apologize for the confusion in my previous response. You are correct that the suggestions I provided were related to the dedicated Synapse SQL pool, and not the serverless pool.
Regarding your question about the serverless pool, you are correct that the capacity should autoscale to handle the request. However, it is possible that there was a temporary issue with the autoscaling or resource provisioning that caused the error you encountered.
As for the variation in performance, it is possible that the serverless pool was able to provision more resources during the successful attempts, which allowed the query to complete faster. Other concurrent queries running on the same Synapse workspace can affect the performance of your query, as they may be competing for resources.
To optimize the performance of your query, you can try running it during off-peak hours when there are fewer concurrent queries running. You can also try optimizing your query by filtering out unnecessary data or breaking it down into smaller chunks.
If you continue to experience issues with the serverless pool, you can open a support ticket for further assistance.
Hope this helps. Do let us know if you have any further queries.
-
Finn Schmidt 86 Reputation points
2024-04-10T10:00:09.0966667+00:00 Thank you for confirming.
I suppose my only real option then is to implement a more robust retry strategy for when the serverless pool is unable to provide sufficient resources to handle the request. -
PRADEEPCHEEKATLA 90,226 Reputation points
2024-04-12T06:19:50.1733333+00:00 @Finn Schmidt - Yes, implementing a more robust retry strategy could be a good option to handle situations where the serverless pool is unable to provide sufficient resources to handle the request.
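A retry strategy along those lines can be sketched in Python. Here `submit` is a hypothetical zero-argument callable that actually runs the CETAS (for example via pyodbc or a pipeline activity), and matching on the error text is a simple heuristic, not an official classification of transient errors:

```python
import random
import time

def run_with_retries(submit, max_attempts=5, base_delay=2.0, jitter=1.0,
                     transient_markers=("bad allocation",)):
    """Run `submit` (a zero-argument callable that executes the CETAS),
    retrying with exponential backoff when it fails with an error whose
    message looks transient."""
    for attempt in range(1, max_attempts + 1):
        try:
            return submit()
        except Exception as exc:
            transient = any(m in str(exc).lower() for m in transient_markers)
            if not transient or attempt == max_attempts:
                raise  # permanent error, or out of attempts
            # Exponential backoff with a little jitter to avoid retrying
            # at the exact moment other workloads also retry.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, jitter)
            time.sleep(delay)
```

The backoff gives the serverless pool time to provision resources between attempts; the jitter and the cap on attempts keep failed requests from hammering the workspace while other concurrent queries are running.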
You can also consider using the Synapse dedicated SQL pool instead of the serverless pool if you require more consistent performance and greater control over resource allocation. The dedicated SQL pool allows you to scale up or down the resources allocated to your workspace based on your specific needs.
However, if you prefer to use the serverless pool, you can try to optimize your query and retry the request if it fails due to insufficient resources. You can also monitor the resource usage of your workspace to identify any potential bottlenecks or resource constraints that could be causing the performance issues.
Hope this helps. Do let us know if you have any further queries.
-
Finn Schmidt 86 Reputation points
2024-04-12T06:56:49.88+00:00 "You can also monitor the resource usage of your workspace to identify any potential bottlenecks or resource constraints that could be causing the performance issues."
Where/how can I monitor these usage metrics? As far as I know, the serverless pool only exposes the bytes-processed metric for billing purposes, and that takes a while to update, so it cannot be used to identify bottlenecks effectively.
-
PRADEEPCHEEKATLA 90,226 Reputation points
2024-04-12T07:00:28.3033333+00:00 @Finn Schmidt - You are correct that the serverless pool only exposes the bytes processed metric for billing purposes, which may not be sufficient for identifying bottlenecks or resource constraints.
However, you can use the Azure Synapse Analytics Studio to monitor the resource usage of your workspace. The Studio provides a dashboard that displays various metrics related to the performance and resource usage of your workspace, including CPU usage, memory usage, and query performance.
To access the dashboard, you can navigate to the "Monitor" tab in the Synapse Studio and select the "Workspace" option. From there, you can view various charts and graphs that display the resource usage and performance metrics for your workspace.
You can also use the Azure Monitor service to monitor the resource usage of your Synapse workspace. Azure Monitor provides various metrics and logs related to the performance and resource usage of your workspace, which can be used to identify bottlenecks or resource constraints.
To set up Azure Monitor for your Synapse workspace, you can follow the instructions provided in the Azure Synapse Analytics documentation. Once set up, you can view the metrics and logs in the Azure Monitor dashboard or export them to other monitoring tools for further analysis.
Hope this helps. Do let us know if you have any further queries.
-
PRADEEPCHEEKATLA 90,226 Reputation points
2024-04-15T05:22:20.6966667+00:00 @Finn Schmidt - We haven't heard from you on the last response, and I was just checking back to see if you have a resolution yet. If you have found a resolution, please do share it with the community, as it can be helpful to others. Otherwise, reply and we will respond with more details and try to help.
-
Finn Schmidt 86 Reputation points
2024-04-15T10:07:27.65+00:00 @PRADEEPCHEEKATLA
As stated before, the only solution I can see is to implement more retry strategies, and manually intervene when needed. I don't really consider this a solution to the problem though, more of a workaround. Regarding the metrics you mentioned: most of those (CPU usage, memory usage, ...) are once again related to the dedicated Synapse SQL pool. -
PRADEEPCHEEKATLA 90,226 Reputation points
2024-04-16T07:43:20.5333333+00:00 @Finn Schmidt - I apologize for the confusion. You are correct that the metrics I mentioned are related to the dedicated Synapse SQL pool and may not be applicable to your scenario with the serverless pool.
In the case of the serverless pool, you can try optimizing your query by filtering out unnecessary data or by breaking down the query into smaller chunks. This can help reduce the memory usage and prevent the "bad allocation" error.
If these solutions do not work, you can try opening a support ticket for further assistance. The support team may be able to provide more specific guidance for your scenario.