Synapse SQL serverless pool flooded with failing queries
We have a Synapse Analytics environment running only a serverless SQL pool, and it appears to have suddenly become stuck with the same few queries running in a loop. Each query floods the pool with 5-6 requests per second and cannot be killed or stopped reliably.
We have turned off all pipelines and all triggers for every copy task, but we are still seeing the same failed queries recurring on this loop. We have been unable to find a way to disable/pause the serverless database or to turn off the Synapse resource. The large number of failed transient queries is causing every query that hits the SQL serverless pool to fail.
Is it possible to stop the running queries, pause the serverless SQL pool in Synapse, or pause Synapse itself?
How do we break this unwanted loop of executions?
The looping query in question is structured as follows, if it helps:
IF NOT EXISTS (SELECT * FROM sys.external_tables WHERE object_id = OBJECT_ID('[dbo].[my table]'))
BEGIN
    CREATE EXTERNAL TABLE [dbo].[my table] AS SCHEMA_INFERRED_TABLE
    WITH (
        LOCATION = 'my folder',
        DATA_SOURCE = [my datasource_ID],
        FILE_FORMAT = [DELTA]
    )
END
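(For anyone diagnosing a similar loop: a minimal sketch of how to see what is currently executing against the endpoint, assuming the standard `sys.dm_exec_requests` / `sys.dm_exec_sessions` DMVs are exposed on your serverless SQL endpoint; the column list is illustrative.)

```sql
-- List currently executing requests, joined to their sessions so each
-- one can later be killed by session_id. Run from a separate connection.
SELECT s.session_id,
       r.status,
       r.start_time,
       r.command,
       s.login_name
FROM sys.dm_exec_requests AS r
JOIN sys.dm_exec_sessions AS s
    ON r.session_id = s.session_id
WHERE s.session_id <> @@SPID;  -- exclude this diagnostic session itself
```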
Our error message is:
Query failed because of compute container failures or other transient errors, and retry exhausted. This can be an intermittent issue. Please retry later.
Azure Synapse Analytics
-
Amira Bedhiafi 27,531 Reputation points
2023-09-26T08:54:14.24+00:00 Can you try to pause the SQL pools? If you can stop the query submissions, you should effectively "pause" any costs/usage.
As a temporary measure to block any external system from sending requests, you might consider adjusting the firewall settings to restrict all access (make sure you don't lock yourself out, and note that this may affect other systems or users that require access).
While the IF NOT EXISTS check is a standard way to ensure idempotency, repeatedly running such a command could be problematic depending on the scale and frequency. Make sure that sys.external_tables isn't being overwhelmed and that the data source isn't too large. Don't forget, it's always a good idea to get Azure Support involved.
-
Russell Bowden 15 Reputation points
2023-09-26T20:47:44.9066667+00:00 Unfortunately, from what I have found, it isn't possible to pause SQL serverless pools in any way. Running a command to kill queries at scale isn't practical; it just helped us diagnose the problem and the extent of the looping issue.
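(For reference, the "kill queries at scale" approach mentioned above can be sketched roughly as follows. This assumes `sys.dm_exec_sessions` is available on the serverless endpoint; the query only generates KILL statements for manual review, it does not execute them, which is part of why this can't keep up with 5-6 new requests per second.)

```sql
-- Generate a KILL statement for every other active session.
-- The output is a script to review and run manually.
SELECT 'KILL ' + CAST(session_id AS varchar(10)) + ';'
FROM sys.dm_exec_sessions
WHERE status = 'running'
  AND session_id <> @@SPID;  -- never kill your own session
```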
-
Bhargava-MSFT 31,201 Reputation points • Microsoft Employee
2023-09-26T22:43:45.5666667+00:00 Hello Russell Bowden,
Welcome to the Microsoft Q&A forum.
The issue seems to be backend compute container failures or other transient errors.
Sometimes the issue can be intermittent. If you are still facing it, can you please log a support ticket so you can work with support engineers to fix the issue?
If you don't have a support plan, please let me know so that I can enable a one-time free support request to work on this.
I am looking forward to hearing from you.
-
Russell Bowden 15 Reputation points
2023-09-27T02:58:24.04+00:00 Thank you, a ticket has been raised and the team is looking into it.
-
Bhargava-MSFT 31,201 Reputation points • Microsoft Employee
2023-09-27T03:29:13.2+00:00 thank you, Russell Bowden
-
wout 1 Reputation point
2023-09-28T08:45:16.3+00:00 We have the same issue going on.
-
Bhargava-MSFT 31,201 Reputation points • Microsoft Employee
2023-09-28T21:52:57.0533333+00:00 Hello wout and Russell Bowden,
The product team has acknowledged the issue; multiple customers are impacted at this time.
They are working on a fix. I will update this thread as soon as the issue is resolved.
Thank you for your patience.
-
Bhargava-MSFT 31,201 Reputation points • Microsoft Employee
2023-10-02T16:04:40.31+00:00 Hello wout and Russell Bowden,
The product team resolved the issue. Please let us know if you still see any issues further.
-
Antônio Farias 40 Reputation points
2023-10-03T20:33:47.6833333+00:00 I am having this error. Has the issue been fixed?
failed:Query failed because of compute container failures or other transient errors, and retry exhausted. This can be an intermittent issue. Please retry later.
-
Antônio Farias 40 Reputation points
2023-10-03T21:01:09.4333333+00:00 After 40 minutes, it started working again on its own.
-
Russell Bowden 15 Reputation points
2023-10-03T21:09:54.4833333+00:00 Currently still experiencing the same issue. I will do some investigation of the pipelines on my side as well, but so far no change.
-
Russell Bowden 15 Reputation points
2023-10-03T21:12:40.6633333+00:00 10:08:00 am
Started executing query at Line 1
Query [Stmt:{99F671B1-09D5-4527-ACCD-A7A8C41949C1}][DQHash:0x55F8D54CE6F69649][Sch:5f1c5431-5f9a-4b2d-8f65-6272fc9a96e8]_[Query:{DB7C9B23-A91C-4660-9B45-E638A458C8AE}] failed:Query failed because of compute container failures or other transient errors, and retry exhausted. This can be an intermittent issue. Please retry later.
Total execution time: 00:00:02.607
-
Bhargava-MSFT 31,201 Reputation points • Microsoft Employee
2023-10-03T21:47:30.8166667+00:00 Thank you Russell Bowden and Antônio Farias.
Let me reach out to PG and get back to you with an update.
-
Bhargava-MSFT 31,201 Reputation points • Microsoft Employee
2023-10-03T22:11:02.8633333+00:00 <update>
Hello Russell Bowden and Antônio Farias,
The issue is due to backend compute container failures. The PG advised logging a support case with the serverless SQL endpoint details. An on-call engineer can help unblock the issue from the backend.
If you don't have a support plan, please let me know so I can enable a one-time free support request to work on this issue.
-
Russell Bowden 15 Reputation points
2023-10-03T23:22:46.13+00:00 I will refer back to the original SR that was created above when this incident was first reported. Thank you
-
Meghan L. Calvo 0 Reputation points
2023-10-04T14:51:23.1533333+00:00 Hello @Bhargava-MSFT
We too are seeing this issue as of 7am ET today, but do not have a support plan. How can we get this unblocked?
Regards,
Meghan
-
Bhargava-MSFT 31,201 Reputation points • Microsoft Employee
2023-10-04T16:49:38.61+00:00 Hello Meghan L. Calvo
Please send an email to AzCommunity@microsoft.com with the below details so that we can enable one-time-free support for you
Email subject: <Attn - Bhargava : Microsoft Q&A Thread title>
Thread URL: <Microsoft Q&A Thread>
Subscription ID: <your subscription id>
Looking forward to your reply.
Regards,
BhargavaGunnam-MSFT
-
Meghan L. Calvo 0 Reputation points
2023-10-04T17:15:50.7866667+00:00 Thank you, @BhargavaGunnam-MSFT
When I went to pull the subscription ID, I saw that it suddenly started working again as of 12:42pm ET. I will reach out to the email provided if we see this issue crop up again in the near future.
-
Bhargava-MSFT 31,201 Reputation points • Microsoft Employee
2023-10-04T17:19:00.0766667+00:00 Sure, Meghan L. Calvo
If you still see the issue, don't hesitate to contact us directly via the email in the earlier comment.
Have a great day!
-
Antônio Farias 40 Reputation points
2023-10-04T20:08:23.8266667+00:00 I am still having this issue; I'll send the email to azcommunity.
-
Bhargava-MSFT 31,201 Reputation points • Microsoft Employee
2023-10-04T20:32:41.7866667+00:00 Sure, Antônio Farias
Thank you.
-
Gabe 1 Reputation point
2023-10-11T12:19:27.4533333+00:00 Team, I am having the same issue here. In my case I created a new integration runtime using Time To Live, and even though it did reduce my run time from 4 hours to 2 hours, I am seeing a lot of failures.
pl_Silver_Views { "errorCode": "BadRequest", "message": "Operation on target Silver Views DimDate failed: Operation on target Create View DimDate failed: Duplicate column ordinal cannot be provided in WITH schema clause.", "failureType": "UserError", "target": "pl_Silver_Views", "details": "" }
{ "errorCode": "BadRequest", "message": "Operation on target InvMovements failed: Operation on target GetOldWatermark failed: Failure happened on 'Source' side. ErrorCode=SqlOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A database operation failed with the following error: 'Query [Stmt:{D6C26B5F-40A2-47AC-BD00-EF1949CE0F07}]_[DQHash:0xE0D603462A093F5B]_[Sch:4b43f990-a50e-4f75-8848-38ec979fbcae]_[Query:{D787942E-C61C-45F1-A2D2-45BED3937082}] failed:Query failed because of compute container failures or other transient errors, and retry exhausted. This can be an intermittent issue. Please retry later.\r\nStatement ID: {D6C26B5F-40A2-47AC-BD00-EF1949CE0F07} | Query hash: 0xE0D603462A093F5B | Distributed request ID: {D787942E-C61C-45F1-A2D2-45BED3937082}. Total size of data scanned is 0 megabytes, total size of data moved is 0 megabytes, total size of data written is 0 megabytes.',Source=,''Type=System.Data.SqlClient.SqlException,Message=Query [Stmt:{D6C26B5F-40A2-47AC-BD00-EF1949CE0F07}]_[DQHash:0xE0D603462A093F5B]_[Sch:4b43f990-a50e-4f75-8848-38ec979fbcae]_[Query:{D787942E-C61C-45F1-A2D2-45BED3937082}] failed:Query failed because of compute container failures or other transient errors, and retry exhausted. This can be an intermittent issue. Please retry later.\r\nStatement ID: {D6C26B5F-40A2-47AC-BD00-EF1949CE0F07} | Query hash: 0xE0D603462A093F5B | Distributed request ID: {D787942E-C61C-45F1-A2D2-45BED3937082}. 
Total size of data scanned is 0 megabytes, total size of data moved is 0 megabytes, total size of data written is 0 megabytes.,Source=.Net SqlClient Data Provider,SqlErrorNumber=70004,Class=17,ErrorCode=-2146232060,State=1,Errors=[{Class=17,Number=70004,State=1,Message=Query [Stmt:{D6C26B5F-40A2-47AC-BD00-EF1949CE0F07}]_[DQHash:0xE0D603462A093F5B]_[Sch:4b43f990-a50e-4f75-8848-38ec979fbcae]_[Query:{D787942E-C61C-45F1-A2D2-45BED3937082}] failed:Query failed because of compute container failures or other transient errors, and retry exhausted. This can be an intermittent issue. Please retry later.,},{Class=0,Number=15885,State=1,Message=Statement ID: {D6C26B5F-40A2-47AC-BD00-EF1949CE0F07} | Query hash: 0xE0D603462A093F5B | Distributed request ID: {D787942E-C61C-45F1-A2D2-45BED3937082}. Total size of data scanned is 0 megabytes, total size of data moved is 0 megabytes, total size of data written is 0 megabytes.,},],'", "failureType": "UserError", "target": "Inventory Tables B", "details": "" }
{ "effectiveIntegrationRuntime": "AzureIntegrationRuntimeTTL (East US)", "executionDuration": 8, "durationInQueue": { "integrationRuntimeQueue": 0 }, "billingReference": { "activityType": "PipelineActivity", "billableDuration": [ { "meterType": "AzureIR", "duration": 0.016666666666666666, "unit": "Hours" } ] } }
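(An aside on the first error above: "Duplicate column ordinal cannot be provided in WITH schema clause" is a distinct user error, separate from the transient container failures. It typically means the same column ordinal was mapped twice in an OPENROWSET WITH clause. A hypothetical sketch of the pattern that triggers it, with made-up paths and column names:)

```sql
-- Hypothetical example: both columns are mapped to ordinal 1,
-- which raises "Duplicate column ordinal cannot be provided in
-- WITH schema clause."
SELECT *
FROM OPENROWSET(
    BULK 'https://myaccount.dfs.core.windows.net/container/data/*.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0'
) WITH (
    order_date date 1,
    order_key  int  1   -- duplicate ordinal 1; should be 2
) AS rows;
```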
-
GG 5 Reputation points
2024-01-02T22:09:26.33+00:00 I am having the same issue, as well. Submitted a support ticket.