Concurrency and API rate limits for Apache Spark pools in Azure Synapse Analytics
The following sections list various numerical limits for Spark pools and APIs to manage jobs in Azure Synapse Analytics.
The following table shows the maximum limits of jobs and cores for individual workspaces and Spark pools.
Important
The limits specified for Spark pools apply irrespective of node size, vCore, and memory configuration, and apply across all created instances of a Spark pool regardless of the user, unless otherwise noted.
Resource | Metric | Limit | Scope | Regions | Notes |
---|---|---|---|---|---|
Jobs | Running Simultaneously | 50 | Spark Pool | All | Limit applies across all users of a Spark Pool definition. For example, if two users are submitting jobs against the same Spark Pool, then the cumulative number of jobs running for the two users cannot exceed 50. |
Jobs | Queued | 200 | Spark Pool | All | Limit applies across all users of a Spark Pool definition. |
Jobs | Maximum Active Jobs | 250 | Spark Pool | All | Limit applies across all users of a Spark Pool definition. |
Jobs | Maximum Active Jobs | 1000 | Workspace | All | |
Cores | Cores Limit Per User | Based on the Pool Definition | Spark Pool | All | For example, if a Spark pool is defined as a 50-core pool, each user can use up to 50 cores within the specific Spark pool, since each user gets its own instance of the pool. |
Cores | Cores Limit Across All Users | Based on Workspace Definition | Workspace | All | For example, if a workspace has a limit of 200 cores, then all users across all pools within the workspace cannot use more than 200 cores combined. |
Livy | Maximum payload size for a Livy request | 100 kBytes | Livy | All | |
Note
- Maximum Active Jobs is the total number of jobs submitted, which includes both Jobs Running Simultaneously and Jobs Queued; that is, Max Active Jobs = Jobs Running Simultaneously + Jobs Queued.
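As a worked illustration of the arithmetic in the table and note above, the following sketch computes how much headroom a Spark pool has left under each job limit. The function name `pool_headroom` and its inputs are hypothetical; this only models the published numbers (50 running, 200 queued, 250 active) and ignores core availability and the workspace-level limit, so it is not a description of how the service schedules jobs.

```python
# Per-pool job limits taken from the table above.
MAX_RUNNING = 50
MAX_QUEUED = 200
MAX_ACTIVE = MAX_RUNNING + MAX_QUEUED  # 250 = running + queued


def pool_headroom(running: int, queued: int) -> dict:
    """Return the remaining slots under each per-pool job limit.

    `running` and `queued` are the current job counts for one Spark pool
    (hypothetical inputs supplied by the caller).
    """
    if running < 0 or queued < 0:
        raise ValueError("job counts must be non-negative")
    return {
        "running": max(MAX_RUNNING - running, 0),
        "queued": max(MAX_QUEUED - queued, 0),
        "active": max(MAX_ACTIVE - (running + queued), 0),
    }
```

For example, a pool with 50 running and 120 queued jobs has no running slots left but can still queue 80 more jobs before it reaches the 250 active-job limit.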
The following table shows the throttling limits for the Spark job and session management APIs.
Resource | Metric | Limit (Queries per Second) | Scope | Regions |
---|---|---|---|---|
Jobs API | Get Spark Session | 200 | Spark Session | All |
Jobs API | Get Spark Session | 200 | Spark Pool | All |
Jobs API | Get Spark Statement | 200 | Spark Session | All |
Jobs API | Get Multiple Spark Statements | 200 | Spark Session | All |
Jobs API | Create Session | 2 | Workspace | EastUS, EastUS2, WestUS, WestUS2, CentralUS, EastUS2EUAP, West Europe |
Jobs API | Create Session | 2 | Workspace | All other regions |
Jobs API | Create Batch Job | 2 | Workspace | All |
Jobs API | Get Spark Batch Job | 200 | Workspace | All |
Jobs API | Get Multiple Spark Batch Job | 200 | Workspace | All |
Note
The maximum requests limit for all resources and operations is 200 queries per second for all regions.
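Because Create Session and Create Batch Job are limited to 2 queries per second per workspace, a client that submits many jobs in a loop may want to space out its own calls. The following is a minimal sketch of such client-side pacing, assuming a single client; the `Pacer` class is hypothetical and is not part of any Synapse SDK. Since the limit is shared by every client in the workspace, pacing alone cannot guarantee that a request is not throttled, so HTTP 429 responses still need to be handled.

```python
import time


class Pacer:
    """Space out calls so that at most `max_per_second` are issued per second."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        if max_per_second <= 0:
            raise ValueError("max_per_second must be positive")
        self._interval = 1.0 / max_per_second
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._next_allowed = None

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self._interval
```

A caller would create `pacer = Pacer(2)` once and call `pacer.wait()` immediately before each Create Session or Create Batch Job request.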
Tip
If you get an error message or HTTP 429 response that reads
Your request has hit layered throttling rate-limit of 200 requests per 1 second(s) for requests on resource(s) identified by pattern {subscriptionId}. {workspaceName}. {HTTP-Verb}. {operationName} - You are currently hitting at a rate of 282 requests per 1 second(s). Please retry after 1 second(s)
Or
Your request has hit layered throttling rate-limit of 2 requests per 1 second(s) for requests on resource(s) identified by {subscriptionId}. {workspaceName}. {HTTP-Verb}. {operationName} - You are currently hitting at a rate of 24 requests per 1 second(s). Please retry after 1 second(s)
In either case, wait for the time interval given in the "Retry-After" HTTP response header before retrying. In high-traffic scenarios, retrying at a random, constant, or exponential interval still results in HTTP 429 failures and incurs a high number of retries, thereby increasing the overall time it takes for the requests to be accepted by the service.
Instead, by using the service-provided Retry-After value, users experience a higher success rate in job submissions, because the value, in seconds, is computed from point-in-time traffic to optimize the number of retries and the time taken for the client's requests to be accepted by the server.
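The retry behavior recommended in this tip can be sketched as follows. This is a minimal illustration rather than an official client: `send` is a hypothetical callable that performs one request and returns the HTTP status code and response headers, and the helper names, the attempt cap, and the 1-second fallback are assumptions. Only the delay-in-seconds form of Retry-After is handled.

```python
import time


def parse_retry_after(headers, default_seconds=1.0):
    """Return the Retry-After delay in seconds, or a fallback if absent or invalid."""
    value = None
    for name, header_value in headers.items():
        if name.lower() == "retry-after":  # header names are case-insensitive
            value = header_value
            break
    try:
        seconds = int(value)
    except (TypeError, ValueError):
        return default_seconds  # assumption: fall back to 1 second
    return float(max(seconds, 0))


def call_with_retry_after(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it is not throttled, waiting as the service instructs.

    `send` is a hypothetical callable returning (status_code, headers).
    Returns the last status code and the number of attempts made.
    """
    status = None
    for attempt in range(1, max_attempts + 1):
        status, headers = send()
        if status != 429:
            return status, attempt
        if attempt < max_attempts:
            # Wait for the service-provided interval, not a client-chosen backoff.
            sleep(parse_retry_after(headers))
    return status, max_attempts
```

The `sleep` parameter exists only so that the wait can be replaced in tests; in normal use the default `time.sleep` applies.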