INF: Limitations for SQL Agent when you have many Jobs running in SQL simultaneously

Recently I have had a customer complaining that when he is adding more subscriptions to his publication, distribution agent job for the newly added subscriptions would not start.  And when he tried to start the distribution agent job manually, it failed with below error

SQLServerAgent Error: Request to run job Job_id (from User distributor_admin) refused because the job is already running from a request by User distributor_admin. Changed database context to 'db_name'. (Microsoft SQL Server, Error: 22022)

But when he checked the subscription status in Replication Monitor, it says NOT RUNNING. Quite Weird!!

This is how the troubleshooting began:

  • To further isolate the issue tried to start distrib.exe with appropriate parameters from command prompt, it started successfully. This step clearly indicated that there are no issues with distrib.exe but most likely SQL Agent Job issue.
  • When further checked we found that there were about 155 jobs and most of them were log reader and distribution agent jobs. All of them were set to run continuously.
  • This led us to the design limitation for SQL server agent. We cannot start more than 60 jobs in the same 1 minute interval

Limitation: one second delay between starting jobs

<<From KB 306457>>

A design limitation imposes a one second delay between jobs. This limitation was set so that the job scheduler does not monopolize the resources on the hosting server.
Because of this limitation, up to 60 jobs can be started in the same one-minute interval. If jobs run and finish in less than one minute and are started at the same time based on a one-minute schedule, some jobs may never run. The jobs may never run because jobs that were already executed are retriggered every minute. Other jobs are left in a starvation situation.

Important: This information applies to starting news jobs and does not impose a limitation on jobs that are running at the same time.  We recommend that you group jobs into batches of less than 60 simultaneous jobs and stage schedules so that there is enough bandwidth for SQL Server Agent to run these jobs.

<</From KB 306457>>

Resolution

So for all the new subscriptions that customer added, we created a different schedule so that there is enough bandwidth for SQL Server Agent to run these jobs.

Additional Info: https://support.microsoft.com/kb/306457

Written By:
Prema Kolli - Support Engineer,Microsoft GTSC

Reviewed By:
Akbar Farishta - Technical Lead, Microsoft GTSC
Ouseph Devis T - Technical Lead, Microsoft GTSC