How to scale out to more workers for a fan-out / fan-in durable function?

Saxon Druce 25 Reputation points
2023-10-11T01:42:02.29+00:00

Hi,

I'm using durable functions and have a single orchestration function which fans out 100 activity functions in parallel, then collates the results.

Each activity does some heavy numeric processing, running for about a minute at 100% CPU single threaded.

Ideally, all 100 tasks would run in parallel, so the final output of the orchestration could be available in one minute (plus some overhead). I expect it to take a while for the workers to be scaled up, so it won't be quite as perfectly parallel as 100 tasks in 1 minute. However, I am only seeing about 20 workers being allocated for the tasks, so overall it takes about 5 minutes to get through them.

Is there something I can do to allow the workers to scale up more quickly to meet the number of tasks?

I am using the Consumption plan, and so since each worker only has a single core, there isn't any benefit to allocating multiple tasks to a single worker at a time - each is using 100% CPU and so they will just cause CPU contention with each other. So I have set maxConcurrentActivityFunctions and maxConcurrentOrchestratorFunctions to 1, which did seem to help a bit.
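
For reference, here is a minimal sketch of where those two settings live, assuming a Functions runtime 2.x or later host.json, where Durable Functions options sit under extensions.durableTask. The helper name durable_task_host_config is hypothetical and only builds the fragment so it can be printed and copied.

```python
import json


def durable_task_host_config(max_activities: int = 1, max_orchestrators: int = 1) -> dict:
    """Build a host.json fragment that caps per-worker concurrency.

    Assumption: host.json schema version 2.0, with Durable Functions
    settings nested under extensions.durableTask.
    """
    return {
        "version": "2.0",
        "extensions": {
            "durableTask": {
                # One CPU-bound activity per single-core worker at a time.
                "maxConcurrentActivityFunctions": max_activities,
                "maxConcurrentOrchestratorFunctions": max_orchestrators,
            }
        },
    }


# Print the JSON that would go into host.json.
print(json.dumps(durable_task_host_config(), indent=2))
```

Both values are per worker, so with a limit of 1 the only way to get more parallelism is for the platform to add more workers.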

I've run tests with varying numbers of tasks, and always end up with fewer workers than tasks, e.g.:

10 tasks: about 5 workers, about 2-3 minutes total run time
100 tasks: about 20 workers, about 5 minutes total run time
200 tasks: about 30 workers, about 7 minutes total run time
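
These timings are roughly what the worker counts predict. A rough sketch of the arithmetic, assuming every worker runs one task at a time, each task takes about a minute, and (optimistically) all workers are available from the start; the function name estimated_minutes is only illustrative:

```python
import math


def estimated_minutes(tasks: int, workers: int, minutes_per_task: float = 1.0) -> float:
    """Lower bound on wall-clock time when each worker handles one task at a time.

    Tasks are processed in "waves" of at most `workers` tasks each.
    """
    waves = math.ceil(tasks / workers)
    return waves * minutes_per_task


# (tasks, approximate workers observed) taken from the runs listed above.
observations = [(10, 5), (100, 20), (200, 30)]

for tasks, workers in observations:
    minutes = estimated_minutes(tasks, workers)
    print(f"{tasks} tasks on {workers} workers: at least {minutes:.0f} min")
```

Under those assumptions the estimates come out at 2, 5 and 7 minutes, in line with the measured run times, so the limiting factor is the number of workers rather than per-task overhead.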

The following shows the number of workers I saw for 5 runs of the orchestration at each of these task sizes:

[Image: number of workers for each of the 5 runs at each task size]

Is there anything else I can adjust to improve the throughput of the tasks?

Thanks,
Saxon

Azure Functions

An Azure service that provides an event-driven serverless compute platform.

Answer accepted by question author
  1. Pramod Valavala 20,661 Reputation points Microsoft Employee Moderator
    2023-10-13T14:25:59.97+00:00

    @Saxon Druce Scaling out of function instances is handled by the scale controller, which uses heuristics specific to each trigger type and does not expose any settings to configure them. The new target-based scaling looks promising, but it doesn't support Durable Functions yet.

    As the documentation for the scale controller mentions, the check to scale out runs every 30 seconds. Since your activity functions run for at most a minute, only a few checks happen while they are running, so the controller does not decide to expand to more instances.

    The settings that you have made are the best possible for the consumption tier.

    But if you do require your runs to complete as fast as possible, your option would be to move to a higher tier such as Premium, where scaling decisions are made faster and you can also temporarily scale out yourself by calling the API directly.

