Asynchronous request reply pattern for Cosmos using Python

Chand, Anupam SBOBNG-ITA/RX 461 Reputation points
2021-06-14T11:32:21.32+00:00

Hi,

We are building a Python API to read data from Cosmos DB. We are expecting high concurrency of requests, between 500 and 1000.
The API has been built on Azure Functions using a Premium App Service plan (P1v2).
When we tested at low concurrency, the response times were consistent and within our expectations. However, when we started load testing, we found that the response times gradually increased, ranging from 2 s all the way to 60 s.
The CPU% and memory did not show much change, and even the HTTP queue length was low (<10). Our understanding is that the requests were somehow being queued up and served eventually, but with a delay. We are using a singleton for the Cosmos client and have set the parameters in the app configuration to allow maximum concurrency for Python apps.
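For context, the singleton pattern mentioned above usually means constructing the Cosmos client once per worker process and reusing it across function invocations, rather than creating a new client (and new connection pool) per request. A minimal sketch of that caching pattern is below; `FakeCosmosClient` is a hypothetical stand-in for `azure.cosmos.CosmosClient`, which in a real Function App would be constructed from the account URL and key.

```python
from functools import lru_cache

class FakeCosmosClient:
    """Hypothetical stand-in for azure.cosmos.CosmosClient.

    The real client would be built once, e.g.:
        CosmosClient(url=os.environ["COSMOS_URL"], credential=os.environ["COSMOS_KEY"])
    (COSMOS_URL / COSMOS_KEY are assumed setting names, not official ones.)
    """
    pass

@lru_cache(maxsize=1)
def get_client() -> FakeCosmosClient:
    # Constructed once per worker process; every call after the first
    # returns the same cached instance, so connections are reused.
    return FakeCosmosClient()
```

Calling `get_client()` from the function body then always returns the same instance within a worker process, which avoids per-request connection setup.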

The questions I have are as follows:

  1. What metric should we use to scale the App Service up automatically? CPU, memory, and HTTP queue length were all still low; the only metric that showed a significant jump was the socket count for inbound requests. Is this what we should use to auto-scale our App Service?
  2. I have also read that we should follow the asynchronous request/reply pattern. However, I found an existing issue stating that the Cosmos SDK for Python does not support async methods (https://github.com/Azure/azure-sdk-for-python/issues/8636). Is this true? If so, what is the alternative for our use case? Is there some other way to read from Cosmos DB asynchronously?
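One common workaround when an SDK only exposes blocking calls is to offload each call to a thread pool from async code, so the event loop stays free to accept other requests. This is a generic asyncio pattern, not something from the Cosmos SDK itself; `blocking_point_read` below is a hypothetical stand-in for a synchronous SDK call such as `container.read_item(...)`.

```python
import asyncio

def blocking_point_read(item_id: str) -> dict:
    # Hypothetical stand-in for a synchronous Cosmos point read, e.g.
    # container.read_item(item=item_id, partition_key=item_id)
    return {"id": item_id}

async def read_item_async(item_id: str) -> dict:
    loop = asyncio.get_running_loop()
    # Run the blocking call in the default thread-pool executor so the
    # event loop is not blocked while the SDK waits on the network.
    return await loop.run_in_executor(None, blocking_point_read, item_id)

result = asyncio.run(read_item_async("doc-1"))
```

This does not make the SDK call itself faster, but it lets one worker overlap many in-flight reads instead of serving them strictly one at a time.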

Any help would be appreciated.

Regards,
Anupam

Azure Cosmos DB

Accepted answer
  Saurabh Sharma 23,806 Reputation points, Microsoft Employee
  2021-06-16T22:09:42.087+00:00

    Hi anonymous user,

    Here are the updates I have received internally:

    1. There’s not enough detail to be sure whether this applies to your scenario, but another user had difficulty scale testing because Client Affinity was enabled on their load balancer: all of the load-testing clients were bound to the first couple of instances, and the newly scaled-out instances went unused. If you look at the CPU usage of each individual App Service instance and they are severely unbalanced, you might be affected by this.
    2. Yes, it’s a known limitation, and there is no ETA or workaround for it right now. How many rows/documents will each API call return? How much data?
      I believe point reads shouldn’t be a problem, and you would adjust the RUs to support the necessary concurrency.
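To make the RU-sizing suggestion concrete: Cosmos DB documents a baseline charge of roughly 1 RU for a point read of a 1 KB item, so a first estimate of required throughput is peak request rate times RU cost per request, plus some buffer. The 20% headroom below is a hypothetical choice; the actual RU charge per operation should be verified from the response headers of your own workload.

```python
# Back-of-envelope RU sizing for point reads.
# Assumes the documented baseline of ~1 RU per 1 KB point read;
# check the actual request charge returned by your queries.
RU_PER_POINT_READ = 1.0
peak_requests_per_second = 1000
headroom = 1.2  # hypothetical 20% buffer for spikes

required_rus = int(peak_requests_per_second * RU_PER_POINT_READ * headroom)
print(required_rus)  # 1200
```

Larger items, queries, or cross-partition reads cost more RUs per call, so the same arithmetic with the measured per-call charge gives a more realistic provisioning target.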

    Please let me know if you have any other questions.

    Thanks
    Saurabh

    1 person found this answer helpful.

0 additional answers

