@Mohankumar the actual time an analyze operation takes in this case depends on many factors. Here is the official documentation on latency for Document Intelligence analyze requests:
> Latency is the amount of time it takes for an API server to handle and process an incoming request and deliver the outgoing response to the client. The time to analyze a document depends on the size (for example, number of pages) and associated content on each page. Document Intelligence is a multitenant service where latency for similar documents is comparable but not always identical. Occasional variability in latency and performance is inherent in any microservice-based, stateless, asynchronous service that processes images and large documents at scale. Although we're continuously scaling up the hardware and capacity and scaling capabilities, you might still have latency issues at runtime.
Since analyze is an async operation, the time it takes depends on factors such as scaling at the resource or region level, and it can take a while before the poller indicates a result is available. Also, the service does not support cancelling the long-running operation; a cancel request returns an error indicating that cancellation is not supported. You may need to rework your application logic to display the result only after the poller reports completion, instead of expecting the result immediately after the analyze request is submitted. Thanks!!
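As an illustrative sketch of that "wait for the poller" pattern, here is a minimal Python example. The `StubPoller` class is a hypothetical stand-in for the long-running-operation poller your SDK returns (real client and method names depend on the SDK and version you use); the point is that the caller loops on `done()` with a timeout rather than assuming the result is available right after the request is made:

```python
import time

class StubPoller:
    """Hypothetical stand-in for an SDK long-running-operation poller."""
    def __init__(self, finishes_after):
        # Simulate a service-side operation that completes after a delay.
        self._done_at = time.monotonic() + finishes_after

    def done(self):
        return time.monotonic() >= self._done_at

    def result(self):
        # A real poller would return the analyze result from the service.
        return {"status": "succeeded", "pages": 3}

def wait_for_result(poller, poll_interval=0.05, timeout=10.0):
    """Poll until the operation completes instead of assuming an instant result."""
    deadline = time.monotonic() + timeout
    while not poller.done():
        if time.monotonic() > deadline:
            raise TimeoutError("analyze operation did not finish in time")
        time.sleep(poll_interval)
    return poller.result()

result = wait_for_result(StubPoller(finishes_after=0.2))
print(result["status"])
```

With the Azure SDKs, the poller's blocking `result()` call plays this role for you; the explicit loop above just makes the waiting visible so the UI logic can show the result only once it exists.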
Disconnected containers with a commitment plan can reduce latency, but there are no published metric limits since performance depends entirely on the host or client configuration. I would recommend trying the connected container scenario and checking whether it offers lower latency than the cloud-connected service. Thanks!!
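As a rough sketch, a connected container run follows the standard Azure AI container pattern; the image tag, port, and resource sizes below are placeholders, and the `Billing`/`ApiKey` values must come from your own Document Intelligence resource:

```
docker run --rm -it -p 5000:5000 --memory 8g --cpus 4 \
  mcr.microsoft.com/azure-cognitive-services/form-recognizer/layout \
  Eula=accept \
  Billing={YOUR_ENDPOINT_URI} \
  ApiKey={YOUR_API_KEY}
```

The host's CPU, memory, and disk speed directly determine analyze times in this scenario, which is why no fixed latency figures are published for containers.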