Dramatic latency increase after Oct 10 for OpenAI GPT-4 32k
We originally shifted from OpenAI's API to the Azure-hosted OpenAI models because of their significantly lower latency, which gave us almost real-time usability (a few hundred milliseconds up to about 2 s per request). But starting around Oct 10, average latency increased almost six-fold, which makes some of our features unusable. We essentially need to rethink the product for those features to avoid running into UX issues. Our instance is hosted in Switzerland.
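For anyone wanting to reproduce or track this, a minimal latency-logging sketch is below. The endpoint URL pattern is the standard Azure OpenAI REST shape, but the resource name, deployment name, and API version are placeholders, not values from our setup; the timing helpers themselves are generic:

```python
import time
import statistics

def timed(fn, *args, **kwargs):
    """Run fn and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return time.perf_counter() - start, result

def latency_stats(samples):
    """Summarize a list of request latencies (in seconds)."""
    return {
        "p50": statistics.median(samples),
        "mean": statistics.fmean(samples),
        "max": max(samples),
    }

# In practice, fn would be the chat-completions request, e.g. an HTTP POST to
#   https://<resource>.openai.azure.com/openai/deployments/<deployment>/chat/completions?api-version=<version>
# (<resource>, <deployment>, <version> are placeholders). Here we time a
# local stand-in so the snippet runs without credentials:
elapsed, _ = timed(lambda: sum(range(100_000)))
print(latency_stats([elapsed, elapsed, elapsed]))
```

Logging these stats per request over a day or two would show whether the jump is a step change (suggesting a routing or capacity-allocation change) or gradual drift (suggesting demand).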
Any idea why this is happening? Did anything change, or are there settings we can adjust to speed it up again? Or is it solely due to increasing demand for the service (which seems unlikely given the step-like jump)?