Choosing the right Azure region for services

Raunak Agarwal 25 Reputation points
2025-03-18T13:00:35.7766667+00:00

We are an AI startup building our AI accelerators on Azure and are evaluating the best region for deploying our services. Below is the list of Azure services and supporting frameworks we are using:

  • FastAPI – Python-based API framework with async support
  • Azure Document Intelligence – OCR and document structure extraction
  • Azure OpenAI – Embeddings and text generation capabilities
  • Azure Cosmos DB – Vector database for storing and retrieving document chunks
  • Azure Data Lake Storage – Secure storage for uploaded document files
  • Phi-3 Model (hosted on Azure VMs) – Open-source LLM for document comparison
  • LangChain Framework – Chunking, document categorization, dynamic question suggestion
  • JWT – Token-based authentication mechanism

Key Considerations:

  • Our development team is based in India, and our clients are across India, the UK, and the Middle East.
  • We are looking for a cost-efficient yet high-performance deployment strategy.
  • Azure OpenAI is not available in Central India, so we are considering South India as an alternative.
  • Should we co-locate Cosmos DB with OpenAI for better performance?
  • Are there any compute availability constraints (especially for GPU-based workloads like Phi-3) in Central India vs. South India?
  • Would you recommend a multi-region setup (e.g., India as primary, UK South as fallback) or a single-region deployment?
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

Accepted answer
  1. Pavankumar Purilla 6,290 Reputation points Microsoft External Staff
    2025-03-19T00:28:59.5233333+00:00

    Hi Raunak Agarwal,

    Because Azure OpenAI is unavailable in Central India, South India is the best primary Azure region for your startup: it keeps latency low for India-based clients while remaining cost-effective. Co-locating Cosmos DB with Azure OpenAI in South India is recommended to minimize query latency and cross-region data transfer costs. For the GPU-based workload (Phi-3), South India should be the first choice, but GPU capacity there may be constrained, so UK South or East US can serve as a backup.
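
The co-location reasoning above can be sketched as a simple filter: keep only the regions that offer every service you want side by side, in order of preference. This is a minimal sketch, not an official tool. The `AVAILABILITY` table is placeholder data that only mirrors what is said in this thread (Azure OpenAI missing from Central India), and `regions_supporting` is a hypothetical helper name. Check the current Azure products-by-region listing before relying on any entry.

```python
from typing import Dict, List, Set

# Placeholder data that mirrors this thread only. It is NOT an
# authoritative availability list; verify against Azure before use.
AVAILABILITY: Dict[str, Set[str]] = {
    "centralindia": {"cosmosdb"},
    "southindia": {"cosmosdb", "openai"},
    "uksouth": {"cosmosdb", "openai"},
}


def regions_supporting(
    required: Set[str],
    availability: Dict[str, Set[str]],
    preference: List[str],
) -> List[str]:
    """Return the preferred regions that offer every required service.

    The order of `preference` is preserved, so the first element of the
    result is the best candidate for co-locating the services.
    """
    return [
        region
        for region in preference
        if required <= availability.get(region, set())
    ]
```

Calling `regions_supporting({"openai", "cosmosdb"}, AVAILABILITY, ["centralindia", "southindia", "uksouth"])` drops Central India and leaves South India first with UK South as the next candidate, which matches the primary and fallback suggested in this answer.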

    A multi-region setup is advisable, with South India as primary and UK South as fallback. Routing traffic through Azure Front Door or Traffic Manager provides high availability and lower latency for UK users. To control costs, consider Azure Reserved VM Instances for the GPU VMs, autoscaling for the FastAPI tier, and monitoring of egress charges. If budget is a constraint, you can start with a single-region deployment in South India and expand to multi-region later as needed.
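
Front Door and Traffic Manager handle primary and fallback routing at the network edge. The same idea can also be sketched inside the application, for example when the FastAPI service calls a regional backend. The following is a minimal sketch under stated assumptions: `RegionEndpoint`, `call_with_failover`, and the example URLs are hypothetical names for illustration, and the `send` callable stands in for whatever client call you actually make.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class RegionEndpoint:
    region: str  # e.g. "southindia"
    url: str     # hypothetical endpoint URL for that region


def call_with_failover(
    endpoints: Sequence[RegionEndpoint],
    send: Callable[[RegionEndpoint], T],
) -> Tuple[str, T]:
    """Try each endpoint in priority order and return (region, result).

    The first endpoint is treated as primary. Any exception from `send`
    moves on to the next endpoint. If every endpoint fails, a
    RuntimeError listing all failures is raised.
    """
    errors: List[str] = []
    for endpoint in endpoints:
        try:
            return endpoint.region, send(endpoint)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{endpoint.region}: {exc}")
    raise RuntimeError("all regions failed: " + "; ".join(errors))


# Priority order from this answer: South India primary, UK South fallback.
# The URLs are placeholders, not real endpoints.
ENDPOINTS = [
    RegionEndpoint("southindia", "https://example-southindia.invalid"),
    RegionEndpoint("uksouth", "https://example-uksouth.invalid"),
]
```

In practice `send` would wrap the real request with a timeout, and you would probably narrow the caught exception types so that client-side bugs do not silently trigger a failover.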

    I hope this information helps. Thank you!
