This article delivers a quick orientation with a startup lens across five foundational platform pillars: compute, networking, storage, containers, and data. For each pillar, you get a short decision table that maps your situation to a default service, plus a note on when to revisit that choice as your startup scales.
In this article, you learn how to
- Pick the right Azure compute, networking, and storage primitives for your stage.
- Decide whether Azure Kubernetes Service is the right container platform for your stage, and what to use instead if it isn't.
- Pick a first data platform across relational, document, vector, and analytics workloads.
- Apply a small set of cost, reliability, and security defaults that hold up as you grow.
- Recognize the signals that tell you a default choice has outgrown its usefulness.
Prerequisites
- An active Azure subscription.
- The Azure CLI installed and signed in. To sign in, run `az login`.
- Owner or Contributor access on at least one resource group.
- Familiarity with the Azure portal landing page is helpful but not required.
- Basic familiarity with Azure fundamentals (regions, subscriptions, and resource groups).
Why this matters for startups
At an early-stage startup, the cost of a wrong infrastructure choice isn't the bill. It's the week you lose migrating off the wrong service after you've built around it. The five pillars in this article are the ones where a default chosen badly tends to compound: the wrong compute platform shapes your deployment pipeline; the wrong database constrains your data model; the wrong network topology blocks your first enterprise customer. You don't need a platform team, a site reliability engineer, or a financial operations specialist to make these choices well. You need a default that's good enough to ship on, and a clear signal that tells you when to revisit it. If your startup is in the Microsoft for Startups program, the same defaults stretch your credits further and keep you eligible for advanced benefits later.
Compute: where your code runs
Azure has more than a dozen compute services. The good news: for most early-stage workloads, three of them cover what you need.
| Your situation | Default Azure service | Why | Revisit when |
|---|---|---|---|
| Web app or HTTP API, one or two services, you want a managed runtime | Azure App Service (Linux) | No container build required. Built-in TLS, custom domains, deployment slots, and autoscale. Free and Basic tiers are cheap enough to run a staging environment, though deployment slots and autoscale require Standard or higher. | You want to run more than a handful of services, need per-service scale, or need sidecars. |
| Event-driven function, scheduled job, or webhook handler | Azure Functions (Consumption plan) | Pay per execution. Scales to zero. Bindings remove most glue code for queues, blobs, and HTTP triggers. | Cold starts hurt your user-facing latency or you exceed Consumption plan limits. |
| Containerized microservices, you want an opinionated runtime without managing Kubernetes | Azure Container Apps | Built on Kubernetes with KEDA-based autoscaling, but you don't manage the cluster. Dapr is available as an optional integration. Scale-to-zero, revisions, and HTTPS ingress included. | You need cluster-level control, a custom scheduler, or advanced networking. |
| Long-running batch, GPU training, or lift-and-shift of an existing virtual machine workload | Azure Virtual Machines | Full operating system control. Use a virtual machine scale set when you need horizontal scale. | Operational overhead of patching and image management starts to slow shipping. |
| You're sure you need Kubernetes (see the Containers and Azure Kubernetes Service section before assuming this) | Azure Kubernetes Service | Managed control plane. Fits teams that already have Kubernetes experience or specific platform requirements. | See the Containers and Azure Kubernetes Service section for AKS decision criteria. |
Tip
Start with App Service for your first user-facing web app and Azure Functions for everything event-driven. You can move to Azure Container Apps or Azure Kubernetes Service later without changing your application code, provided you keep the application stateless and read configuration from environment variables.
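To make the portability point in this tip concrete, here is a minimal sketch of an application that takes all of its configuration from environment variables. The variable names (`DATABASE_URL`, `LOG_LEVEL`, `PORT`) and the `Settings` class are hypothetical, chosen only for illustration. Nothing here is specific to App Service, Container Apps, or AKS, which is the point.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Runtime configuration, read once at startup."""
    database_url: str
    log_level: str
    port: int


def load_settings(env=None):
    """Build Settings from a mapping of environment variables.

    DATABASE_URL, LOG_LEVEL, and PORT are hypothetical names for this sketch.
    Passing a plain dict instead of os.environ keeps the function easy to test.
    """
    env = os.environ if env is None else env
    if not env.get("DATABASE_URL"):
        # Fail fast at startup rather than on the first database call.
        raise RuntimeError("Missing required setting: DATABASE_URL")
    return Settings(
        database_url=env["DATABASE_URL"],
        log_level=env.get("LOG_LEVEL", "INFO"),
        port=int(env.get("PORT", "8000")),
    )
```

Because the application only sees a mapping of names to values, the hosting platform can supply them however it likes (app settings, container environment variables, or Kubernetes manifests) without a code change.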
Choosing between Container Apps and App Service
Container Apps and App Service overlap. The honest tiebreaker: if your application already runs as a container image and you want per-service scale (different replicas for the web tier compared to the worker), Container Apps wins. If your application is a single web process and you don't want to maintain a Dockerfile, App Service wins.
Caution
Consider Azure Kubernetes Service when you have clear requirements that aren't met by simpler options. While it offers strong flexibility and control, it also introduces additional operational work (upgrades, node pool sizing, ingress configuration, and certificate management). Teams that adopt it too early often spend more time on platform management than on building product features.
Networking: what to set up on day one
Most early-stage Azure workloads don't need a virtual network on day one. App Service, Functions, Container Apps, and most managed databases give you public endpoints with TLS that are safe to expose, as long as you set authentication correctly. Adding networking complexity before you have a reason is the most common premature optimization on Azure.
| Your situation | Default approach | Why | Revisit when |
|---|---|---|---|
| Brand new app, public web traffic, no compliance requirements yet | Use the public endpoint with TLS. No virtual network. | Lowest operational overhead. App Service, Container Apps, and managed databases handle TLS for you. Use Microsoft Entra ID for authentication. | Your first enterprise customer asks for private connectivity. |
| You need a private connection between your app and a managed database | Virtual network integration on the compute side, private endpoint on the database side | Traffic stays on Microsoft's backbone. No public exposure for the database. Same managed service, no application change. | Day one if you handle protected data; otherwise when an audit or customer asks. |
| You need a single public entry point that fronts multiple back-ends with routing, TLS termination, and a web application firewall | Azure Front Door (global) or Azure Application Gateway (regional) | Front Door adds a global content delivery network and edge caching. Application Gateway is the regional, virtual-network-native option. | You've outgrown App Service's built-in TLS and routing. |
| You need outbound static IP addresses (a payment processor allowlist, for example) | NAT gateway attached to your virtual network | Predictable outbound IP. Required by many third-party APIs. | A vendor requires it. Don't add it speculatively. |
| Multi-region or multi-subscription topology | Virtual network peering or Azure Virtual WAN | Real network architecture starts here. Out of scope for most Explore-stage teams. | Multi-region is a real requirement, not an aspiration. |
Important
Lock down Microsoft Entra ID and your subscription role assignments before you worry about network isolation. Most Azure security incidents at small companies come from an over-permissioned identity, not a network exposure. Use Microsoft Entra ID groups for engineering access, and don't hand out Owner at the subscription scope.
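One way to check the two identity rules above is to export your role assignments and scan them. The sketch below assumes each assignment is a dictionary with `roleDefinitionName`, `scope`, and `principalType` keys, modeled on the JSON that `az role assignment list` returns. Treat those field names as an assumption and confirm them against your own output before relying on the result.

```python
import re

# A scope that is exactly a subscription, with no resource group or resource below it.
SUBSCRIPTION_SCOPE = re.compile(r"^/subscriptions/[^/]+/?$", re.IGNORECASE)


def risky_assignments(assignments):
    """Return (reason, assignment) pairs for assignments that break either rule.

    Assumed input shape (not verified here): a list of dicts with the keys
    roleDefinitionName, scope, and principalType.
    """
    findings = []
    for assignment in assignments:
        role = assignment.get("roleDefinitionName")
        scope = assignment.get("scope", "")
        if role == "Owner" and SUBSCRIPTION_SCOPE.match(scope):
            findings.append(("owner-at-subscription-scope", assignment))
        if assignment.get("principalType") == "User":
            # Access should flow through groups, not individual accounts.
            findings.append(("assigned-to-individual", assignment))
    return findings
```

An empty result doesn't prove the subscription is well configured. It only means neither of these two specific patterns appeared in the data you passed in.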
Storage: blobs, files, and disks
Azure Storage is a single resource type (the storage account) that exposes four data services: blobs, files, queues, and tables. For application storage decisions, you're almost always picking between blobs (object storage), files (managed file shares), and managed disks (block storage attached to a virtual machine).
| What you're storing | Default Azure service | Why | Revisit when |
|---|---|---|---|
| User-uploaded files, generated reports, logs, model artifacts, backups | Azure Blob Storage (hot tier) | Object storage. Cheap, durable, scales to petabytes. Use cool or archive tiers later for data you don't read often. | You need POSIX file semantics or random read-write into a single file from multiple machines. |
| A shared file system mounted by multiple virtual machines or containers | Azure Files (Standard) or Azure NetApp Files (high-throughput) | Server Message Block (SMB) or Network File System (NFS) shared volumes. Avoid using these for content that fits the blob model. | You start using a file share as a queue or a database. Move to the right primitive. |
| Disks for a virtual machine | Premium SSD v2 managed disks | Tunable performance, good price-performance for application disks. Premium SSD v2 cannot be used as the OS disk; pair it with Premium SSD (v1) or Standard SSD for the OS. Standard SSD is acceptable for low-throughput workloads. | You need shared block storage across virtual machines (use Azure Elastic SAN or Azure NetApp Files). |
| Static website assets (single-page application bundle, marketing site, documentation) | Azure Static Web Apps | Static Web Apps is the modern default: built-in custom domains, free managed TLS, global distribution, GitHub Actions CI/CD, and built-in authentication. Azure Storage static website hosting plus Azure Front Door still works for very low-cost setups but doesn't natively support custom headers or authentication. | You add server-rendered pages. Move to App Service or Container Apps. |
Note
Storage accounts have a soft limit of 250 per region per subscription (raisable to 500 by request). That's plenty for early-stage teams. The mistake to avoid is creating one storage account per microservice; group by environment (production, staging, development) and by access pattern (hot, cold, archive) instead.
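The grouping rule in this note can be expressed as a small naming helper. This is a sketch under stated assumptions: the `contoso` prefix, the short environment codes, and the function names are hypothetical. The length and character check reflects the storage account naming rule (3 to 24 characters, lowercase letters and digits only).

```python
import re

# Hypothetical short codes, used to keep names under the 24-character limit.
ENV_SHORT = {"production": "prod", "staging": "stg", "development": "dev"}
ACCESS_PATTERNS = ("hot", "cold", "archive")

# Storage account names: 3-24 characters, lowercase letters and digits only.
VALID_NAME = re.compile(r"^[a-z0-9]{3,24}$")


def storage_account_name(prefix, environment, access_pattern):
    """Build one account name per (environment, access pattern) pair."""
    if environment not in ENV_SHORT:
        raise ValueError(f"Unknown environment: {environment}")
    if access_pattern not in ACCESS_PATTERNS:
        raise ValueError(f"Unknown access pattern: {access_pattern}")
    name = f"{prefix}{ENV_SHORT[environment]}{access_pattern}".lower()
    if not VALID_NAME.match(name):
        raise ValueError(f"Invalid storage account name: {name}")
    return name


def planned_accounts(prefix):
    """List every name the grouping scheme could produce for a prefix."""
    return [
        storage_account_name(prefix, env, pattern)
        for env in ENV_SHORT
        for pattern in ACCESS_PATTERNS
    ]
```

For example, `storage_account_name("contoso", "production", "hot")` yields `contosoprodhot`. The full grid is an upper bound: create only the combinations you actually use, and note that names must also be globally unique, which this helper doesn't check.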
A note on backups
Azure Backup and storage account redundancy options (Locally Redundant Storage, Zone Redundant Storage, Geo Redundant Storage) are tunable per account and per disk. Locally Redundant Storage (LRS) is fine for development and staging. Use Zone Redundant Storage (ZRS) for production data. Geo Redundant Storage adds cost and isn't a substitute for application-level disaster recovery.
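The redundancy guidance above reduces to a lookup from environment to SKU. The sketch below uses the standard-tier SKU names for the two options (`Standard_LRS`, `Standard_ZRS`). The input shape (dictionaries with `name`, `environment`, and `sku` keys) is an assumption made for this example.

```python
# Recommended redundancy per environment, following the guidance above.
RECOMMENDED_SKU = {
    "development": "Standard_LRS",
    "staging": "Standard_LRS",
    "production": "Standard_ZRS",
}


def redundancy_mismatches(accounts):
    """Return names of accounts whose SKU differs from the recommendation.

    Each account is a dict with hypothetical keys: name, environment, sku.
    Accounts with an unrecognized environment are also returned, because
    there is no recommendation to compare them against.
    """
    return [
        account["name"]
        for account in accounts
        if RECOMMENDED_SKU.get(account["environment"]) != account["sku"]
    ]
```

Treat a mismatch as a prompt to review rather than an error. A production account on a geo-redundant SKU would be flagged here even though it may be a deliberate choice.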
Containers and Azure Kubernetes Service
Azure has three ways to run containers in production: Azure Container Apps, Azure Container Instances, and Azure Kubernetes Service. They map to different team sizes and operational appetites.
| Your situation | Default Azure service | Why | When it starts to hurt |
|---|---|---|---|
| You have container images and you want a managed runtime with HTTPS ingress, scale-to-zero, and revisions | Azure Container Apps | Serverless platform on Kubernetes with KEDA autoscaling, but you don't see or manage the cluster. Pay for what runs. Good fit until you hit cluster-level requirements. Dapr is available as an opt-in integration. | You need custom schedulers, advanced networking (multiple network interface cards, custom Container Network Interface plugins), or specific Kubernetes operators. |
| You want to run a single container as a one-off job or a short batch | Azure Container Instances | Fastest path from image to running container. No orchestration. Charged per second of runtime. | You need anything resembling a service mesh or autoscaling beyond a single container. |
| You already operate Kubernetes elsewhere, or your application architecture genuinely requires it | Azure Kubernetes Service | Managed control plane. Bring your own node pools, network plugin, ingress controller, and observability stack. | Day one. Plan for ongoing upgrades (minor version released every 4 months), node pool tuning, and certificate management. |
| You're unsure whether you need Kubernetes | Container Apps for now | You can rebuild on Azure Kubernetes Service later if needed. Lifting a stateless containerized application is days of work, not weeks. | You have a concrete need (operator ecosystem, cluster-level policy) you can name. "Future-proofing" isn't a concrete need. |
When to graduate to AKS
Move to Azure Kubernetes Service (AKS) when at least two of these are true:
- You run more than ten services with shared lifecycle and networking concerns.
- You need custom controllers, sidecars, or Custom Resource Definitions (CRDs) that Container Apps doesn't expose.
- You require deep virtual network integration with strict policy enforcement.
- You're standardizing on a Kubernetes-based open source ecosystem (Argo, Istio, KEDA, and so on).
If you adopt AKS, follow the AKS Baseline architecture. The Microsoft Azure Well-Architected Framework and the AKS Baseline reference together cover the security, scaling, and upgrade defaults you want.
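The "at least two of these are true" rule is simple enough to write down as a check. The following sketch encodes the four criteria from the list. The `PlatformSignals` class and its field names are hypothetical, and answering the three yes-or-no questions honestly is still a judgment call that the code can't make for you.

```python
from dataclasses import dataclass


@dataclass
class PlatformSignals:
    """Hypothetical container for the four criteria listed above."""
    service_count: int
    needs_custom_controllers_or_crds: bool
    needs_strict_vnet_policy: bool
    standardizing_on_k8s_ecosystem: bool


def should_graduate_to_aks(signals):
    """Return True when at least two of the four criteria hold."""
    criteria = [
        signals.service_count > 10,
        signals.needs_custom_controllers_or_crds,
        signals.needs_strict_vnet_policy,
        signals.standardizing_on_k8s_ecosystem,
    ]
    return sum(criteria) >= 2
```

A team with twelve services and nothing else on the list gets `False`: service count alone doesn't justify the move.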
AKS defaults for a small team
| Setting | Default | Why |
|---|---|---|
| Node size | Standard_D4s_v5 system pool, Standard_D8s_v5 user pool | Predictable price-to-performance for general workloads |
| Cluster autoscaler | Enabled | Avoid paying for idle nodes |
| Workload Identity | Enabled | Replaces pod identity, integrates with Microsoft Entra ID |
| Azure Policy add-on | Enabled | Free guardrails (no privileged containers, required labels) |
| Container insights | Enabled | First-class metrics and logs in Azure Monitor |
| Private cluster | On for production | Control plane reachable only from the VNet |
Azure Container Registry
Regardless of which compute platform you pick, store your images in Azure Container Registry. The Basic tier is enough for early-stage teams. Use a separate registry per environment (production, staging) if you want hard isolation, or a single registry with separate repositories and role-based access control if you want simplicity.
Data platform: relational, document, vector, analytics
Data platform decisions are the ones most likely to be permanent. The schema you ship in month one shapes every feature for the next two years. Pick a default that's flexible enough to grow with the product, and resist the temptation to pre-pick a specialized database for a feature you haven't built yet.
| Your workload | Default Azure service | Why | Revisit when |
|---|---|---|---|
| Transactional application data (users, orders, content) with a known relational schema | Azure Database for PostgreSQL (Flexible Server) | Mature, widely understood, strong extension ecosystem (including pgvector for embeddings). Burstable tier is cheap enough for development and staging. | Multi-region write or hyper-scale read patterns. Consider Azure Cosmos DB for PostgreSQL. |
| Schema-flexible operational data, global distribution, predictable single-digit-millisecond reads | Azure Cosmos DB (NoSQL API) | Multi-region replication is built in and can be turned on per account. Serverless tier is cheap enough to start on. Partition design matters; read the partition key guidance before you ship. | You're forcing relational joins through the application layer. PostgreSQL is probably the right answer. |
| Search across structured and unstructured content, including retrieval-augmented generation | Azure AI Search | Hybrid keyword and vector search. Integrates with Azure OpenAI Service and Cosmos DB. Free tier exists for prototyping. | You exceed the per-tier index limits (Standard 1 is a common upgrade point). |
| Vector embeddings for a retrieval-augmented generation feature | Start with pgvector in PostgreSQL or Azure AI Search | Avoid a separate vector database for the first version of a retrieval feature. You'll learn what you actually need (filtering, hybrid search, scale) from real usage. | You've characterized your read patterns and the constraints justify a specialized engine. |
| Analytics, reporting, and ad-hoc queries over production data | Azure Database for PostgreSQL read replica (Explore), Microsoft Fabric (Expand and Extract) | A read replica is enough for most Explore-stage analytics. Microsoft Fabric is the modern analytics platform once you outgrow that. | Your read replicas can't keep up, or business stakeholders need a self-serve analytics surface. |
| Caching layer in front of a database | Azure Cache for Redis (Basic tier) | Standard caching primitive. Cheap to add later; don't add speculatively. | You see a clear hot read pattern that's saturating the database. Measure before adding. |
Important
Pick one default database and stay on it for as long as you can. A team that runs PostgreSQL, Cosmos DB, Redis, AI Search, a queue, and a graph database at fifteen engineers has accidentally bought itself a platform team's worth of work.
Where Azure OpenAI Service fits
Azure OpenAI Service is not a data platform, but it shares the same decision rhythm. Most startups building a generative AI feature start with one model deployment (a recent chat completion model) in a single region, plus AI Search or pgvector for retrieval. You don't need a dedicated fine-tuning pipeline, a model gateway, or multiple deployments until usage tells you to add them.
What this article covers (and what it does not)
| Topic | In this article | When to add it |
|---|---|---|
| Identity and access management beyond the basics | No | Day one for Microsoft Entra ID setup. Conditional access and Privileged Identity Management when you have an information security review. |
| Infrastructure as Code (Bicep, Terraform) | No | When manual portal changes start to drift between environments. Usually around the time you add staging. |
| Continuous integration and continuous deployment pipelines | No | Day one. GitHub Actions or Azure DevOps Pipelines are both fine. |
| Observability (logs, metrics, traces) | No | Application Insights from day one. Azure Monitor workbooks when you have alert fatigue. |
| Cost management | No | Set a subscription-level budget on day one. Tag resources with environment and owner from the start. |
| Compliance (SOC 2, ISO 27001, HIPAA) | No | When a customer asks. Microsoft Defender for Cloud has a compliance dashboard that maps controls to Azure resources. |
| Disaster recovery and multi-region | No | When the cost of an hour of downtime exceeds the engineering cost of the second region. |
When the platform defaults are no longer enough
These growth signals tell you that a specific default needs a more deliberate replacement:
- You've deployed more than five distinct services on App Service or Container Apps and per-service scale is becoming a daily concern. Look at Azure Kubernetes Service.
- Your monthly Azure bill is growing faster than your monthly revenue for two months in a row. Time for a Cost Management review and Reserved Instance or Savings Plan analysis.
- Your virtual network now spans multiple subscriptions or regions. Look at Azure Virtual WAN and a hub-and-spoke topology.
- A single PostgreSQL instance can't hold your working set in memory and read replicas don't close the gap. Look at Cosmos DB for PostgreSQL or a sharded architecture.
- Analytics queries on the production database are noticeably affecting application latency. Move analytics to Microsoft Fabric.
- You're running more than two storage accounts per environment for the same access pattern. Consolidate.
- You've added a third country with paying customers. Time to evaluate a second region, geo-redundant storage, and a Front Door routing strategy.
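The cost signal in the list above (bill growing faster than revenue for two months in a row) is easy to misjudge by eye, so here is a minimal sketch of the arithmetic. It assumes you have monthly totals for both series, oldest first, with every value greater than zero. The function names are hypothetical.

```python
def growth_rates(series):
    """Month-over-month growth rates for a list of monthly totals."""
    if any(value <= 0 for value in series):
        # Growth from a zero or negative base isn't meaningful for this check.
        raise ValueError("All monthly totals must be greater than zero")
    return [(curr - prev) / prev for prev, curr in zip(series, series[1:])]


def cost_outpacing_revenue(monthly_cost, monthly_revenue, months=2):
    """True if cost grew faster than revenue in each of the last `months` periods."""
    if len(monthly_cost) != len(monthly_revenue):
        raise ValueError("Cost and revenue series must cover the same months")
    if len(monthly_cost) < months + 1:
        return False  # Not enough history to decide.
    cost_growth = growth_rates(monthly_cost)[-months:]
    revenue_growth = growth_rates(monthly_revenue)[-months:]
    return all(c > r for c, r in zip(cost_growth, revenue_growth))
```

Note that the comparison is between growth rates, not absolute amounts. A bill that rises from 1,000 to 1,300 has outpaced revenue that rises from 5,000 to 5,500, even though revenue grew by more in absolute terms.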
Note
Resist the temptation to adopt enterprise platform tooling early. Most of the preceding patterns (service mesh, multi-region active-active, financial operations tooling, custom Kubernetes operators) add operational surface area that pays off only at scale. Add them when you have the team to maintain them, not before.
Reference checklist
Run through this once a month for the first six months on Azure. It catches the most common drift.
- One subscription per environment (production, staging, development), or one subscription with strict resource groups per environment. Don't mix.
- Every resource is tagged with environment, owner, and cost center (even if cost center is the same value for everything today).
- A subscription-level budget with alerts at 50%, 80%, and 100% of the monthly target in Cost Management.
- Microsoft Entra ID groups, not individuals, hold role assignments on resource groups. No standing Owner at the subscription scope.
- Application Insights or Azure Monitor is enabled on every production compute resource.
- Production database backups are verified by a documented restore test (at least once).
- Secrets are in Azure Key Vault, not in application configuration. Use managed identities for the compute-to-Key-Vault path.
- Container images are scanned (Microsoft Defender for Containers or your registry's built-in scanner).
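Two of the checklist items above (required tags and budget alert thresholds) lend themselves to a scripted check. The sketch below is one possible shape for that, not a replacement for the alerts you configure in Cost Management. The tag key spellings (`environment`, `owner`, `cost-center`) and the function names are assumptions for this example.

```python
# Hypothetical tag keys; use whatever spelling your team standardizes on.
REQUIRED_TAGS = ("environment", "owner", "cost-center")

# Alert thresholds from the checklist, as fractions of the monthly budget.
ALERT_THRESHOLDS = (0.5, 0.8, 1.0)


def missing_tags(resource_tags):
    """Return the required tag keys that are absent or empty on a resource."""
    present = {key.lower() for key, value in resource_tags.items() if value}
    return [tag for tag in REQUIRED_TAGS if tag not in present]


def crossed_thresholds(spend_to_date, monthly_budget):
    """Return every alert threshold that current spend has reached."""
    if monthly_budget <= 0:
        raise ValueError("monthly_budget must be greater than zero")
    ratio = spend_to_date / monthly_budget
    return [threshold for threshold in ALERT_THRESHOLDS if ratio >= threshold]
```

Running `missing_tags` over an exported resource list once a month is a cheap way to catch tagging drift before it makes the cost reports unreadable.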