Azure Platform Foundations for Startups

This article delivers a quick orientation with a startup lens across five foundational platform pillars: compute, networking, storage, containers, and data. For each pillar, you get a short decision table that maps your situation to a default service, plus a note on when to revisit that choice as your startup scales.

In this article, you learn how to:

Pick the right Azure compute, networking, and storage primitives for your stage.
Decide whether Azure Kubernetes Service is the right container platform for your stage, and what to use instead if it isn't.
Pick a first data platform across relational, document, vector, and analytics workloads.
Apply a small set of cost, reliability, and security defaults that hold up as you grow.
Recognize the signals that tell you a default choice has outgrown its usefulness.

Prerequisites

An active Azure subscription.
The Azure CLI installed and signed in. To sign in, run az login.
Owner or Contributor access on at least one resource group.
Familiarity with the Azure portal landing page is helpful but not required.
Basic familiarity with Azure fundamentals (regions, subscriptions, and resource groups).

Why this matters for startups

At an early-stage startup, the cost of a wrong infrastructure choice isn't the bill. It's the week you lose migrating off the wrong service after you've built around it. The five pillars in this article are the ones where a default chosen badly tends to compound: the wrong compute platform shapes your deployment pipeline; the wrong database constrains your data model; the wrong network topology blocks your first enterprise customer. You don't need a platform team, a site reliability engineer, or a financial operations specialist to make these choices well. You need a default that's good enough to ship on, and a clear signal that tells you when to revisit it. If your startup is in the Microsoft for Startups program, the same defaults stretch your credits further and keep you eligible for advanced benefits later.

Compute: where your code runs

Azure has more than a dozen compute services. The good news: for most early-stage workloads, three of them cover what you need.

Your situation	Default Azure service	Why	Revisit when
Web app or HTTP API, one or two services, you want a managed runtime	Azure App Service (Linux)	No container build required. Built-in TLS, custom domains, deployment slots, and autoscale. The Free and Basic tiers are cheap enough to run a staging environment, but deployment slots and autoscale require the Standard tier or higher.	You want to run more than a handful of services, need per-service scale, or need sidecars.
Event-driven function, scheduled job, or webhook handler	Azure Functions (Consumption plan)	Pay per execution. Scales to zero. Bindings remove most glue code for queues, blobs, and HTTP triggers.	Cold starts hurt your user-facing latency or you exceed Consumption plan limits.
Containerized microservices, you want an opinionated runtime without managing Kubernetes	Azure Container Apps	Built on Kubernetes with KEDA-based autoscaling, but you don't manage the cluster. Dapr is available as an optional integration. Scale-to-zero, revisions, and HTTPS ingress included.	You need cluster-level control, a custom scheduler, or advanced networking.
Long-running batch, GPU training, or lift-and-shift of an existing virtual machine workload	Azure Virtual Machines	Full operating system control. Use a virtual machine scale set when you need horizontal scale.	Operational overhead of patching and image management starts to slow shipping.
You're sure you need Kubernetes (read the Containers and Azure Kubernetes Service section before assuming this)	Azure Kubernetes Service	Managed control plane. Fits teams that already have Kubernetes experience or specific platform requirements.	See the Containers section for AKS decision criteria.

Tip

Start with App Service for your first user-facing web app and Azure Functions for everything event-driven. You can move to Azure Container Apps or Azure Kubernetes Service later without changing your application code, if you keep the application stateless and read configuration from environment variables.
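
The tip's advice to keep configuration in environment variables can be sketched as follows. This is a minimal illustration in Python, not a prescribed pattern. The variable names DATABASE_URL, STORAGE_ACCOUNT_URL, and LOG_LEVEL are hypothetical and don't come from any Azure convention.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Runtime configuration, read once at startup and never mutated."""
    database_url: str
    storage_account_url: str
    log_level: str


def load_settings(env=None):
    """Build Settings from environment variables.

    The variable names below are hypothetical examples.
    Pass a dict as `env` to test without touching the real environment.
    """
    env = os.environ if env is None else env
    required = ("DATABASE_URL", "STORAGE_ACCOUNT_URL")
    missing = [name for name in required if name not in env]
    if missing:
        # Fail fast at startup instead of on the first request.
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        storage_account_url=env["STORAGE_ACCOUNT_URL"],
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Because the application only sees environment variables, the same code can run unchanged whether the values come from App Service app settings, Container Apps environment variables, or a Kubernetes pod spec.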

Choosing between Container Apps and App Service

Container Apps and App Service overlap. The honest tiebreaker: if your application already runs as a container image and you want per-service scale (a different replica count for the web tier than for the worker), Container Apps wins. If your application is a single web process and you don't want to maintain a Dockerfile, App Service wins.

Caution

Consider Azure Kubernetes Service only when you have clear requirements that simpler options don't meet. It offers strong flexibility and control, but it also adds operational work such as upgrades, node pool sizing, ingress configuration, and certificate management. Teams that adopt it too early often spend more time on platform management than on building product features.

Networking: what to set up on day one

Most early-stage Azure workloads don't need a virtual network on day one. App Service, Functions, Container Apps, and most managed databases give you public endpoints with TLS that are safe to expose, as long as you set authentication correctly. Adding networking complexity before you have a reason is the most common premature optimization on Azure.

Your situation	Default approach	Why	Revisit when
Brand new app, public web traffic, no compliance requirements yet	Use the public endpoint with TLS. No virtual network.	Lowest operational overhead. App Service, Container Apps, and managed databases handle TLS for you. Use Microsoft Entra ID for authentication.	Your first enterprise customer asks for private connectivity.
You need a private connection between your app and a managed database	Virtual network integration on the compute side, private endpoint on the database side	Traffic stays on Microsoft's backbone. No public exposure for the database. Same managed service, no application change.	Day one if you handle protected data; otherwise when an audit or customer asks.
You need a single public entry point that fronts multiple back-ends with routing, TLS termination, and a web application firewall	Azure Front Door (global) or Azure Application Gateway (regional)	Front Door adds a global content delivery network and edge caching. Application Gateway is the regional, virtual-network-native option.	You've outgrown App Service's built-in TLS and routing.
You need outbound static IP addresses (a payment processor allowlist, for example)	NAT gateway attached to your virtual network	Predictable outbound IP. Required by many third-party APIs.	A vendor requires it. Don't add it speculatively.
Multi-region or multi-subscription topology	Virtual network peering or Azure Virtual WAN	Real network architecture starts here. Out of scope for most Explore-stage teams.	Multi-region is a real requirement, not an aspiration.

Important

Lock down Microsoft Entra ID and your subscription role assignments before you worry about network isolation. Most Azure security incidents at small companies come from an over-permissioned identity, not a network exposure. Use Microsoft Entra ID groups for engineering access, and don't hand out Owner at the subscription scope.

Storage: blobs, files, and disks

Azure Storage is a single resource type (the storage account) that exposes four data services: blobs, files, queues, and tables. For application storage decisions, you're almost always picking between blobs (object storage), files (managed file shares), and managed disks (block storage attached to a virtual machine).

What you're storing	Default Azure service	Why	Revisit when
User-uploaded files, generated reports, logs, model artifacts, backups	Azure Blob Storage (hot tier)	Object storage. Cheap, durable, scales to petabytes. Use cool or archive tiers later for data you don't read often.	You need POSIX file semantics or random read-write into a single file from multiple machines.
A shared file system mounted by multiple virtual machines or containers	Azure Files (Standard) or Azure NetApp Files (high-throughput)	Server Message Block (SMB) or Network File System (NFS) shared volumes. Avoid using these for content that fits the blob model.	You start using a file share as a queue or a database. Move to the right primitive.
Disks for a virtual machine	Premium SSD v2 managed disks	Tunable performance, good price-performance for application disks. Premium SSD v2 cannot be used as the OS disk; pair it with Premium SSD (v1) or Standard SSD for the OS. Standard SSD is acceptable for low-throughput workloads.	You need shared block storage across virtual machines (use Azure Elastic SAN or Azure NetApp Files).
Static website assets (single-page application bundle, marketing site, documentation)	Azure Static Web Apps	Static Web Apps is the modern default: built-in custom domains, free managed TLS, global distribution, GitHub Actions CI/CD, and built-in authentication. Azure Storage static website hosting with Azure Front Door still works for very low-cost setups, but it doesn't natively support custom headers or authentication.	You add server-rendered pages. Move to App Service or Container Apps.

Note

Storage accounts have a soft limit of 250 per region per subscription (raisable to 500 by request). That's plenty for early-stage teams. The mistake to avoid is creating one storage account per microservice; group by environment (production, staging, development) and by access pattern (hot, cold, archive) instead.
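
The grouping advice in the note (one account per environment and access pattern, not per microservice) can be sketched as a small naming helper. This is an illustration only: the prefix-environment-pattern scheme and the short environment labels are hypothetical. The one hard rule it enforces is that a storage account name must be 3 to 24 characters long and contain only lowercase letters and digits.

```python
import re

# Short labels for the environments and access patterns named in the note.
ENVIRONMENTS = ("prod", "staging", "dev")
ACCESS_PATTERNS = ("hot", "cold", "archive")


def storage_account_name(prefix, environment, access_pattern):
    """Compose one account name per (environment, access pattern) pair.

    The naming scheme is a hypothetical convention, not an Azure requirement.
    """
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment}")
    if access_pattern not in ACCESS_PATTERNS:
        raise ValueError(f"unknown access pattern: {access_pattern}")
    name = f"{prefix}{environment}{access_pattern}".lower()
    # Storage account names: 3-24 characters, lowercase letters and digits only.
    if not re.fullmatch(r"[a-z0-9]{3,24}", name):
        raise ValueError(f"invalid storage account name: {name}")
    return name


def planned_accounts(prefix):
    """List every account the grouping scheme would create for a prefix."""
    return [
        storage_account_name(prefix, env, pattern)
        for env in ENVIRONMENTS
        for pattern in ACCESS_PATTERNS
    ]
```

With three environments and three access patterns, the scheme tops out at nine accounts no matter how many microservices you add, which stays far below the per-region limit.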

A note on backups

Azure Backup policies and storage redundancy options (locally redundant storage, zone-redundant storage, and geo-redundant storage) are configurable per storage account and per disk. Locally redundant storage (LRS) is fine for development and staging. Use zone-redundant storage (ZRS) for production data. Geo-redundant storage (GRS) adds cost and isn't a substitute for application-level disaster recovery.
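
One way to keep the redundancy defaults above consistent is to encode them once and look them up wherever accounts are provisioned. The sketch below assumes three environment labels (development, staging, production), which are hypothetical. Standard_LRS and Standard_ZRS are the storage account SKU names for locally redundant and zone-redundant storage.

```python
# Redundancy default per environment, following the guidance above.
REDUNDANCY_BY_ENVIRONMENT = {
    "development": "Standard_LRS",
    "staging": "Standard_LRS",
    "production": "Standard_ZRS",
}


def redundancy_sku(environment):
    """Return the default storage SKU for an environment label."""
    try:
        return REDUNDANCY_BY_ENVIRONMENT[environment]
    except KeyError:
        # Refuse to guess: an unknown environment should be a deliberate choice.
        raise ValueError(f"no redundancy default for: {environment}") from None
```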

Containers and Azure Kubernetes Service

Azure has three ways to run containers in production: Azure Container Apps, Azure Container Instances, and Azure Kubernetes Service. They map to different team sizes and operational appetites.

Your situation	Default Azure service	Why	When it starts to hurt
You have container images and you want a managed runtime with HTTPS ingress, scale-to-zero, and revisions	Azure Container Apps	Serverless platform on Kubernetes with KEDA autoscaling, but you don't see or manage the cluster. Pay for what runs. Good fit until you hit cluster-level requirements. Dapr is available as an opt-in integration.	You need custom schedulers, advanced networking (multiple network interface cards, custom Container Network Interface plugins), or specific Kubernetes operators.
You want to run a single container as a one-off job or a short batch	Azure Container Instances	Fastest path from image to running container. No orchestration. Charged per second of runtime.	You need anything resembling a service mesh or autoscaling beyond a single container.
You already operate Kubernetes elsewhere, or your application architecture genuinely requires it	Azure Kubernetes Service	Managed control plane. Bring your own node pools, network plugin, ingress controller, and observability stack.	Day one. Plan for ongoing upgrades (a new minor version is released roughly every four months), node pool tuning, and certificate management.
You're unsure whether you need Kubernetes	Container Apps for now	You can rebuild on Azure Kubernetes Service later if needed. Lifting a stateless containerized application is days of work, not weeks.	You have a concrete need (operator ecosystem, cluster-level policy) you can name. "Future-proofing" isn't a concrete need.

When to graduate to AKS

Move to Azure Kubernetes Service (AKS) when at least two of these are true:

You run more than ten services with shared lifecycle and networking concerns.
You need custom controllers, sidecars, or Custom Resource Definitions (CRDs) that Container Apps doesn't expose.
You require deep virtual network integration with strict policy enforcement.
You're standardizing on a Kubernetes-based open source ecosystem (Argo, Istio, KEDA, and so on).
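
The "at least two of these are true" rule can be written down as a simple check. This is a sketch of the rule as stated above, not a tool that measures anything. The class and field names are hypothetical, and the answers still come from your own judgment.

```python
from dataclasses import dataclass


@dataclass
class PlatformSignals:
    """Answers to the four graduation criteria (field names are hypothetical)."""
    service_count: int
    needs_custom_controllers_or_crds: bool
    needs_strict_vnet_policy: bool
    standardizing_on_k8s_ecosystem: bool


def should_consider_aks(signals, threshold=2):
    """Return True when at least `threshold` of the four criteria hold."""
    checks = [
        signals.service_count > 10,  # "more than ten services"
        signals.needs_custom_controllers_or_crds,
        signals.needs_strict_vnet_policy,
        signals.standardizing_on_k8s_ecosystem,
    ]
    return sum(checks) >= threshold
```

For example, a team with twelve services and a need for Custom Resource Definitions meets two criteria, while a team with twelve services and nothing else meets only one and stays on Container Apps.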

If you adopt AKS, follow the AKS Baseline architecture. The Microsoft Azure Well-Architected Framework and the AKS Baseline reference together cover the security, scaling, and upgrade defaults you want.

AKS defaults for a small team

Setting	Default	Why
Node size	Standard_D4s_v5 system pool, Standard_D8s_v5 user pool	Predictable price-to-performance for general workloads
Cluster autoscaler	Enabled	Avoid paying for idle nodes
Workload Identity	Enabled	Replaces pod identity, integrates with Microsoft Entra ID
Azure Policy add-on	Enabled	Free guardrails (no privileged containers, required labels)
Container insights	Enabled	First-class metrics and logs in Azure Monitor
Private cluster	On for production	Control plane reachable only from the VNet

Azure Container Registry

Regardless of which compute platform you pick, store your images in Azure Container Registry. The Basic tier is enough for early-stage teams. Use a separate registry per environment (production, staging) if you want hard isolation, or a single registry with separate repositories and role-based access control if you want simplicity.

Data platform: relational, document, vector, analytics

Data platform decisions are the ones most likely to be permanent. The schema you ship in month one shapes every feature for the next two years. Pick a default that's flexible enough to grow with the product, and resist the temptation to pre-pick a specialized database for a feature you haven't built yet.

Your workload	Default Azure service	Why	Revisit when
Transactional application data (users, orders, content) with a known relational schema	Azure Database for PostgreSQL (Flexible Server)	Mature, widely understood, strong extension ecosystem (including pgvector for embeddings). Burstable tier is cheap enough for development and staging.	Multi-region write or hyper-scale read patterns. Consider Azure Cosmos DB for PostgreSQL.
Schema-flexible operational data, global distribution, predictable single-digit-millisecond reads	Azure Cosmos DB (NoSQL API)	Multi-region replication is built in and enabled per account. Serverless capacity mode is cheap enough to start on. Partition design matters; read the partition key guidance before you ship.	You're forcing relational joins through the application layer. PostgreSQL is probably the right answer.
Search across structured and unstructured content, including retrieval-augmented generation	Azure AI Search	Hybrid keyword and vector search. Integrates with Azure OpenAI Service and Cosmos DB. Free tier exists for prototyping.	You exceed the per-tier index limits (Standard 1 is a common upgrade point).
Vector embeddings for a retrieval-augmented generation feature	Start with pgvector in PostgreSQL or Azure AI Search	Avoid a separate vector database for the first version of a retrieval feature. You'll learn what you actually need (filtering, hybrid search, scale) from real usage.	You've characterized your read patterns and the constraints justify a specialized engine.
Analytics, reporting, and ad-hoc queries over production data	Azure Database for PostgreSQL read replica (Explore), Microsoft Fabric (Expand and Extract)	A read replica is enough for most Explore-stage analytics. Microsoft Fabric is the modern analytics platform once you outgrow that.	Your read replicas can't keep up, or business stakeholders need a self-serve analytics surface.
Caching layer in front of a database	Azure Cache for Redis (Basic tier)	Standard caching primitive. Cheap to add later; don't add speculatively.	You see a clear hot read pattern that's saturating the database. Measure before adding.

Important

Pick one default database and stay on it for as long as you can. A team that runs PostgreSQL, Cosmos DB, Redis, AI Search, a queue, and a graph database at fifteen engineers has accidentally bought itself a platform team's worth of work.

Where Azure OpenAI Service fits

Azure OpenAI Service is not a data platform, but it shares the same decision rhythm. Most startups building a generative AI feature start with one model deployment (a recent chat completion model) in a single region, plus AI Search or pgvector for retrieval. You don't need a dedicated fine-tuning pipeline, a model gateway, or multiple deployments until usage tells you to add them.

What this article covers (and what it does not)

Topic	In this article	When to add it
Identity and access management beyond the basics	No	Day one for Microsoft Entra ID setup. Conditional access and Privileged Identity Management when you have an information security review.
Infrastructure as Code (Bicep, Terraform)	No	When manual portal changes start to drift between environments. Usually around the time you add staging.
Continuous integration and continuous deployment pipelines	No	Day one. GitHub Actions or Azure Pipelines are both fine.
Observability (logs, metrics, traces)	No	Application Insights from day one. Azure Monitor workbooks when you have alert fatigue.
Cost management	No	Set a subscription-level budget on day one. Tag resources with environment and owner from the start.
Compliance (SOC 2, ISO 27001, HIPAA)	No	When a customer asks. Microsoft Defender for Cloud has a compliance dashboard that maps controls to Azure resources.
Disaster recovery and multi-region	No	When the cost of an hour of downtime exceeds the engineering cost of the second region.

When the platform defaults are no longer enough

These growth signals tell you that a specific default needs a more deliberate replacement:

You've deployed more than five distinct services on App Service or Container Apps and per-service scale is becoming a daily concern. Look at Azure Kubernetes Service.
Your monthly Azure bill is growing faster than your monthly revenue for two months in a row. Time for a Cost Management review and an analysis of Azure reservations or a savings plan.
Your virtual network now spans multiple subscriptions or regions. Look at Azure Virtual WAN and a hub-and-spoke topology.
A single PostgreSQL instance can't hold your working set in memory and read replicas don't close the gap. Look at Cosmos DB for PostgreSQL or a sharded architecture.
Analytics queries on the production database are noticeably affecting application latency. Move analytics to Microsoft Fabric.
You're running more than two storage accounts per environment for the same access pattern. Consolidate.
You've added a third country with paying customers. Time to evaluate a second region, geo-redundant storage, and a Front Door routing strategy.
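
The cost signal in the list above (bill growth outpacing revenue growth for two consecutive months) is easy to check from exported monthly totals. The sketch below assumes you already have both series as chronological lists of positive numbers. The function names are hypothetical, and "growing faster" is interpreted here as a higher month-over-month percentage growth.

```python
def growth_rates(values):
    """Month-over-month fractional growth for a chronological series."""
    if any(v <= 0 for v in values):
        # Percentage growth is undefined from a zero or negative base.
        raise ValueError("values must be positive")
    return [(cur - prev) / prev for prev, cur in zip(values, values[1:])]


def bill_outpacing_revenue(monthly_bill, monthly_revenue, months=2):
    """True when bill growth exceeded revenue growth in each of the last `months` months."""
    if len(monthly_bill) != len(monthly_revenue):
        raise ValueError("series must cover the same months")
    if len(monthly_bill) < months + 1:
        return False  # Not enough history to evaluate the signal.
    bill_growth = growth_rates(monthly_bill)[-months:]
    revenue_growth = growth_rates(monthly_revenue)[-months:]
    return all(b > r for b, r in zip(bill_growth, revenue_growth))
```

A bill of 1,000, 1,200, then 1,500 against revenue of 5,000, 5,500, then 6,000 trips the signal: the bill grew 20% and 25% while revenue grew 10% and about 9%.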