Reliable Web App pattern for .NET

Azure App Service
Azure Front Door
Azure Cache for Redis
.NET

This article provides guidance on implementing the Reliable Web App pattern. This pattern outlines how to modify (replatform) web apps for cloud migration. It offers prescriptive architecture, code, and configuration guidance aligned with the principles of the Well-Architected Framework.

Why the Reliable Web App pattern for .NET?

The Reliable Web App pattern is a set of principles and implementation techniques that define how you should replatform web apps when migrating to the cloud. It focuses on the minimal code updates you need to make to be successful in the cloud. The following guidance uses the reference implementation as an example throughout and follows the replatform journey of the fictional company, Relecloud, to provide business context for your journey. Before implementing the Reliable Web App pattern for .NET, Relecloud had a monolithic, on-premises ticketing web app that used the ASP.NET framework.

Tip

There's a reference implementation (sample) of the Reliable Web App pattern. It represents the end state of the Reliable Web App implementation for a fictional company named Relecloud. It's a production-grade web app that features all the code, architecture, and configuration updates discussed in this article. Deploy and use the reference implementation to guide your implementation of the Reliable Web App pattern.

How to implement the Reliable Web App pattern

This article includes architecture, code, and configuration guidance to implement the Reliable Web App pattern. Use the following links to navigate to the specific guidance you need:

  • Business context: Align this guidance with your business context and learn how to define immediate and long-term goals that drive replatforming decisions.
  • Architecture guidance: Learn how to select the right cloud services and design an architecture that meets your business requirements.
  • Code guidance: Implement three design patterns to improve the reliability and performance efficiency of your web app in the cloud: the Retry, Circuit Breaker, and Cache-Aside patterns.
  • Configuration guidance: Configure authentication and authorization, managed identities, rightsized environments, infrastructure as code, and monitoring.

Business context

The first step in replatforming a web app is to define your business objectives. You should set immediate goals, such as service level objectives and cost optimization targets, as well as future goals for your web application. These objectives influence your choice of cloud services and the architecture of your web application in the cloud. Define a target SLO for your web app, such as 99.9% uptime. Calculate the composite SLA for all the services that affect the availability of your web app.
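
To make the composite SLA calculation concrete, here's a minimal sketch. It assumes every listed service sits in the request path and that the services fail independently, so their availabilities multiply. The service names and percentages are illustrative placeholders, not published SLA values. Substitute the current SLA figures for the services you actually use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative availability figures only (assumed, not published SLAs).
var serviceSlas = new Dictionary<string, double>
{
    ["Application platform"] = 0.9995,
    ["Database"] = 0.9999,
    ["Cache"] = 0.999,
};

// Services in the same request path are assumed to fail independently,
// so the composite availability is the product of the individual values.
static double CompositeSla(IEnumerable<double> slas) =>
    slas.Aggregate(1.0, (product, sla) => product * sla);

const double targetSlo = 0.999;
double composite = CompositeSla(serviceSlas.Values);

Console.WriteLine($"Composite SLA: {composite:P3}");
Console.WriteLine(composite >= targetSlo
    ? "The composite SLA meets the SLO."
    : "The composite SLA is below the SLO. Consider adding redundancy.");
```

With these placeholder numbers, the composite value falls just below a 99.9% target even though each individual service meets or exceeds it. That gap is the reason to calculate the composite figure rather than compare each service SLA to the SLO separately.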

For example, Relecloud has a positive sales forecast and anticipates increased demand on their ticketing web app. To meet this demand, they defined the following goals for the web application:

  • Apply low-cost, high-value code changes
  • Reach a service level objective (SLO) of 99.9%
  • Adopt DevOps practices
  • Create cost-optimized environments
  • Improve reliability and security

Relecloud's on-premises infrastructure wasn't a cost-effective solution to reach these goals. So, they decided that migrating their web application to Azure was the most cost-effective way to achieve their immediate and future objectives.

Architecture guidance

The Reliable Web App pattern has a few essential architectural elements. You need DNS to manage endpoint resolution, a web application firewall to block malicious HTTP traffic, and a load balancer to protect and route inbound user requests. The application platform hosts your web app code and makes calls to all the backend services through private endpoints in a virtual network. An application performance monitoring tool captures metrics and logs to understand your web app.

Figure 1. Essential architectural elements of the Reliable Web App pattern.

Design the architecture

Design your infrastructure to support your recovery metrics, such as recovery time objective (RTO) and recovery point objective (RPO). The RTO affects availability and must support your SLO. Determine an RPO and configure data redundancy to meet it.

  • Choose infrastructure reliability. Determine how many availability zones and regions you need to meet your availability needs. Add availability zones and regions until the composite SLA meets your SLO. The Reliable Web App pattern supports multiple regions for an active-active or active-passive configuration. For example, the reference implementation uses an active-passive configuration to meet an SLO of 99.9%.

    For a multi-region web app, configure your load balancer to route traffic to the second region to support either an active-active or active-passive configuration depending on your business need. The two regions require the same services except one region has a hub virtual network that connects the regions. Adopt a hub-and-spoke network topology to centralize and share resources, such as a network firewall. If you have virtual machines, add a bastion host to the hub virtual network to manage them securely (see figure 2).

    Figure 2. The Reliable Web App pattern with a second region and a hub-and-spoke topology.

  • Choose a network topology. Choose the right network topology for your web and networking requirements. If you plan on having multiple virtual networks, use a hub-and-spoke network topology. It provides cost, management, and security benefits with hybrid connectivity options to on-premises and virtual networks.
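
As a rough illustration of why adding a region raises availability, suppose a single region has composite availability A, the regions fail independently, and a global load balancer with availability A_lb fronts n redundant regions. Under those assumptions (which ignore shared dependencies and failover time, so treat the result as an upper bound, especially for active-passive setups), the total availability is approximately the following. The numbers are hypothetical, not published SLA values.

```latex
A_{\text{total}} = A_{\text{lb}} \cdot \bigl(1 - (1 - A)^{n}\bigr)

% Hypothetical values: A = 0.9984, A_lb = 0.9999, n = 2
A_{\text{total}} = 0.9999 \cdot \bigl(1 - (0.0016)^{2}\bigr)
                 = 0.9999 \cdot 0.99999744
                 \approx 0.9999
```

In this example, a single region at 99.84% would miss a 99.9% SLO, while two regions behind the load balancer would meet it.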

Pick the right Azure services

When you move a web app to the cloud, you should select Azure services that meet your business requirements and align with the current features of the on-premises web app. The alignment helps minimize the replatforming effort. For example, use services that allow you to keep the same database engine and support existing middleware and frameworks. The following sections provide guidance for selecting the right Azure services for your web app.

For example, before the move to the cloud, Relecloud's ticketing web app was an on-premises, monolithic, ASP.NET app. It ran on two virtual machines and had a Microsoft SQL Server database. The web app suffered from common challenges in scalability and feature deployment. This starting point, their business goals, and SLO drove their service choices.

  • Application platform: Use Azure App Service as your application platform. Relecloud chose Azure App Service as the application platform for the following reasons:

    • High service level agreement (SLA): It has a high SLA that meets the production environment SLO of 99.9%.
    • Reduced management overhead: It's a fully managed solution that handles scaling, health checks, and load balancing.
    • .NET support: It supports the version of .NET that the application is written in.
    • Containerization capability: The web app can converge on the cloud without containerizing, but the application platform also supports containerization without changing Azure services.
    • Autoscaling: The web app can automatically scale in and out based on user traffic and configuration settings. The platform also supports scaling up or down to accommodate different hosting requirements.
  • Identity management: Use Microsoft Entra ID as your identity and access management solution. Relecloud chose Microsoft Entra ID for the following reasons:

    • Authentication and authorization: The application needs to authenticate and authorize call center employees.
    • Scalable: It scales to support larger scenarios.
    • User-identity control: Call center employees can use their existing enterprise identities.
    • Authorization protocol support: It supports OAuth 2.0 for managed identities.
  • Database: Use a service that allows you to keep the same database engine. Use the data store decision tree. Relecloud's web app used SQL Server on-premises. So they wanted to use the existing database schema, stored procedures, and functions. Several SQL products are available on Azure, but Relecloud chose Azure SQL Database for the following reasons:

    • Reliability: The general-purpose tier provides a high SLA and multi-region redundancy. It can support a high user load.
    • Reduced management overhead: It provides a managed SQL database instance.
    • Migration support: It supports database migration from on-premises SQL Server.
    • Consistency with on-premises configurations: It supports the existing stored procedures, functions, and views.
    • Resiliency: It supports backups and point-in-time restore.
    • Expertise and minimal rework: SQL Database takes advantage of in-house expertise and requires minimal work to adopt.
  • Application performance monitoring: Use Application Insights to analyze telemetry on your application. Relecloud chose to use Application Insights for the following reasons:

    • Integration with Azure Monitor: It provides the best integration with Azure Monitor.
    • Anomaly detection: It automatically detects performance anomalies.
    • Troubleshooting: It helps you diagnose problems in the running app.
    • Monitoring: It collects information about how users are using the app and allows you to easily track custom events.
    • Visibility gap: The on-premises solution didn't have an application performance monitoring solution. Application Insights provides easy integration with the application platform and code.
  • Cache: Choose whether to add cache to your web app architecture. Azure Cache for Redis is Azure's primary cache solution. It's a managed in-memory data store based on the Redis software. Relecloud's web app load is heavily skewed toward viewing concerts and venue details, and it added Azure Cache for Redis for the following reasons:

    • Reduced management overhead: It's a fully managed service.
    • Speed and volume: It has high data throughput and low-latency reads for commonly accessed, slow-changing data.
    • Diverse supportability: It's a unified cache location for all instances of the web app to use.
    • External data store: The on-premises application servers performed VM-local caching. This setup didn't offload highly frequented data, and it couldn't invalidate data.
    • Nonsticky sessions: Externalizing session state supports nonsticky sessions.
  • Load balancer: Web applications using PaaS solutions should use Azure Front Door, Azure Application Gateway, or both based on web app architecture and requirements. Use the load balancer decision tree to pick the right load balancer. Relecloud needed a multi-region web app to meet the SLO of 99.9%, so it needed a layer-7 load balancer that could route traffic across multiple regions. Relecloud chose Azure Front Door for the following reasons:

    • Global load balancing: It's a layer-7 load balancer that can route traffic across multiple regions.
    • Web application firewall: It integrates natively with Azure Web Application Firewall.
    • Routing flexibility: It allows the application team to configure ingress needs to support future changes in the application.
    • Traffic acceleration: It uses anycast to reach the nearest Azure point of presence and find the fastest route to the web app.
    • Custom domains: It supports custom domain names with flexible domain validation.
    • Health probes: The application needs intelligent health probe monitoring. Azure Front Door uses responses from the probe to determine the best origin for routing client requests.
    • Monitoring support: It supports built-in reports with an all-in-one dashboard for both Front Door and security patterns. You can configure alerts that integrate with Azure Monitor. It lets the application log each request and failed health probes.
    • DDoS protection: It has built-in layer 3-4 DDoS protection.
    • Content delivery network: It positions Relecloud to use a content delivery network. The content delivery network provides site acceleration.
  • Web application firewall: Use Azure Web Application Firewall to provide centralized protection from common web exploits and vulnerabilities. Relecloud used Azure Web Application Firewall for the following reasons:

    • Global protection: It provides improved global web app protection without sacrificing performance.
    • Botnet protection: The team can monitor and configure settings to address security concerns related to botnets.
    • Parity with on-premises: The on-premises solution was running behind a web application firewall managed by IT.
    • Ease of use: Web Application Firewall integrates with Azure Front Door.
  • Configuration storage: Choose whether to add app configuration storage to your web app. Azure App Configuration is a service for centrally managing application settings and feature flags. Review App Configuration best practices to decide whether this service is a good fit for your app. Relecloud wanted to replace file-based configuration with a central configuration store that integrates with the application platform and code. They added App Configuration to the architecture for the following reasons:

    • Flexibility: It supports feature flags. Feature flags allow users to opt in and out of early preview features in a production environment without redeploying the app.
    • Supports Git pipeline: The source of truth for configuration data needed to be a Git repository. The pipeline needed to update the data in the central configuration store.
    • Supports managed identities: It supports managed identities to simplify and help secure the connection to the configuration store.
  • Secrets manager: Use Azure Key Vault if you have secrets to manage in Azure. You can incorporate Key Vault in .NET apps by using the ConfigurationBuilder object. Relecloud's on-premises web app stored secrets in code configuration files, but it's a better security practice to store secrets in a location that supports RBAC and audit controls. While managed identities are the preferred solution for connecting to Azure resources, Relecloud had application secrets they needed to manage. Relecloud used Key Vault for the following reasons:

    • Encryption: It supports encryption at rest and in transit.
    • Managed identity support: The application services can use managed identities to access the secret store.
    • Monitoring and logging: It facilitates audit access and generates alerts when stored secrets change.
    • Integration: It provides native integration with the Azure configuration store (App Configuration) and web hosting platform (App Service).
  • Storage solution: Review the Azure storage options to pick the right storage solution based on your requirements. Relecloud's on-premises web app had disk storage mounted to each web server, but the team wanted to use an external data storage solution. Relecloud chose Azure Blob Storage for the following reasons:

    • Secure access: The web app can eliminate endpoints for accessing storage exposed to the public internet with anonymous access.
    • Encryption: It encrypts data at rest and in transit.
    • Resiliency: It supports zone-redundant storage (ZRS). Zone-redundant storage replicates data synchronously across three Azure availability zones in the primary region. Each availability zone is in a separate physical location that has independent power, cooling, and networking. This configuration should make the ticketing images resilient against loss.
  • Endpoint security: Use Azure Private Link to access platform-as-a-service solutions over a private endpoint in your virtual network. Traffic between your virtual network and the service travels across the Microsoft backbone network. Relecloud chose Private Link for the following reasons:

    • Enhanced security communication: It lets the application privately access services on the Azure platform and reduces the network footprint of data stores to help protect against data leakage.
    • Minimal effort: The private endpoints support the web app platform and database platform the web app uses. Both platforms mirror existing on-premises configurations for minimal change.
  • Network security: Use Azure Firewall to control inbound and outbound traffic at the network level. Use Azure Bastion to connect to virtual machines securely without exposing RDP/SSH ports. Relecloud adopted a hub-and-spoke network topology and wanted to put shared network security services in the hub. Azure Firewall improves security by inspecting all outbound traffic from the spokes. Relecloud needed Azure Bastion for secure deployments from a jump host in the DevOps subnet.

Code guidance

To successfully move a web app to the cloud, you need to update your web app code with the Retry pattern, Circuit-Breaker pattern, and Cache-Aside design pattern.

Figure 3. Role of the design patterns.

Each design pattern provides workload design benefits that align with one or more pillars of the Well-Architected Framework. Here's an overview of the patterns you should implement:

  1. Retry pattern: The Retry pattern handles transient failures by retrying operations that might fail intermittently. Implement this pattern on all outbound calls to other Azure services.

  2. Circuit Breaker pattern: The Circuit Breaker pattern prevents an application from retrying operations whose failures aren't transient. Implement this pattern in all outbound calls to other Azure services.

  3. Cache-Aside pattern: The Cache-Aside pattern loads data into a cache on demand so that the application reads from the cache more often than from the datastore. Implement this pattern on requests to the database.

| Design pattern | Reliability (RE) | Security (SE) | Cost Optimization (CO) | Operational Excellence (OE) | Performance Efficiency (PE) | Supporting WAF principles |
|---|---|---|---|---|---|---|
| Retry pattern | ✔ | | | | | RE:07 |
| Circuit Breaker pattern | ✔ | | | | ✔ | RE:03, RE:07, PE:07, PE:11 |
| Cache-Aside pattern | ✔ | | | | ✔ | RE:05, PE:08, PE:12 |

Implement the Retry pattern

Add the Retry pattern to your application code to address temporary service disruptions. These disruptions are called transient faults. Transient faults usually resolve themselves within seconds. The Retry pattern allows you to resend failed requests. It also allows you to configure the request delays and the number of attempts before failure is conceded.

  • Use built-in retry mechanisms. Most Azure services provide a built-in retry mechanism. Use it to expedite the implementation. For example, the reference implementation uses the connection resiliency in Entity Framework Core to apply the Retry pattern in requests to Azure SQL Database (see the following code).

    services.AddDbContextPool<ConcertDataContext>(options => options.UseSqlServer(sqlDatabaseConnectionString,
        sqlServerOptionsAction: sqlOptions =>
        {
            sqlOptions.EnableRetryOnFailure(
            maxRetryCount: 5,
            maxRetryDelay: TimeSpan.FromSeconds(3),
            errorNumbersToAdd: null);
        }));
    
  • Use retry programming libraries. For HTTP communications, integrate a standard resilience library such as Polly or Microsoft.Extensions.Http.Resilience. These libraries offer comprehensive retry mechanisms that are crucial for managing communications with external web services. For example, the reference implementation uses Polly to enforce the Retry pattern every time the code constructs an object that calls the IConcertSearchService object (see the following code).

    private void AddConcertSearchService(IServiceCollection services)
    {
        var baseUri = Configuration["App:RelecloudApi:BaseUri"];
        if (string.IsNullOrWhiteSpace(baseUri))
        {
            services.AddScoped<IConcertSearchService, MockConcertSearchService>();
        }
        else
        {
            services.AddHttpClient<IConcertSearchService, RelecloudApiConcertSearchService>(httpClient =>
            {
                httpClient.BaseAddress = new Uri(baseUri);
                httpClient.DefaultRequestHeaders.Add(HeaderNames.Accept, "application/json");
                httpClient.DefaultRequestHeaders.Add(HeaderNames.UserAgent, "Relecloud.Web");
            })
            .AddPolicyHandler(GetRetryPolicy())
            .AddPolicyHandler(GetCircuitBreakerPolicy());
        }
    }
    
    private static IAsyncPolicy<HttpResponseMessage> GetRetryPolicy()
    {
        var delay = Backoff.DecorrelatedJitterBackoffV2(TimeSpan.FromMilliseconds(500), retryCount: 3);
        return HttpPolicyExtensions
          .HandleTransientHttpError()
          .OrResult(msg => msg.StatusCode == System.Net.HttpStatusCode.NotFound)
          .WaitAndRetryAsync(delay);
    }
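
To show what a retry policy does under the hood, here's a minimal, library-free sketch of retrying with exponential backoff and jitter. It isn't the reference implementation's code and it doesn't use Polly. The RetryAsync helper, its parameters, and the simulated dependency are hypothetical names introduced for illustration.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Retries an operation that throws HttpRequestException, waiting longer after each failure.
// maxAttempts and baseDelay are illustrative parameters (assumptions, not recommended values).
static async Task<T> RetryAsync<T>(Func<Task<T>> operation, int maxAttempts, TimeSpan baseDelay)
{
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            return await operation();
        }
        catch (HttpRequestException) when (attempt < maxAttempts)
        {
            // Exponential backoff (baseDelay, 2x, 4x, ...) plus random jitter so that
            // many clients don't retry at the same instant.
            double backoffMs = baseDelay.TotalMilliseconds * Math.Pow(2, attempt - 1);
            double jitterMs = Random.Shared.NextDouble() * baseDelay.TotalMilliseconds;
            await Task.Delay(TimeSpan.FromMilliseconds(backoffMs + jitterMs));
        }
        // On the final attempt the filter is false, so the exception propagates to the caller.
    }
}

// Usage: a simulated dependency that fails twice and then succeeds.
int calls = 0;
string result = await RetryAsync(async () =>
{
    calls++;
    await Task.Yield();
    if (calls < 3) throw new HttpRequestException("Simulated transient fault");
    return "ok";
}, maxAttempts: 5, baseDelay: TimeSpan.FromMilliseconds(10));

Console.WriteLine($"Succeeded after {calls} calls: {result}");
```

In production code, prefer a maintained resilience library over a hand-rolled loop. The sketch only makes the two configurable parts visible: the delay between attempts and the number of attempts before the failure is surfaced.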
    

Implement the Circuit Breaker pattern

Use the Circuit Breaker pattern to handle service disruptions that aren't transient faults. The Circuit Breaker pattern prevents an application from continuously attempting to access a nonresponsive service. It releases the application and avoids wasting CPU cycles so the application retains its performance integrity for end users.

For example, the reference implementation applies the Circuit Breaker pattern on all requests to the API. It uses the HandleTransientHttpError logic to detect failed HTTP requests, and it opens the circuit for 30 seconds after five consecutive handled failures so that the app stops sending requests to the failing API (see the following code).

private static IAsyncPolicy<HttpResponseMessage> GetCircuitBreakerPolicy()
{
    return HttpPolicyExtensions
        .HandleTransientHttpError()
        .OrResult(msg => msg.StatusCode == System.Net.HttpStatusCode.NotFound)
        .CircuitBreakerAsync(5, TimeSpan.FromSeconds(30));
}
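
The following library-free sketch illustrates the state transitions a circuit breaker goes through: closed, open, and a half-open trial call. It isn't Polly's implementation. The CreateCircuitBreaker helper, its parameters, and the injected clock are hypothetical, and the sketch isn't thread-safe.

```csharp
using System;

// Returns a wrapper that runs an action through a minimal circuit breaker.
// failureThreshold, breakDuration, and clock are illustrative assumptions.
static Func<Func<T>, T> CreateCircuitBreaker<T>(
    int failureThreshold, TimeSpan breakDuration, Func<DateTimeOffset> clock)
{
    int consecutiveFailures = 0;
    DateTimeOffset? openedAt = null;

    return action =>
    {
        // Open state: fail fast until the break duration elapses.
        if (openedAt is DateTimeOffset opened && clock() - opened < breakDuration)
        {
            throw new InvalidOperationException("Circuit is open; call rejected.");
        }

        try
        {
            // Closed state, or a half-open trial call after the break duration.
            T result = action();
            consecutiveFailures = 0;
            openedAt = null;
            return result;
        }
        catch (Exception)
        {
            // Trip (or re-trip) the breaker once failures reach the threshold.
            if (++consecutiveFailures >= failureThreshold)
            {
                openedAt = clock();
            }
            throw;
        }
    };
}

// Usage: two failures open the circuit, so the third call is rejected without running.
var now = DateTimeOffset.UtcNow;
var breaker = CreateCircuitBreaker<string>(2, TimeSpan.FromSeconds(30), () => now);
Func<string> failing = () => throw new TimeoutException("Simulated outage");

for (int i = 0; i < 3; i++)
{
    try { breaker(failing); }
    catch (Exception ex) { Console.WriteLine($"Call {i + 1}: {ex.GetType().Name}"); }
}
```

The first two calls surface the TimeoutException from the dependency. The third surfaces an InvalidOperationException from the breaker itself, which is the fail-fast behavior that keeps the app from spending resources on a dependency that's known to be down.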

Implement the Cache-Aside pattern

Add the Cache-Aside pattern to your web app to improve in-memory data management. The pattern assigns the application the responsibility of handling data requests and ensuring consistency between the cache and a persistent storage, such as a database. It shortens response times, enhances throughput, and reduces the need for more scaling. It also reduces the load on the primary datastore, improving reliability and cost optimization. To implement the Cache-Aside pattern, follow these recommendations:

  • Configure the application to use a cache. Production apps should use the Distributed Redis Cache because it improves performance by reducing database queries and it enables nonsticky sessions so that the load balancer can evenly distribute traffic. For example, the reference implementation uses distributed Redis cache. The AddAzureCacheForRedis method configures the application to use Azure Cache for Redis (see the following code).

    private void AddAzureCacheForRedis(IServiceCollection services)
    {
        if (!string.IsNullOrWhiteSpace(Configuration["App:RedisCache:ConnectionString"]))
        {
            services.AddStackExchangeRedisCache(options =>
            {
                options.Configuration = Configuration["App:RedisCache:ConnectionString"];
            });
        }
        else
        {
            services.AddDistributedMemoryCache();
        }
    }
    
  • Cache high-need data. Apply the Cache-Aside pattern on high-need data to amplify its effectiveness. Use Azure Monitor to track the CPU, memory, and storage of the database. These metrics help you determine whether you can use a smaller database SKU after applying the Cache-Aside pattern. For example, the reference implementation caches high-need data that supports the Upcoming Concerts page. The GetUpcomingConcertsAsync method pulls data into the Redis cache from the SQL Database and populates the cache with the latest concerts data (see the following code).

    public async Task<ICollection<Concert>> GetUpcomingConcertsAsync(int count)
    {
        IList<Concert>? concerts;
        var concertsJson = await this.cache.GetStringAsync(CacheKeys.UpcomingConcerts);
        if (concertsJson != null)
        {
            // There is cached data. Deserialize the JSON data.
            concerts = JsonSerializer.Deserialize<IList<Concert>>(concertsJson);
        }
        else
        {
            // There's nothing in the cache. Retrieve data 
            // from the repository and cache it for one hour.
            concerts = await this.database.Concerts.AsNoTracking()
                .Where(c => c.StartTime > DateTimeOffset.UtcNow && c.IsVisible)
                .OrderBy(c => c.StartTime)
                .Take(count)
                .ToListAsync();
            concertsJson = JsonSerializer.Serialize(concerts);
            var cacheOptions = new DistributedCacheEntryOptions {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromHours(1)
            };
            await this.cache.SetStringAsync(CacheKeys.UpcomingConcerts, concertsJson, cacheOptions);
        }
        return concerts ?? new List<Concert>();
    }
    
  • Keep cache data fresh. Schedule regular cache updates to sync with the latest database changes. Determine the optimal refresh rate based on data volatility and user needs. This practice ensures the application uses the Cache-Aside pattern to provide both rapid access and current information. For example, the reference implementation caches data only for one hour and uses the CreateConcertAsync method to clear the cache key when the data changes (see the following code).

    public async Task<CreateResult> CreateConcertAsync(Concert newConcert)
    {
        database.Add(newConcert);
        await this.database.SaveChangesAsync();
        this.cache.Remove(CacheKeys.UpcomingConcerts);
        return CreateResult.SuccessResult(newConcert.Id);
    }
    
  • Ensure data consistency. Implement mechanisms to update the cache immediately after any database write operation. Use event-driven updates or dedicated data management classes to ensure cache coherence. Consistently synchronizing the cache with database modifications is central to the Cache-Aside pattern. For example, the reference implementation uses the UpdateConcertAsync method to keep the data in the cache consistent (see the following code).

    public async Task<UpdateResult> UpdateConcertAsync(Concert existingConcert)
    {
        database.Update(existingConcert);
        await database.SaveChangesAsync();
        // Invalidate the cached list so the next read reloads it from the database.
        this.cache.Remove(CacheKeys.UpcomingConcerts);
        return UpdateResult.SuccessResult();
    }
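
The read and invalidate steps described in the recommendations above can be sketched end to end without any Azure dependency. This is a simplified illustration, not the reference implementation: an in-memory dictionary stands in for the distributed cache, and LoadFromDatabaseAsync, GetAsync, and Invalidate are hypothetical helpers.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// In-memory stand-in for a distributed cache: key -> (value, expiry time).
var cache = new Dictionary<string, (string Value, DateTimeOffset ExpiresAt)>();
int databaseReads = 0;

// Hypothetical slow data source standing in for a database query.
async Task<string> LoadFromDatabaseAsync(string key)
{
    databaseReads++;
    await Task.Delay(10);
    return $"value-for-{key}";
}

// Cache-aside read: try the cache first, fall back to the data source on a miss,
// then store the result with an absolute expiration.
async Task<string> GetAsync(string key, TimeSpan timeToLive)
{
    if (cache.TryGetValue(key, out var entry) && entry.ExpiresAt > DateTimeOffset.UtcNow)
    {
        return entry.Value;
    }

    string value = await LoadFromDatabaseAsync(key);
    cache[key] = (value, DateTimeOffset.UtcNow.Add(timeToLive));
    return value;
}

// Cache-aside write: update the data source (omitted here), then invalidate the key.
void Invalidate(string key) => cache.Remove(key);

await GetAsync("upcoming-concerts", TimeSpan.FromHours(1)); // miss: reads the database
await GetAsync("upcoming-concerts", TimeSpan.FromHours(1)); // hit: served from the cache
Invalidate("upcoming-concerts");                            // simulate a write
await GetAsync("upcoming-concerts", TimeSpan.FromHours(1)); // miss again after invalidation

Console.WriteLine($"Database reads: {databaseReads}"); // prints "Database reads: 2"
```

Three reads result in only two database queries. The expiration bounds how stale an entry can get if an invalidation is missed, and the invalidation keeps reads current right after a write.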
    

Configuration guidance

The following sections provide guidance on implementing the configuration updates. Each section aligns with one or more pillars of the Well-Architected Framework.

| Configuration | Reliability (RE) | Security (SE) | Cost Optimization (CO) | Operational Excellence (OE) | Performance Efficiency (PE) | Supporting WAF principles |
|---|---|---|---|---|---|---|
| Configure user authentication & authorization | | ✔ | | ✔ | | SE:05, OE:10 |
| Implement managed identities | | ✔ | | ✔ | | SE:05, OE:10 |
| Right size environments | | | ✔ | | | CO:05, CO:06 |
| Implement autoscaling | ✔ | | ✔ | | ✔ | RE:06, CO:12, PE:05 |
| Automate resource deployment | | | | ✔ | | OE:05 |
| Implement monitoring | | | | ✔ | ✔ | OE:07, PE:04 |

Configure user authentication and authorization

When you migrate web applications to Azure, configure user authentication and authorization mechanisms. Follow these recommendations:

  • Use an identity platform. Use the Microsoft identity platform to set up web app authentication. This platform supports both single-tenant and multi-tenant applications, allowing users to sign in with their Microsoft identities or social accounts.

  • Create an app registration. Microsoft Entra ID requires an application registration in the primary tenant. The application registration ensures the users that get access to the web app have identities in the primary tenant.

  • Use platform features. Minimize the need for custom authentication code by using platform capabilities to authenticate users and access data. For example, App Service provides built-in authentication support, so you can sign in users and access data by writing minimal or no code in your web app.

  • Enforce authorization in the application. Use role-based access controls (RBAC) to assign least privileges to application roles. Define specific roles for different user actions to avoid overlap and ensure clarity. Map users to the appropriate roles and ensure they only have access to necessary resources and actions.

  • Prefer temporary access to storage. Use temporary permissions, such as shared access signatures (SASs), to safeguard against unauthorized access and breaches. Use User Delegation SASs to maximize security when granting temporary access. It's the only SAS that uses Microsoft Entra ID credentials and doesn't require a permanent storage account key.

  • Enforce authorization in Azure. Use Azure RBAC to assign least privileges to user identities. Azure RBAC determines what Azure resources identities can access, what they can do with those resources, and what areas they have access to.

  • Avoid permanent elevated permissions. Use Microsoft Entra Privileged Identity Management to grant just-in-time access for privileged operations. For example, developers often need administrator-level access to create/delete databases, modify table schemas, and change user permissions. With just-in-time access, user identities receive temporary permissions to perform privileged tasks.

Implement managed identities

Use managed identities for all Azure services that support them. A managed identity allows Azure resources (workload identities) to authenticate to and interact with other Azure services without managing credentials. Hybrid and legacy systems can keep on-premises authentication solutions to simplify the migration but should transition to managed identities as soon as possible. To implement managed identities, follow these recommendations:

  • Pick the right type of managed identity. Prefer user-assigned managed identities when you have two or more Azure resources that need the same set of permissions. This setup is more efficient than creating system-assigned managed identities for each of those resources and assigning the same permissions to all of them. Otherwise, use system-assigned managed identities.

  • Configure least privileges. Use Azure RBAC to grant only the permissions that are critical for the operations, such as CRUD actions in databases or accessing secrets. Workload identity permissions are persistent, so you can't provide just-in-time or short-term permissions to workload identities. If Azure RBAC doesn't cover a specific scenario, supplement Azure RBAC with Azure-service level access policies.

  • Secure remaining secrets. Store any remaining secrets in Azure Key Vault. Load secrets from Key Vault at application startup instead of during each HTTP request. High-frequency access within HTTP requests can exceed Key Vault transaction limits. Store application configurations in Azure App Configuration.

For example, the reference implementation uses the Authentication argument in the SQL database connection string so App Service can connect to the SQL database with a managed identity: Server=tcp:my-sql-server.database.windows.net,1433;Initial Catalog=my-sql-database;Authentication=Active Directory Default. It uses the DefaultAzureCredential to allow the web API to connect to Key Vault using a managed identity (see the following code).

    builder.Configuration.AddAzureAppConfiguration(options =>
    {
         options
            .Connect(new Uri(builder.Configuration["Api:AppConfig:Uri"]), new DefaultAzureCredential())
            .ConfigureKeyVault(kv =>
            {
                // Some of the values coming from Azure App Configuration
                // are stored in Key Vault. Use the managed identity
                // of this host for the authentication.
                kv.SetCredential(new DefaultAzureCredential());
            });
    });
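
The recommendation to load secrets at startup rather than during each HTTP request can be sketched without any Azure dependency. The following is a minimal illustration, not the reference implementation's code: CountingSecretStore and StartupSecrets are hypothetical names, and the store is an in-memory stand-in for Key Vault that only counts how often it's read.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a secret store such as Key Vault. It counts
// how many times it is read so the effect of caching is visible.
public sealed class CountingSecretStore
{
    public int Calls { get; private set; }

    public string GetSecret(string name)
    {
        Calls++;
        return $"value-of-{name}";
    }
}

// Hypothetical helper: reads the named secrets once, at construction
// (application startup), and serves every later lookup from memory.
public sealed class StartupSecrets
{
    private readonly IReadOnlyDictionary<string, string> _values;

    public StartupSecrets(CountingSecretStore store, IEnumerable<string> names)
    {
        var values = new Dictionary<string, string>();
        foreach (var name in names)
        {
            values[name] = store.GetSecret(name);
        }
        _values = values;
    }

    // Called from request handlers; never touches the secret store.
    public string Get(string name) => _values[name];
}
```

If StartupSecrets is constructed once during startup and shared (for example, as a singleton), any number of requests that call Get result in a single read per secret, which keeps the request path clear of Key Vault transaction limits.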

Right-size environments

Use the performance tiers (SKUs) of Azure services that meet the needs of each environment without excess. To right-size your environments, follow these recommendations:

  • Estimate costs. Use the Azure pricing calculator to estimate the cost of each environment.

  • Cost-optimize production environments. Production environments need SKUs that meet the service level agreements (SLA), features, and scale needed for production. Continuously monitor resource usage and adjust SKUs to align with actual performance needs.

  • Cost-optimize preproduction environments. Preproduction environments should use lower-cost resources, disable unneeded services, and apply discounts such as Azure Dev/Test pricing. Ensure preproduction environments are sufficiently similar to production to avoid introducing risks. This balance ensures that testing remains effective without incurring unnecessary costs.

  • Define SKUs using infrastructure as code (IaC). Implement IaC to dynamically select and deploy the correct SKUs based on the environment. This approach enhances consistency and simplifies management.

For example, the reference implementation uses Bicep parameters to deploy more expensive tiers (SKUs) to the production environment.

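    // Production uses the Standard tier; other environments use Basic.
    var redisCacheSkuName = isProd ? 'Standard' : 'Basic'
    // Both tiers belong to the C family, so the value is the same either way.
    var redisCacheFamilyName = isProd ? 'C' : 'C'
    // Capacity 1 is a C1 cache (1 GB); capacity 0 is a C0 cache (250 MB).
    var redisCacheCapacity = isProd ? 1 : 0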

Implement autoscaling

Autoscaling ensures that a web app remains resilient, responsive, and capable of handling dynamic workloads efficiently. To implement autoscaling, follow these recommendations:

  • Automate scale-out. Use Azure autoscale to automate horizontal scaling in production environments. Configure autoscaling rules to scale out based on key performance metrics, so your application can handle varying loads.

  • Refine scaling triggers. Begin with CPU utilization as your initial scaling trigger if you're unfamiliar with your application’s scaling requirements. Refine your scaling triggers to include other metrics such as RAM, network throughput, and disk I/O. The goal is to match your web application's behavior for better performance.

  • Provide a scale-out buffer. Set your scaling thresholds to trigger before reaching maximum capacity. For example, configure scaling to occur at 85% CPU utilization rather than waiting until it reaches 100%. This proactive approach helps maintain performance and avoid potential bottlenecks.

Automate resource deployment

Use automation to deploy and update Azure resources and code across all environments. Follow these recommendations:

  • Use infrastructure as code. Deploy infrastructure as code through continuous integration and continuous delivery (CI/CD) pipelines. Azure has premade Bicep, ARM (JSON), and Terraform templates for every Azure resource.

  • Use a continuous integration/continuous deployment (CI/CD) pipeline. Use a CI/CD pipeline to deploy code from source control to your various environments, such as test, staging, and production. Use Azure Pipelines if you work in Azure DevOps, or GitHub Actions if your projects are hosted on GitHub.

  • Integrate unit testing. Run all unit tests in your pipeline and require them to pass before any deployment to App Service. Incorporate code quality and coverage tools like SonarQube to measure and improve test coverage.

  • Adopt a mocking framework. For tests that involve external endpoints, use a mocking framework to create simulated endpoints. Simulated endpoints eliminate the need to configure real external endpoints and ensure uniform testing conditions across environments.

  • Perform security scans. Employ static application security testing (SAST) to find security flaws and coding errors in your source code. Additionally, conduct software composition analysis (SCA) to examine third-party libraries and components for security risks. Tools for these analyses are readily integrated into both GitHub and Azure DevOps.
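
As a minimal sketch of simulating an external endpoint in a unit test, the following example swaps the network layer of HttpClient for a canned response. It uses a hand-written stub rather than a mocking library so that it has no package dependencies; a mocking framework would generate the equivalent test double for you. StubHttpMessageHandler, TicketImageClient, and the tickets/{id}/status route are hypothetical names chosen for illustration and aren't part of the reference implementation.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Test double that returns a canned response instead of calling the network.
public sealed class StubHttpMessageHandler : HttpMessageHandler
{
    private readonly HttpStatusCode _status;
    private readonly string _body;

    public StubHttpMessageHandler(HttpStatusCode status, string body)
    {
        _status = status;
        _body = body;
    }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = new HttpResponseMessage(_status)
        {
            Content = new StringContent(_body),
            RequestMessage = request
        };
        return Task.FromResult(response);
    }
}

// Hypothetical client for an external endpoint. It receives its HttpClient
// through the constructor, so a test can inject the stub handler.
public sealed class TicketImageClient
{
    private readonly HttpClient _http;

    public TicketImageClient(HttpClient http) => _http = http;

    public async Task<string> GetStatusAsync(int ticketId)
    {
        using var response = await _http.GetAsync($"tickets/{ticketId}/status");
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}
```

In a test, construct an HttpClient over the stub, set BaseAddress to a placeholder URI, and pass it to TicketImageClient. The same test then behaves identically on a developer machine and in the pipeline because no real endpoint is contacted.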

Implement monitoring

Implement application and platform monitoring to enhance the operational excellence and performance efficiency of your web app. To implement monitoring, follow these recommendations:

  • Collect application telemetry. Use autoinstrumentation in Azure Application Insights to collect application telemetry, such as request throughput, average request duration, errors, and dependency monitoring, with no code changes.

    The reference implementation uses AddApplicationInsightsTelemetry from the NuGet package Microsoft.ApplicationInsights.AspNetCore to enable telemetry collection (see the following code).

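    public void ConfigureServices(IServiceCollection services)
    {
       ...
       // Pass the connection string through the options callback. The overload
       // that takes a bare string expects an instrumentation key.
       services.AddApplicationInsightsTelemetry(options =>
           options.ConnectionString = Configuration["App:Api:ApplicationInsights:ConnectionString"]);
       ...
    }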
    
  • Create custom application metrics. Use code-based instrumentation for custom application telemetry. Add the Application Insights SDK to your code and use the Application Insights API.

    The reference implementation gathers telemetry on events related to cart activity. this.telemetryClient.TrackEvent counts the tickets added to the cart. It supplies the event name (AddToCart) and specifies a dictionary that has the concertId and count (see the following code).

    this.telemetryClient.TrackEvent("AddToCart", new Dictionary<string, string> {
        { "ConcertId", concertId.ToString() },
        { "Count", count.ToString() }
    });
    
  • Monitor the platform. Enable diagnostic settings for every service that supports them, and send the diagnostics to the same destination as the application logs so you can correlate them. Azure services create platform logs automatically but store them only when you enable diagnostics.

Deploy the reference implementation

The reference implementation guides developers through a simulated migration from an on-premises ASP.NET application to Azure, highlighting necessary changes during the initial adoption phase. This example uses a concert-ticketing application for the fictional company Relecloud, which sells tickets through its on-premises web application. Relecloud set the following goals for their web application:

  • Implement low-cost, high-value code changes
  • Achieve a service level objective (SLO) of 99.9%
  • Adopt DevOps practices
  • Create cost-optimized environments
  • Enhance reliability and security

Relecloud determined that their on-premises infrastructure wasn't a cost-effective solution to meet these goals. They decided that migrating their web application to Azure was the most cost-effective way to achieve their immediate and future goals. The following architecture represents the end-state of Relecloud's Reliable Web App pattern implementation.

Figure 3. Architecture of the reference implementation. Download a Visio file of this architecture.