Summary

Here are some of the key points presented in this module about elasticity:

  • VMs and other cloud resources rarely experience constant loads. Instead, they experience variable loads -- sometimes varying by an order of magnitude or more over time.
  • Sizing compute capacity to fit peak loads ensures quality of service (QoS) but results in increased costs and energy usage.
  • Elasticity refers to the ability to add resources to handle higher loads and remove resources when the load decreases.
  • Elasticity is achieved in the cloud by scaling resources such as VMs and databases.
  • Scaling in and out (horizontal scaling) refers to increasing and decreasing the number of resources devoted to a task -- for example, increasing the number of VMs serving website users from 10 to 15.
  • Scaling up and down (vertical scaling) refers to replacing existing resources with more or less powerful ones -- for example, replacing a web-server VM containing 2 cores and 4 GB of RAM with one containing 4 cores and 8 GB of RAM.
  • Scaling resources to match demand keeps resource utilization relatively constant, lowers costs, and reduces energy consumption.
  • Autoscaling allows scaling to occur based on rules or policies established by a cloud administrator. The rules or policies can be time-based, metrics-based, or both. An example of metrics-based autoscaling is bringing additional instances online when average CPU utilization reaches a predetermined threshold, such as 70% (a decision loop implementing this kind of rule is sketched after this list).
  • Time-based autoscaling, also known as scheduled autoscaling, is most appropriate when loads are cyclical and predictable.
  • Metrics-based autoscaling can handle both predictable and unpredictable loads.
  • Effective load balancing is crucial to implementing scalable cloud services.
  • Load balancers use different kinds of algorithms to distribute load, including round-robin and hash-based algorithms (both are sketched after this list).
  • Some load balancers attempt to dispatch requests more intelligently by using metrics such as request-execution time and CPU utilization at each node.
  • Load balancers also increase availability by monitoring the health of back-end resources and recognizing when those resources aren't available (a health-probe sketch appears after this list).
  • Because a single load balancer represents a single point of failure, load balancers are often deployed in pairs.
  • Serverless computing offers benefits that include consumption-based pricing, automatic scalability, and reduced administrative costs.
  • One example of serverless computing is serverless functions, which let you upload code to the cloud and define when it executes (a minimal handler is sketched after this list).
  • Another example is serverless workflows, which let you define business workflows (typically using graphical designers and without writing code) and specify when they execute.
  • Serverless computing also extends to databases, which scale to meet the demand placed on them.
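
The metrics-based autoscaling rule described above can be sketched as a simple decision loop. The threshold values, instance limits, and function name below are illustrative assumptions, not any particular cloud provider's API:

```python
import statistics

# Illustrative policy values -- real autoscalers (for example, AWS Auto Scaling
# or Azure virtual machine scale sets) expose equivalents as rule settings.
SCALE_OUT_THRESHOLD = 0.70  # add an instance when average CPU reaches 70%
SCALE_IN_THRESHOLD = 0.30   # remove an instance when average CPU falls below 30%
MIN_INSTANCES = 2
MAX_INSTANCES = 10

def desired_instance_count(cpu_utilizations: list[float], current: int) -> int:
    """Return the target instance count given per-instance CPU utilization
    samples expressed as fractions (0.0-1.0)."""
    average = statistics.mean(cpu_utilizations)
    if average >= SCALE_OUT_THRESHOLD and current < MAX_INSTANCES:
        return current + 1  # scale out
    if average <= SCALE_IN_THRESHOLD and current > MIN_INSTANCES:
        return current - 1  # scale in
    return current          # utilization is within bounds; hold steady

# Average CPU of 78% across four instances triggers a scale-out to five.
print(desired_instance_count([0.81, 0.75, 0.80, 0.76], current=4))  # 5
```

A time-based (scheduled) rule would instead compare the current time against a schedule rather than a metric, which is why it suits cyclical, predictable loads.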
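The round-robin and hash-based distribution algorithms mentioned in the load-balancing bullets can also be sketched in a few lines. The back-end addresses and the choice of SHA-256 as the hash are assumptions made for illustration:

```python
import hashlib
from itertools import cycle

BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical back-end nodes

# Round-robin: hand each request to the next back end in turn.
_rotation = cycle(BACKENDS)

def round_robin() -> str:
    return next(_rotation)

# Hash-based: hash a stable client key (such as a client IP or session ID)
# so the same client consistently lands on the same back end.
def hash_based(client_key: str) -> str:
    digest = hashlib.sha256(client_key.encode()).digest()
    return BACKENDS[int.from_bytes(digest[:4], "big") % len(BACKENDS)]

print(round_robin())              # 10.0.0.1, then 10.0.0.2 on the next call
print(hash_based("203.0.113.7"))  # the same node every time for this client
```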
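Health monitoring, which lets a load balancer stop dispatching requests to unavailable back ends, can be sketched as a periodic probe. The /health endpoint and two-second timeout are assumptions; real load balancers make the probe path, interval, and failure thresholds configurable:

```python
import urllib.request

def is_healthy(backend: str, timeout: float = 2.0) -> bool:
    """Probe an assumed /health endpoint; any error or timeout counts as unhealthy."""
    try:
        with urllib.request.urlopen(f"http://{backend}/health", timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers connection failures, timeouts, and HTTP errors
        return False

def healthy_backends(backends: list[str]) -> list[str]:
    """Keep only the back ends that passed their most recent probe."""
    return [b for b in backends if is_healthy(b)]
```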
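Finally, a serverless function is, at minimum, a handler you upload plus a trigger that defines when it runs. The sketch below follows the handler signature used by AWS Lambda's Python runtime; the event shape is an assumption, and other platforms (Azure Functions, Google Cloud Functions) use different signatures:

```python
# Runs only when its trigger fires (an HTTP request, a queue message, a timer);
# the platform provisions and scales the underlying compute automatically.
def handler(event, context):
    name = event.get("name", "world")  # assumed event field, for illustration
    return {"statusCode": 200, "body": f"Hello, {name}!"}
```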