Summary


Here are some of the key points presented in this module about elasticity:

VMs and other cloud resources rarely experience constant loads. Instead, they experience variable loads -- sometimes loads that vary by an order of magnitude or more over time.
Sizing compute capacity to fit peak loads ensures quality of service (QoS) but results in increased costs and energy usage.
Elasticity refers to the ability to add resources to handle higher loads and remove resources when the load decreases.
Elasticity is achieved in the cloud by scaling resources such as VMs and databases.
Scaling in and out (horizontal scaling) refers to increasing and decreasing the number of resources devoted to a task -- for example, increasing the number of VMs serving website users from 10 to 15.
Scaling up and down (vertical scaling) refers to replacing existing resources with more or less powerful ones -- for example, replacing a web-server VM containing 2 cores and 4 GB of RAM with one containing 4 cores and 8 GB of RAM.
Scaling resources to match demand keeps resource utilization relatively constant, lowers costs, and reduces energy usage.
Autoscaling allows scaling to occur based on rules or policies established by a cloud administrator. The rules or policies can be time-based, metrics-based, or both. An example of metrics-based autoscaling is bringing additional instances online when average CPU utilization reaches a predetermined threshold such as 70%.
Time-based autoscaling, also known as scheduled autoscaling, is most appropriate when loads are cyclical and predictable.
Metrics-based autoscaling can handle both predictable and unpredictable loads.
Effective load balancing is crucial to implementing scalable cloud services.
Load balancers use different kinds of algorithms to distribute load, including round-robin and hash-based algorithms.
Some load balancers attempt to dispatch requests more intelligently by using metrics such as request-execution time and CPU utilization at each node.
Load balancers also increase availability by monitoring the health of back-end resources and recognizing when those resources aren't available.
Because a single load balancer represents a single point of failure, load balancers are often deployed in pairs.
Serverless computing offers benefits that include consumption-based pricing, automatic scalability, and reduced administrative costs.
One example of serverless computing is serverless functions, which let you upload code to the cloud and define when it executes.
Another example is serverless workflows, which let you define business workflows (typically using graphical designers and without writing code) and specify when they execute.
Serverless computing also extends to databases, which scale to meet the demand placed on them.
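
To make the autoscaling points above more concrete, here is a minimal sketch of how a metrics-based rule and a time-based (scheduled) rule might be combined into a single scaling decision. It is not the API of any particular cloud platform. The names MetricsRule, ScheduleRule, and desired_instances are hypothetical, as are the 30% scale-in threshold, the step size, and the instance limits. Only the 70% scale-out threshold comes from the example in the key points.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class MetricsRule:
    """Hypothetical metrics-based rule driven by average CPU utilization."""
    scale_out_above: float = 70.0  # percent; the 70% example from the key points
    scale_in_below: float = 30.0   # percent; assumed value for illustration
    step: int = 1                  # instances added or removed per decision


@dataclass
class ScheduleRule:
    """Hypothetical time-based rule: a minimum instance count in a daily window."""
    start: time
    end: time
    min_instances: int


def desired_instances(current, cpu_samples, now, metrics, schedule,
                      floor=1, ceiling=20):
    """Return the instance count the rules ask for, given recent CPU samples."""
    avg_cpu = sum(cpu_samples) / len(cpu_samples)

    # Metrics-based part: scale out (horizontally) under load, scale in when idle.
    target = current
    if avg_cpu >= metrics.scale_out_above:
        target = current + metrics.step
    elif avg_cpu <= metrics.scale_in_below:
        target = current - metrics.step

    # Time-based part: inside the scheduled window, never drop below the minimum.
    if schedule.start <= now < schedule.end:
        target = max(target, schedule.min_instances)

    # Keep the result within the administrator's overall limits.
    return max(floor, min(ceiling, target))
```

For example, with 10 instances running and CPU samples averaging 75%, the metrics rule asks for 11 instances. During a scheduled window with a minimum of 12, the same call would return 12 even if CPU utilization were moderate. Real autoscalers typically add cool-down periods between decisions, which this sketch leaves out.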
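
The load-balancing points above can be illustrated in a similar way. The following sketch shows round-robin and hash-based distribution over a pool of back-end nodes, skipping nodes that a health check has marked as unavailable. The LoadBalancer class, its method names, and the use of SHA-256 on a client key are assumptions made for illustration. Production load balancers use their own algorithms and health probes.

```python
import hashlib
from itertools import cycle


class LoadBalancer:
    """Hypothetical load balancer over a fixed list of back-end names."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)   # updated by health checks
        self._ring = cycle(self.backends)   # endless iterator for round-robin

    def mark_down(self, backend):
        """Record that a health check found this back end unavailable."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Record that a known back end is available again."""
        if backend in self.backends:
            self.healthy.add(backend)

    def round_robin(self):
        """Return the next healthy back end in rotation."""
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")

    def hash_based(self, client_key):
        """Map a client key (such as an IP address) to a healthy back end."""
        pool = [b for b in self.backends if b in self.healthy]
        if not pool:
            raise RuntimeError("no healthy backends")
        digest = hashlib.sha256(client_key.encode("utf-8")).digest()
        return pool[int.from_bytes(digest[:8], "big") % len(pool)]
```

With back ends vm-a, vm-b, and vm-c, successive round_robin() calls rotate through all three. After mark_down("vm-b"), requests go only to vm-a and vm-c. The hash_based() method sends the same client key to the same back end as long as the healthy pool does not change. The simple modulo mapping used here reshuffles many clients when the pool changes, which is one reason real systems often prefer consistent hashing.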
