Amazon EC2 Auto Scaling offers better fault tolerance, better availability, and better cost management.
To automatically distribute incoming application traffic across multiple instances in your Auto Scaling group, use Elastic Load Balancing. The load balancer and its target group must be in the same Region where you create your Auto Scaling group.
To monitor basic statistics for your instances and Amazon EBS volumes, use Amazon CloudWatch.
To monitor the calls made to the Amazon EC2 Auto Scaling API for your account, use AWS CloudTrail.
Instance Lifecycle
Amazon EC2 Auto Scaling monitors the health of each Amazon EC2 instance that it launches. When it finds that an instance is unhealthy, it terminates that instance and launches a new one. If you stop or terminate a running instance, the instance is considered to be unhealthy and is replaced. If any instance terminates unexpectedly, Amazon EC2 Auto Scaling detects the termination and launches a replacement instance. This capability enables you to maintain a fixed, desired number of EC2 instances automatically.
When a scale-out event occurs, the Auto Scaling group launches the required number of EC2 instances, using its assigned launch configuration. These instances start in the Pending state. If you add a lifecycle hook to your Auto Scaling group, the instances move from the Pending state to the Pending:Wait state, and you can perform a custom action here. After you complete the lifecycle action, the instances enter the Pending:Proceed state.
When each instance is fully configured and passes the Amazon EC2 health checks, it is attached to the Auto Scaling group and it enters the InService state.
Instances remain in the InService state until one of the following occurs:
- A scale-in event occurs.
- You put the instance into a Standby state in order to troubleshoot or make changes to it.
- You detach the instance from the Auto Scaling group.
- The instance fails a required number of health checks.
When a scale-in event occurs, the Auto Scaling group uses its termination policy to determine which instances to terminate. Instances that are in the process of detaching from the Auto Scaling group and shutting down enter the Terminating state, and can't be put back into service. If you add a lifecycle hook to your Auto Scaling group, the instances move from the Terminating state to the Terminating:Wait state, and you can perform a custom action here. After you complete the lifecycle action, the instances enter the Terminating:Proceed state. Finally, the instances are completely terminated and enter the Terminated state.
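The launch and termination paths described above can be sketched as two small functions. This is only an illustrative model of the state names mentioned in these notes; the function names are hypothetical and nothing here calls an AWS API.

```python
from typing import List


def launch_path(has_lifecycle_hook: bool) -> List[str]:
    """Return the states an instance passes through on scale-out (simplified model)."""
    path = ["Pending"]
    if has_lifecycle_hook:
        # With a hook, the instance waits for the custom action to complete.
        path += ["Pending:Wait", "Pending:Proceed"]
    path.append("InService")
    return path


def terminate_path(has_lifecycle_hook: bool) -> List[str]:
    """Return the states an instance passes through on scale-in (simplified model)."""
    path = ["Terminating"]
    if has_lifecycle_hook:
        path += ["Terminating:Wait", "Terminating:Proceed"]
    path.append("Terminated")
    return path


print(launch_path(True))
print(terminate_path(False))
```

The Standby and detach transitions are left out of this sketch to keep it short.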
Scale-out events include:
- You manually increase the size of the group.
- The size of the group is increased based on demand according to a scaling policy.
- You set up scaling by schedule to increase the size of the group at a specific time.
Scale-in events include:
- You manually decrease the size of the group.
- The size of the group is decreased based on demand according to a scaling policy.
- You set up scaling by schedule to decrease the size of the group at a specific time.
When a lifecycle action occurs, and an instance enters the wait state, scaling activities due to simple scaling policies are paused.
Groups
Your EC2 instances are organized into groups so that they can be treated as a logical unit for the purposes of scaling and management. An Auto Scaling group also enables you to use Amazon EC2 Auto Scaling features such as health check replacements and scaling policies.
An Auto Scaling group starts by launching enough instances to meet its desired capacity. It maintains this number of instances by performing periodic health checks on the instances in the group. If an instance becomes unhealthy, the group terminates the unhealthy instance and launches another instance to replace it.
An Auto Scaling group can launch On-Demand Instances, Spot Instances, or both.
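The way a group keeps its desired capacity through health check replacements can be sketched roughly as follows. The `reconcile` function and the instance IDs are hypothetical, and the real service performs these steps internally; this is a minimal model, not the actual algorithm.

```python
from typing import Dict, List, Tuple


def reconcile(health: Dict[str, bool], desired_capacity: int) -> Tuple[List[str], int]:
    """Return (instances to terminate, number of instances to launch).

    `health` maps an instance ID to True (healthy) or False (unhealthy).
    """
    unhealthy = [instance_id for instance_id, ok in health.items() if not ok]
    healthy_count = len(health) - len(unhealthy)
    # Launch enough replacements to get back to the desired capacity.
    to_launch = max(0, desired_capacity - healthy_count)
    return unhealthy, to_launch


to_terminate, to_launch = reconcile({"i-1": True, "i-2": False, "i-3": True}, 3)
print(to_terminate, to_launch)
```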
Configuration Templates
Your group uses a launch template, or a launch configuration (not recommended, offers fewer features), as a configuration template for its EC2 instances.
When you create an Auto Scaling group, you must specify a launch configuration, a launch template, or an EC2 instance. When you create an Auto Scaling group using an EC2 instance, Amazon EC2 Auto Scaling automatically creates a launch configuration for you and associates it with the Auto Scaling group.
You can create a new launch template using your current launch configuration.
Advantages of Launch Templates
- Defining a launch template instead of a launch configuration allows you to have multiple versions of a template. With versioning, you can create a subset of the full set of parameters and then reuse it to create other templates or template versions.
- Not all Auto Scaling group features are available for a launch configuration. For example, you can specify both On-Demand Instances and Spot Instances for your Auto Scaling group only when you configure the group to use a launch template.
- Launch templates enable you to use newer features of Amazon EC2.
Scaling Options
Maintain a specific instance level at all times
This can be achieved by setting the same value for minimum, maximum, and desired capacity.
Manual scaling
You can specify the minimum number of instances, the maximum number of instances, and the desired capacity of the group. Auto Scaling manages the process of creating or terminating instances to maintain the updated capacity.
You can also attach / detach instances from the group manually.
If you specified multiple Availability Zones, the desired capacity is distributed across these Availability Zones.
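As a rough illustration of spreading the desired capacity across Availability Zones, the sketch below assumes an even split, with any remainder going to the first zones in the list. The function name and the zone names are illustrative, and which zone actually receives an extra instance is an assumption of this sketch.

```python
from typing import Dict, List


def distribute(desired_capacity: int, zones: List[str]) -> Dict[str, int]:
    """Split the desired capacity as evenly as possible across the given zones."""
    base, extra = divmod(desired_capacity, len(zones))
    # The first `extra` zones each get one additional instance.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


print(distribute(5, ["us-east-1a", "us-east-1b"]))
```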
Scale automatically as a function of time and date (scale by schedule).
You can create scheduled actions for scaling one time only, or for scaling on a recurring schedule.
Dynamic scaling
A scaling policy instructs Amazon EC2 Auto Scaling to track a specific CloudWatch metric, and it defines what action to take when the associated CloudWatch alarm is in ALARM.
Capacity is measured in one of two ways: in number of instances (the same units that you chose when you set the desired capacity), or in capacity units (if instance weighting is applied).
When a scaling policy is executed, if the capacity calculation produces a number outside of the minimum and maximum size range of the group, Amazon EC2 Auto Scaling ensures that the new capacity never goes outside of the minimum and maximum size limits.
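The exception is when you use instance weighting. In this case, Amazon EC2 Auto Scaling can scale out above the maximum size limit, but only by up to your maximum instance weight. For example, suppose a scaling policy adds 5 capacity units, which would bring the group exactly to its maximum size, and the available instance types have weights 1, 4, and 6. If a single instance of weight 6 is launched, 6 units are added and the group ends up 1 unit above the maximum, which is still within the maximum instance weight of 6.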
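The minimum and maximum bound on a policy's calculated capacity amounts to a simple clamp, sketched below. The function name is hypothetical, and the instance-weighting exception is deliberately not modeled.

```python
def clamp_capacity(requested: int, min_size: int, max_size: int) -> int:
    """Keep the requested capacity within the group's [min_size, max_size] range."""
    return max(min_size, min(requested, max_size))


print(clamp_capacity(12, 2, 10))  # a request above the maximum is capped at 10
print(clamp_capacity(0, 2, 10))   # a request below the minimum is raised to 2
```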
Scaling Policies
Target tracking scaling: Increase or decrease the current capacity of the group based on a target value for a specific metric (e.g. the number of SQS messages exceeds a specific number).
Step scaling: Increase or decrease the current capacity of the group based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach (e.g. CPU utilization is over 90% for a specific period of time). Step scaling can continue to respond to additional alarms, even while a scaling activity or health check replacement is in progress.
Simple scaling: Increase or decrease the current capacity of the group based on a single scaling adjustment. After a scaling activity is started, the policy must wait for the scaling activity or health check replacement to complete and the cooldown period to expire before responding to additional alarms.
Step scaling policies and simple scaling policies both require you to create CloudWatch alarms for the scaling policies. Both require you to specify the high and low thresholds for the alarms. Both require you to define whether to add or remove instances, and how many, or set the group to an exact size.
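To make the idea of step adjustments concrete, here is a minimal sketch in which the size of the alarm breach (metric value minus alarm threshold) selects the adjustment. The tuple layout `(lower, upper, change)` and the example numbers are assumptions made for illustration, not the service's data format.

```python
from typing import List, Optional, Tuple

# (lower bound, upper bound or None for "no upper bound", change in capacity)
Step = Tuple[float, Optional[float], int]


def step_adjustment(metric: float, threshold: float, steps: List[Step]) -> int:
    """Return the capacity change for the step that contains the alarm breach."""
    breach = metric - threshold
    for lower, upper, change in steps:
        if breach >= lower and (upper is None or breach < upper):
            return change
    return 0  # no step matches, so capacity is left unchanged


steps = [(0, 10, 1), (10, 20, 2), (20, None, 3)]
print(step_adjustment(85, 70, steps))  # breach of 15 falls in the second step
```

A simple scaling policy corresponds to the special case of a single step.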
Your Auto Scaling group can have more than one scaling policy. When there are multiple policies in force at the same time, there's a chance that each policy could instruct the Auto Scaling group to scale out (or in) at the same time. In these situations, Amazon EC2 Auto Scaling chooses the policy that provides the largest capacity for both scale out and scale in.
When the policies use different criteria for scaling in, AWS chooses the policy that provides the largest capacity (that is, the one that removes fewer instances).
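The rule for conflicting policies can be sketched as picking the largest proposed capacity. The function name and the numbers are illustrative only.

```python
from typing import List


def resolve_policies(proposed_capacities: List[int]) -> int:
    """Pick the capacity proposed by the policy that keeps the group largest."""
    return max(proposed_capacities)


# Scale out from 10: one policy proposes 12, another proposes 14.
print(resolve_policies([12, 14]))
# Scale in from 10: one policy proposes 8, another proposes 6.
print(resolve_policies([8, 6]))
```

In both directions the larger value wins, so a scale-in removes as few instances as any of the policies asks for.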
Use predictive scaling
Use Amazon EC2 Auto Scaling in combination with AWS Auto Scaling to maintain optimal availability and performance by combining predictive scaling and dynamic scaling.
Because capacity fluctuates independently for each instance type in an Availability Zone, you can often get more compute capacity when you have instance type flexibility.
You can use scaling policies to increase or decrease the number of instances in your group dynamically to meet changing conditions. When the scaling policy is in effect, the Auto Scaling group adjusts the desired capacity of the group, between the minimum and maximum capacity values that you specify, and launches or terminates the instances as needed.
Scaling Cooldown
When you use simple scaling, after the Auto Scaling group scales using a simple scaling policy, it waits for a cooldown period to complete before any further scaling activities due to simple scaling policies can start. A scaling cooldown helps you prevent your Auto Scaling group from launching or terminating additional instances before the effects of previous activities are visible.
During a cooldown period, when a scheduled action starts at the scheduled time, or when scaling activities due to target tracking or step scaling policies start, they can trigger a scaling activity immediately without waiting for the cooldown period to expire. When you manually scale your Auto Scaling group, the default is not to wait for the cooldown period to complete, but you can override this behavior and honor the cooldown period when you call the API.
If an instance becomes unhealthy, Amazon EC2 Auto Scaling also does not wait for the cooldown period to complete before replacing the unhealthy instance.
You can modify the default cooldown period by editing the Auto Scaling group. You can also create cooldowns that apply to a specific simple scaling policy. A scaling-specific cooldown period overrides the default cooldown period.
With multiple instances, the cooldown period (either the default cooldown or the scaling-specific cooldown) takes effect starting when the last instance finishes launching or terminating.
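The cooldown rules for simple scaling can be sketched as a single check. Times are plain seconds, and the function and parameter names are hypothetical; the sketch assumes the cooldown starts when the last instance of the previous activity finished launching or terminating.

```python
from typing import Optional


def simple_scaling_allowed(
    now: float,
    last_instance_finished_at: float,
    default_cooldown: float,
    policy_cooldown: Optional[float] = None,
) -> bool:
    """Return True if a new simple-scaling activity may start at time `now`."""
    # A scaling-specific cooldown overrides the group's default cooldown.
    cooldown = default_cooldown if policy_cooldown is None else policy_cooldown
    return now - last_instance_finished_at >= cooldown


print(simple_scaling_allowed(399, 100, 300))                      # still cooling down
print(simple_scaling_allowed(200, 100, 300, policy_cooldown=60))  # shorter policy cooldown
```

Scheduled actions, target tracking, step scaling, and health check replacements would bypass this check, as described above.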
The cooldown period normally starts after the instance moves out of the wait state (after the lifecycle hook execution is complete). However, with Elastic Load Balancing, the Auto Scaling group starts the cooldown period when the terminating instance finishes connection draining (deregistration delay) by the load balancer and does not wait for the lifecycle hook. When a scale-out with ELB happens, instances proceed through connection draining, then a lifecycle hook, and then a cooldown period.
Instance warmup period
The instance warmup period is the number of seconds that it takes for a newly launched instance to warm up (used for step scaling policies).
It prevents the group from continuing to scale out while newly launched instances are still getting ready to run.
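One way to picture the effect of the warmup period is to treat instances that are still warming up as capacity that has already been added, so a later alarm only adds the difference. This is an assumption made for illustration, with a hypothetical function name, and not a statement of the exact service behavior.

```python
def extra_instances_needed(step_change: int, warming_up: int) -> int:
    """Instances to add now, given how many earlier launches are still warming up."""
    return max(0, step_change - warming_up)


# A new breach asks for 3 instances while 1 earlier launch is still warming up.
print(extra_instances_needed(3, 1))
```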
Alarm Sustain Period
Trigger the alarm only if CPU utilization stays above 90% for longer than a specified period of time.
This avoids scaling out in response to a short CPU peak.
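The sustain-period idea can be sketched as requiring the most recent datapoints to all exceed the threshold. The function name, the one-datapoint-per-period assumption, and the sample values are illustrative.

```python
from typing import List


def alarm_sustained(datapoints: List[float], threshold: float, periods: int) -> bool:
    """Return True if the last `periods` datapoints all exceed the threshold."""
    if periods < 1 or len(datapoints) < periods:
        return False
    return all(value > threshold for value in datapoints[-periods:])


print(alarm_sustained([50, 95, 60], 90, 2))  # a single short peak does not trigger
print(alarm_sustained([91, 95, 97], 90, 3))  # a sustained breach does
```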