Autoscaling in AWS: Building highly available applications is a challenge on many fronts, and it becomes even more challenging when we need to handle multiple servers.
It is quite common for servers to fail for various reasons, such as increased traffic, memory exhaustion, and other similar issues.
This demands yet another infrastructure monitoring system that detects server failures and reports them to the DevOps team so that suitable action can be taken, such as managing memory or provisioning new replica servers.
All of this turns an already intricate architecture into an even more complex one to monitor and manage.
In this blog, we will concentrate on how AWS can be used to manage such a highly available system using its auto-scaling components, such as the Elastic Load Balancer.
We will see how Amazon EC2 instances can be configured behind an Elastic Load Balancer with different scaling policies, freeing us from the tedious process of monitoring every server instance ourselves.
Autoscaling in AWS
Amazon auto-scaling allows instances to self-heal by launching new instances automatically, and a proper implementation can help us eliminate single points of failure by effectively distributing traffic across Availability Zones.
Essentially, auto-scaling ensures that the minimum required number of EC2 instances is running the application at any point in time, without failures.
It also makes sure that, in case of increased traffic, the number of EC2 instances is scaled up to the required level so that the application runs smoothly. When the traffic surge subsides, it automatically scales the number of EC2 instances back down to the minimum.
This way, it ensures that we pay only for what we use, while at the same time maintaining the high availability of the application without any manual infrastructure management. These factors are significant for today's applications, not only in a competitive market but also considering the critical data they handle.
Now let us dive into the components of the auto-scaling features provided by AWS. The diagram below shows the typical configuration and components of an auto-scaling scenario handled by AWS.
Let us familiarize ourselves with each of the components in the above diagram:
1. Auto Scaling groups
These are Amazon EC2 instances grouped logically based on their function in the application. To be clearer, EC2 instances that have similar functions, or that are useful to group together for scaling and management, form an Auto Scaling group.
Using this logical grouping, we can write policies to automatically scale the instances in the group up or down. The number of instances can be increased according to the load on the application to improve performance, and decreased again when the load goes down.
Another major function of these Auto Scaling groups is to poll the instances under them to ensure their availability: if one of these instances goes down, the Auto Scaling group maintains the instance count by adding another instance to the group. This ensures that the minimum number of servers needed for operation is available throughout the application lifecycle.
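To make this concrete, here is a sketch of the parameters such a group needs, in the shape expected by boto3's Auto Scaling client. The group name, launch configuration name, and subnet IDs are placeholders for illustration, not values from this article.

```python
# Parameters for an Auto Scaling group, boto3-style. All names and
# sizes below are illustrative assumptions.
asg_params = {
    "AutoScalingGroupName": "web-asg",
    "LaunchConfigurationName": "web-launch-config",
    "MinSize": 2,              # never drop below 2 instances
    "MaxSize": 6,              # never scale beyond 6 instances
    "DesiredCapacity": 2,      # start at the minimum
    "HealthCheckType": "EC2",  # poll instance health via EC2 status checks
    "HealthCheckGracePeriod": 300,
    # Two subnets in different Availability Zones for high availability
    "VPCZoneIdentifier": "subnet-aaaa,subnet-bbbb",
}

# With AWS credentials configured, this would be submitted as:
# import boto3
# boto3.client("autoscaling").create_auto_scaling_group(**asg_params)
```

The desired capacity always stays within the min/max band; the group itself adjusts it as instances fail or policies fire.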
The Auto Scaling group launches new instances on demand. To serve the application, these instances should be pre-configured with the application's resources. The OS-level configuration of these instances, and the software and services to be installed on them after launch, also matter a lot.
This is configured using a launch configuration, in which we specify the type of instance to be launched, the key pair to be used, the software and services to be installed or started after boot, and so on.
This makes setting up instances much easier: in most cases we need only one launch configuration per group, and it can be reused again and again when launching new instances.
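A minimal launch-configuration sketch in boto3 parameter form might look as follows. The AMI ID, key pair name, and bootstrap script are hypothetical examples, not values from this article.

```python
# User data runs once at first boot; here we assume an Amazon
# Linux AMI and use it to install and start a web server.
user_data = """#!/bin/bash
yum install -y nginx
systemctl enable --now nginx
"""

launch_config = {
    "LaunchConfigurationName": "web-launch-config",
    "ImageId": "ami-0123456789abcdef0",  # AMI replicating the app server
    "InstanceType": "t3.micro",          # instance type to launch
    "KeyName": "my-key-pair",            # key pair for SSH access
    "UserData": user_data,               # software to install after boot
}

# With credentials configured, this would be created via:
# boto3.client("autoscaling").create_launch_configuration(**launch_config)
```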
2. Types of scaling
There are different types of scaling provided by auto-scaling to scale the instances of a group.
2.1 Maintaining a constant number of instances in the group
We can specify that at all times we need x instances in the group, and auto-scaling simply takes care of that: nothing more, nothing less.
The Auto Scaling group frequently monitors the health of every instance in the group; if an unhealthy instance is detected, it terminates that instance and replaces it with a healthy one. This essentially means that the number of instances is constant at all times.
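In boto3 parameter terms, a constant fleet is just a degenerate scaling range: pin the minimum, maximum and desired capacity to the same value and let health-check replacement keep the count there. The value 3 below is illustrative.

```python
# Constant fleet: min == max == desired, so the group can only
# replace failed instances, never grow or shrink.
constant_fleet = {
    "MinSize": 3,
    "MaxSize": 3,
    "DesiredCapacity": 3,
}
# These keys would be passed to create_auto_scaling_group along with
# the group name and launch configuration.
```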
2.2 Manual scaling
In the figure above, you can see the minimum, maximum and desired number of instances. Manual scaling lets you configure these minimum, maximum and desired values; once configured, the rest is taken care of by auto-scaling.
There are times when we can predict the traffic of an application: when it is going to be high, when it will go down, and so on. This can happen in scenarios such as an e-commerce application launching its offers, or on specific days when application traffic is known to peak.
Scheduled scaling provides a way to schedule the scaling parameters in advance so that, when the time arrives, the Auto Scaling group handles the traffic elegantly by increasing resources, and once the schedule ends, reduces the number of instances back to the original number.
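The e-commerce sale scenario above could be sketched as a pair of scheduled actions in boto3 parameter form. The group name, dates, and sizes are assumed for illustration.

```python
# Raise capacity ahead of a known sale day...
scale_up_for_sale = {
    "AutoScalingGroupName": "web-asg",
    "ScheduledActionName": "sale-day-scale-out",
    "StartTime": "2024-11-29T00:00:00Z",  # when traffic is expected to peak
    "MinSize": 4,
    "MaxSize": 12,
    "DesiredCapacity": 8,
}

# ...and drop back to the everyday baseline afterwards.
scale_down_after_sale = {
    "AutoScalingGroupName": "web-asg",
    "ScheduledActionName": "sale-day-scale-in",
    "StartTime": "2024-12-01T00:00:00Z",
    "MinSize": 2,
    "MaxSize": 6,
    "DesiredCapacity": 2,
}

# Each action would be registered with:
# boto3.client("autoscaling").put_scheduled_update_group_action(**params)
```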
2.3 Dynamic scaling
This is the most recent addition to the scaling types. With dynamic scaling, one can opt to scale in or out based on various health parameters, such as memory or CPU utilization.
For example, we can configure the group to scale out if memory consumption goes above 85 percent; when that threshold value is reached, the Auto Scaling group scales out. Here we need two policies: one for scaling out and another for scaling in.
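The two halves of such a setup could be sketched as boto3-style policy parameters. Note that a memory metric is not published by EC2 by default, so this assumes a custom memory metric is already being sent to CloudWatch; names and adjustment sizes are illustrative.

```python
# Scale-out policy: invoked by a CloudWatch alarm when the watched
# metric (e.g. memory utilization) breaches 85%.
scale_out_policy = {
    "AutoScalingGroupName": "web-asg",
    "PolicyName": "memory-high-scale-out",
    "AdjustmentType": "ChangeInCapacity",
    "ScalingAdjustment": 2,    # add two instances
    "Cooldown": 300,           # seconds to wait before scaling again
}

# Scale-in policy: invoked by a second alarm once the metric recovers.
scale_in_policy = {
    "AutoScalingGroupName": "web-asg",
    "PolicyName": "memory-low-scale-in",
    "AdjustmentType": "ChangeInCapacity",
    "ScalingAdjustment": -1,   # remove one instance
    "Cooldown": 300,
}

# Each would be created with put_scaling_policy(**params) and wired to
# the corresponding CloudWatch alarm.
```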
3. Scaling policies
There are mainly three types of scaling policies in AWS as of now:
3.1 Simple scaling
A simple scaling policy allows us to increase or decrease the number of instances in a scaling group based on a single scaling parameter. This was the first scaling policy introduced by AWS.
3.2 Step scaling
We can specify scaling-in and scaling-out policies for when a target metric reaches specific ranges. That is, we can do something like this: add 1 more instance if the CPU utilization is between 40-50%, add 3 more instances if it is between 50-70%, and if it rises further, add 4 more instances to the Auto Scaling group.
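The step example above can be sketched in boto3's `StepAdjustments` form. The bounds are offsets relative to the alarm threshold, which here is assumed to be a CPU alarm at 40%; the group and policy names are placeholders.

```python
# Step scaling policy: bounds are deltas from the 40% alarm threshold,
# so 0-10 means 40-50% CPU, 10-30 means 50-70%, and 30+ means >70%.
step_policy = {
    "AutoScalingGroupName": "web-asg",
    "PolicyName": "cpu-step-scale-out",
    "PolicyType": "StepScaling",
    "AdjustmentType": "ChangeInCapacity",
    "StepAdjustments": [
        {"MetricIntervalLowerBound": 0,
         "MetricIntervalUpperBound": 10, "ScalingAdjustment": 1},
        {"MetricIntervalLowerBound": 10,
         "MetricIntervalUpperBound": 30, "ScalingAdjustment": 3},
        {"MetricIntervalLowerBound": 30, "ScalingAdjustment": 4},
    ],
}
# Created with boto3.client("autoscaling").put_scaling_policy(**step_policy)
```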
3.3 Target tracking
This policy tracks a specific metric, and based on a preset target value the Auto Scaling group scales in or out. That is, the policy constantly watches whether the target metric goes above the target value and then scales the group out; if the metric falls below it, the Auto Scaling group is triggered to scale in.
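A target-tracking sketch in boto3 parameter form might look like this: keep average CPU at around 50% and let the service compute the scale-out and scale-in adjustments itself. The target value and names are assumptions.

```python
# Target tracking: a single policy covers both directions; the group
# scales out above the target and back in below it.
target_tracking_policy = {
    "AutoScalingGroupName": "web-asg",
    "PolicyName": "cpu-target-50",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 50.0,   # hold average CPU near 50%
    },
}
# Created with boto3.client("autoscaling").put_scaling_policy(**target_tracking_policy)
```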
4. Configuration – Launch configuration
Browse to the EC2 dashboard as shown in the figure below. On the left, under the Auto Scaling tab, we have a "Launch Configurations" section.
After step 1 we reach the "Create launch configuration" page, where we need to click the "My AMIs" section. This section contains all of the AMIs of our application's servers. An AMI is simply an exact replica (with services, settings, etc.) of the server you are using for your instance.
It is assumed here that you already have an AMI of the server/instance to be replicated. Selecting the required AMI will direct you to the instance type page.
Here we choose the instance type on which the AMI should run. This is very convenient if we want our instances to have higher or lower capacity than the current server.
In this configuration tab, we get to choose a name for the configuration, along with other options such as assigning an IAM role and enabling CloudWatch monitoring. We can then press the "Skip to review" button (3), as the remaining sections cover storage and security groups.
Here we can review all the fields we filled in and create the configuration by pressing the button labeled 2 below, and we are done with the launch configuration.
5. Configuration – Autoscaling in AWS
In the previous section we set up the launch configuration; now let us see how the Auto Scaling group can be configured.
Click on the Auto Scaling configuration tab in the EC2 instances dashboard of the AWS console as shown below:
On the resulting screen, click on "Create Auto Scaling group" and we will be directed to the group creation page.
Here we can provide a name for the group, the minimum number of instances to start with, the subnets and, more importantly, the frequency of health checks.
In this step, the scaling policies are set up. We have the option to scale between the minimum and maximum number of instances in the scaling scenario. There is also a section where we can configure the metrics and the target value to observe.
In the next two tabs (Configure Notifications and Configure Tags), notifications and tag management are configurable; we are not covering them here. Let us click the review button to go directly to the review.
Under the review section, the details we filled in are ready for review. Clicking on "Create Auto Scaling group" launches the group, ready for auto-scaling.
As a final piece, we need to load balance this group via AWS ELB.
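That last step could be sketched as attaching the group to a load balancer target group; the ARN below is a made-up placeholder, not a real resource.

```python
# Attach the Auto Scaling group to an ALB target group so incoming
# traffic is spread across whatever instances the group is running.
attach_params = {
    "AutoScalingGroupName": "web-asg",
    "TargetGroupARNs": [
        "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web/abc123",
    ],
}
# With credentials configured:
# boto3.client("autoscaling").attach_load_balancer_target_groups(**attach_params)
```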
In this article, we have familiarized ourselves with the auto-scaling feature for EC2 instances, which ensures the high availability of applications. In future blogs in this AWS series, we will familiarize ourselves with services like Cognito, CloudFront, etc.