Elastic Load Balancer
Solution Architect Associate
Developer Associate
- A virtual appliance that distributes the load of a web application across multiple instances
- Three types (Know these for the exam!!)
- Application Load Balancers: best suited for load balancing HTTP and HTTPS traffic. They operate at Layer 7 and are application aware. They can route based on the request path and host.
- Network Load Balancers: best suited for load balancing TCP traffic where extreme performance is required. They operate at Layer 4 and are capable of handling millions of requests per second with very low latency.
- Classic Load Balancers: these are the legacy Elastic Load Balancers. You can load balance HTTP and HTTPS applications and use Layer 7-specific features such as the X-Forwarded-For header and sticky sessions. You can also use strict Layer 4 load balancing for apps that rely purely on TCP. No longer really recommended; think of them as deprecated.
- Can handle internal or external (internet-facing) traffic. You choose this option when creating the load balancer
- Automatically perform health checks periodically. You configure what to ping, the timeout for each check, and the interval between checks (see the sketch after this list)
- Unhealthy Threshold - number of consecutive failed checks before an instance is reported unhealthy
- Healthy Threshold - number of consecutive successful checks before an unhealthy instance is reported as healthy again
- Elastic Load Balancers cost money while they are running. Forgetting to remove an ELB is one of the easiest ways to fall out of the free tier
- Once an EC2 instance is marked out of service, the ELB will no longer forward traffic to that instance
- You won’t be given a public IP address for an ELB, you will only be given a DNS name. There is a public IP of course, but it is managed by AWS and can change. You should never use the public IP, only the DNS name (typically configured in Route 53)
- Instances are reported as InService or OutOfService by using health checks
- Read the ELB FAQ on Classic Load Balancers
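- Rough boto3 sketch of the health check knobs above, for orientation only; the target group ARN and the /health path are placeholders, not values from these notes:
```python
import boto3

elbv2 = boto3.client("elbv2")

# Adjust health check settings on an existing target group (placeholder ARN/path).
elbv2.modify_target_group(
    TargetGroupArn="arn:aws:elasticloadbalancing:...:targetgroup/my-tg/abc123",
    HealthCheckPath="/health",        # what to "ping"
    HealthCheckIntervalSeconds=30,    # interval between health checks
    HealthCheckTimeoutSeconds=5,      # timeout before a check counts as a failure
    UnhealthyThresholdCount=2,        # consecutive failures -> reported unhealthy
    HealthyThresholdCount=3,          # consecutive successes -> reported healthy again
)
```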
Application Load Balancer
- Uses Target Groups to route requests to registered targets
- Functions on the 7th layer (Application Layer) of the OSI model
- After the load balancer receives a request, it evaluates the listener rules in priority order to determine which rule to apply
- You must define a default rule for each listener, and you can optionally define additional rules as well
- Path Based Routing (common scenario for exams)
- Route requests for specific paths to different targets. Application Load Balancers can do this by analyzing the path of the request; to do this you have to enable path patterns (see the sketch after this list)
- To use HTTPS, you must deploy at least one SSL/TLS server certificate to your load balancer. The load balancer terminates the front end connection and decrypts requests from clients before sending them to the targets (i.e. not end to end encryption)
- Application Load Balancers have a Security Group
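- A hedged sketch of path-based routing with boto3: one listener rule that forwards `/images/*` requests to a dedicated target group. The listener and target group ARNs are made-up placeholders:
```python
import boto3

elbv2 = boto3.client("elbv2")

# Add a rule to an ALB listener: requests whose path matches /images/*
# are forwarded to a separate target group. ARNs are placeholders.
elbv2.create_rule(
    ListenerArn="arn:aws:elasticloadbalancing:...:listener/app/my-alb/abc/def",
    Priority=10,  # rules are evaluated in priority order
    Conditions=[{"Field": "path-pattern", "Values": ["/images/*"]}],
    Actions=[{
        "Type": "forward",
        "TargetGroupArn": "arn:aws:elasticloadbalancing:...:targetgroup/images-tg/123",
    }],
)
```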
Network Load Balancer
- Operates at Layer 4
- If you need thousands or tens of thousands of connections, look at a Network Load Balancer
- Supports TCP, TLS, UDP, TCP_UDP and ports 1-65535
- You can use a TLS listener to offload the work of encryption and decryption to the load balancer (see the sketch after this list)
- Network Load Balancers don’t have a Security Group
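- A rough boto3 sketch of creating an NLB with a TLS listener that terminates encryption at the load balancer; the name, subnet IDs, certificate, and target group ARNs are placeholders:
```python
import boto3

elbv2 = boto3.client("elbv2")

# Create an internet-facing Network Load Balancer across two subnets (placeholder IDs).
nlb = elbv2.create_load_balancer(
    Name="my-nlb",
    Type="network",
    Scheme="internet-facing",
    Subnets=["subnet-aaaa1111", "subnet-bbbb2222"],
)
nlb_arn = nlb["LoadBalancers"][0]["LoadBalancerArn"]

# TLS listener on port 443: the NLB decrypts traffic with the ACM certificate,
# then forwards it to the target group (placeholder ARNs).
elbv2.create_listener(
    LoadBalancerArn=nlb_arn,
    Protocol="TLS",
    Port=443,
    Certificates=[{"CertificateArn": "arn:aws:acm:...:certificate/placeholder"}],
    DefaultActions=[{
        "Type": "forward",
        "TargetGroupArn": "arn:aws:elasticloadbalancing:...:targetgroup/tcp-tg/123",
    }],
)
```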
Classic Load Balancer
- Legacy load balancer. You can use Layer 7 features as well as strict Layer 4 for apps that rely strictly on TCP
- If you need the client's public IP address, you can get it from the X-Forwarded-For header. (Know this for exam!!)
Errors
- 504: Gateway Timeout Error. If your app stops responding, the ELB responds with a 504.
- This means your app is having issues. Could be at either the web server or the database layer
- Identify where the problem is and scale it up/out as needed
X-Forwarded-For Header
- The client's public IPv4 address is included in this header. Because the ELB forwards traffic to your instances from internal addresses only, this header is how you see who is actually visiting your application
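- On the application side, reading the header is straightforward. A minimal stdlib sketch, assuming the usual convention that the ELB appends the original client IP as the first entry in X-Forwarded-For:
```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Behind an ELB, the socket peer is the load balancer's internal address;
        # the original client's public IP arrives in X-Forwarded-For.
        forwarded = self.headers.get("X-Forwarded-For", "")
        client_ip = forwarded.split(",")[0].strip() if forwarded else self.client_address[0]
        self.send_response(200)
        self.end_headers()
        self.wfile.write(f"Client IP: {client_ip}\n".encode())

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```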
Launch Templates and Launch Configurations
- Requires an elastic load balancer (duh)
- Before you can create an auto-scaling group, you have to create a launch configuration (or a launch template)
- Launch Templates vs. Launch Configuration
- A Launch Template is a collection of all of the settings you need to spin up an EC2 instance. They also support versioning, more granularity, etc.
- A Launch Configuration is basically the same set of steps as provisioning an instance from scratch
- AWS recommends using Launch Templates rather than Launch Configurations (Know this for exam!!)
- Typically combined with a bash user-data script that copies content from an S3 bucket into Apache's /var/www/html folder on a new instance (see the sketch after this list)
- You probably want to choose all availability zones in a region when creating an Auto Scaling Group to ensure redundancy.
- Be sure to enable load balancing under “Advanced” if you want to use the ELB; you'll also need to set the health check type to ELB
- Be sure the health check grace period gives your bash scripts enough time to run. Otherwise instances will be reported unhealthy too quickly
- If you include network information in a Launch Template, you can’t use it in an auto-scaling group
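- A hedged boto3 sketch of creating a launch template, including a user-data script along the lines of the S3-to-Apache copy described above. The AMI ID, bucket name, and template name are placeholders:
```python
import base64
import boto3

ec2 = boto3.client("ec2")

# Bash user data: install Apache and copy site content from S3 (placeholder bucket).
user_data = """#!/bin/bash
yum install -y httpd
systemctl enable --now httpd
aws s3 cp s3://my-example-bucket/ /var/www/html/ --recursive
"""

# Launch templates expect user data to be base64-encoded.
ec2.create_launch_template(
    LaunchTemplateName="web-template",
    LaunchTemplateData={
        "ImageId": "ami-0123456789abcdef0",  # placeholder AMI
        "InstanceType": "t3.micro",
        "UserData": base64.b64encode(user_data.encode()).decode(),
    },
)
```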
Auto-Scaling Groups
- Set up Auto-Scaling Restrictions
- Minimum: lowest number of EC2 instances to have online. Typically you want this set to 2, across two AZs to support high availability
- Maximum: highest number of EC2 instances to have online.
- Desired: the instances you want running at this moment. This constantly changes, and will always be between the minimum and maximum
- When creating an Auto-Scaling group, you select a template, and a version of that template.
- You can use spot instances in an auto scaling group to save money (Know for exam!!)
- You pick the VPC, and subnets to use
- You can attach a load balancer (either an existing one or a new one)
- You specify health checks; the EC2 health check is used by default, but you can opt into the ELB health check (see the sketch after this list)
- Steady-state Auto Scaling Group: min, max, and desired capacity are all set to 1, so exactly one instance is always running and is replaced automatically if it fails
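- A rough boto3 sketch of creating an Auto Scaling group from a launch template, spread across two subnets and attached to a target group with ELB health checks. Names, subnet IDs, and the target group ARN are placeholders:
```python
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",
    LaunchTemplate={"LaunchTemplateName": "web-template", "Version": "$Latest"},
    MinSize=2,                  # minimum instances (2 across two AZs for HA)
    MaxSize=6,                  # maximum instances
    DesiredCapacity=2,          # what you want running right now
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",  # placeholder subnets
    TargetGroupARNs=["arn:aws:elasticloadbalancing:...:targetgroup/web-tg/123"],
    HealthCheckType="ELB",      # opt into ELB health checks (default is EC2)
    HealthCheckGracePeriod=300, # give user-data scripts time to finish
)
```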
Auto Scaling Policies
- Allow you to specify boundaries to trigger auto-scaling
- Warm up period ensures that your health checks don’t immediately fail when new instances are spun up but are not yet ready to receive traffic
- Cool down period pauses auto-scaling for a set amount of time to avoid runaway events. Default is 5 minutes
- Warm up and cool down together help to avoid thrashing
- Multiple types of auto-scaling are available
- Reactive scaling: you respond to events or metrics to determine when to scale (see the sketch after this list)
- Scheduled scaling: If you have a predictable workload, you can pre-emptively provision resources to have them ready
- Predictive scaling: AWS uses machine learning algorithms to determine when you need to scale. Re-evaluated every 24 hours to create a forecast for the next 48 hours
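- As one example of reactive scaling, a target tracking policy keeps average CPU near a target value. A hedged boto3 sketch; the group name and numbers are illustrative only:
```python
import boto3

autoscaling = boto3.client("autoscaling")

# Target tracking: scale the group so that average CPU stays around 50%.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",
    PolicyName="cpu-target-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 50.0,
    },
    EstimatedInstanceWarmup=300,  # warm up: don't count new instances immediately
)
```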
Sticky Sessions
- Classic load balancer routes each request to the EC2 instance with the smallest load. Sticky sessions ensure all requests during a session are sent to the same instance
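- The note above is about the Classic Load Balancer; on an ALB, stickiness is configured as target group attributes instead. A minimal boto3 sketch with a placeholder ARN and an illustrative cookie duration:
```python
import boto3

elbv2 = boto3.client("elbv2")

# Enable load-balancer-generated cookie stickiness on an ALB target group.
elbv2.modify_target_group_attributes(
    TargetGroupArn="arn:aws:elasticloadbalancing:...:targetgroup/web-tg/123",
    Attributes=[
        {"Key": "stickiness.enabled", "Value": "true"},
        {"Key": "stickiness.type", "Value": "lb_cookie"},
        {"Key": "stickiness.lb_cookie.duration_seconds", "Value": "86400"},
    ],
)
```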
Deregistration Delay
- Allows load balancers to keep existing connections open if the EC2 instances are de-registered or become unhealthy
- Enables load balancers to complete in-flight requests made to instances that are de-registered. Can be disabled to prevent this, and have the load balancer immediately close connections to unhealthy instances
- In a Classic Load Balancer, this is called Connection Draining
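- On ALB/NLB target groups, the deregistration delay is also a target group attribute. A hedged boto3 sketch with a placeholder ARN; setting the value to "0" would effectively disable draining and close connections immediately:
```python
import boto3

elbv2 = boto3.client("elbv2")

# Shorten the deregistration delay; use "0" to close connections immediately.
elbv2.modify_target_group_attributes(
    TargetGroupArn="arn:aws:elasticloadbalancing:...:targetgroup/web-tg/123",
    Attributes=[{"Key": "deregistration_delay.timeout_seconds", "Value": "30"}],
)
```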