
5 Gotchas with Elastic Beanstalk and Django

by HashedIn Technologies, July 26th, 2018


At HashedIn, we commonly deploy Django-based applications on AWS Elastic Beanstalk (EB). EB is great, but it does have some edge cases. Here are five things you should be aware of if you are deploying a Django application on it.

Aside: If you are starting a new project meant for Elastic Beanstalk, the Django Project Template can simplify the configuration.

Gotcha #1: Auto Scaling Group health check doesn’t work as you’d think

Elastic Beanstalk lets you configure a health check URL. The Elastic Load Balancer (ELB) uses this URL to decide whether instances are healthy. The Auto Scaling group, however, does not use this health check.

So if an instance fails the health check for some reason, the load balancer marks it as unhealthy and stops routing traffic to it. The Auto Scaling group, which by default only looks at EC2 status checks, still considers the instance healthy and doesn’t replace it.

Elastic Beanstalk keeps it this way to give you the chance to SSH into the machine and find out what is wrong. If the Auto Scaling group terminated the machine immediately, you wouldn’t have that option.

The fix is to configure the Auto Scaling group to use the ELB health check. Adding the following to a config file under .ebextensions solves the problem. The grace period is in seconds and gives a new instance time to boot before failed health checks count against it.

    Resources:
      AWSEBAutoScalingGroup:
        Type: "AWS::AutoScaling::AutoScalingGroup"
        Properties:
          HealthCheckType: "ELB"
          HealthCheckGracePeriod: "600"


Gotcha #2: Custom logs don’t work with Elastic Beanstalk

By default, the wsgi user doesn’t have write access to the current working directory, so log files written there won’t work. According to Beanstalk’s documentation, the trick is to write the log files under the /opt/python/log directory.

However, this doesn’t always work as expected. When Django creates the log file in that directory, the file is owned by root, and hence Django cannot write to it.

The trick is to run two small commands as part of .ebextensions to fix this. The first sets the setgid bit on the directory, so files created in it inherit the directory’s group; the second changes the directory’s group to wsgi. Add the following content in .ebextensions/logging.config:

    commands:
      01_change_permissions:
        command: chmod g+s /opt/python/log
      02_change_owner:
        command: chown root:wsgi /opt/python/log

With this change, you can write your custom log files to this directory. As a bonus, when you fetch logs using the Elastic Beanstalk console or the eb tool, your custom log files will also be downloaded.
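To make this concrete, here is a minimal sketch of a Django logging configuration that writes to that directory. It is one possible setup under stated assumptions, not the only way to do it: the helper build_logging_config, the file name django_app.log and the handler name app_file are hypothetical names chosen for illustration. Django's LOGGING setting uses the standard library's dictConfig format, so the sketch needs nothing beyond the standard library.

```python
import logging
import logging.config
import os

# Directory that Elastic Beanstalk collects logs from, as described above.
EB_LOG_DIR = "/opt/python/log"


def build_logging_config(log_dir=EB_LOG_DIR, filename="django_app.log"):
    """Return a dictConfig-style dict that logs to a file inside log_dir.

    The file name and the "app_file" handler name are hypothetical.
    """
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "simple": {
                "format": "%(asctime)s %(levelname)s %(name)s %(message)s",
            },
        },
        "handlers": {
            "app_file": {
                "class": "logging.FileHandler",
                "filename": os.path.join(log_dir, filename),
                "formatter": "simple",
                # Open the file on the first log record, not at configuration time.
                "delay": True,
            },
        },
        "root": {"handlers": ["app_file"], "level": "INFO"},
    }


# In settings.py, Django applies this dict through logging.config.dictConfig.
LOGGING = build_logging_config()
```

Keeping the directory as a parameter makes it easy to point the same configuration at a local folder during development, where /opt/python/log usually doesn’t exist.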

Gotcha #3: Elastic load balancer health check does not set host header

Django’s ALLOWED_HOSTS setting requires you to list the host names your site is allowed to serve. The problem is that the Elastic Load Balancer health check does not send a host name when it makes requests. It connects directly to the private IP address of your instance, so the Host header is that private IP address, and Django rejects the request unless the address is in ALLOWED_HOSTS.

There are several not-so-optimal solutions to the problem:

Terminate the health check on Apache, for example by setting the health check URL to a static file served from Apache. The problem with this approach is that if Django isn’t working, the health check will not report a failure.

Use a TCP based health check, which just checks whether port 80 is open. This has the same problem: if Django isn’t working, the health check will not report a failure.

Set ALLOWED_HOSTS = ['*']. This disables Host header checks altogether, opening up security issues. Also, if you mess up DNS, you can very easily send QA traffic to production.

A slightly better solution is to detect the private IP address of the server and add it to ALLOWED_HOSTS at startup. Doing this reliably is a bit involved, though. Here is a handy script that works assuming your EB environment runs Linux:

    import os
    from urllib.request import urlopen


    def is_ec2_linux():
        """Detect if we are running on an EC2 Linux instance.

        See http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/identify_ec2_instances.html
        """
        if os.path.isfile("/sys/hypervisor/uuid"):
            with open("/sys/hypervisor/uuid") as f:
                uuid = f.read()
                return uuid.startswith("ec2")
        return False


    def get_linux_ec2_private_ip():
        """Get the private IP address of the machine if running on an EC2 Linux server."""
        if not is_ec2_linux():
            return None
        try:
            # A short timeout keeps startup from hanging if the metadata service is unreachable.
            with urlopen("http://169.254.169.254/latest/meta-data/local-ipv4", timeout=2) as response:
                return response.read().decode("utf-8").strip()
        except OSError:
            # Covers URL errors, HTTP errors and timeouts.
            return None


    # The Elastic Beanstalk health check sends requests with Host header = internal IP,
    # so we detect whether we are running on EC2
    # and add the instance's private IP address.
    private_ip = get_linux_ec2_private_ip()
    if private_ip:
        ALLOWED_HOSTS.append(private_ip)

Depending on your situation, this may be more work than you care for, in which case you can simply set ALLOWED_HOSTS to ['*'] and accept the risks described above.

Gotcha #4: Apache Server on EB isn’t configured for performance

For performance reasons, you want text files to be compressed, usually using gzip. Internally, Elastic Beanstalk for Python uses the Apache web server, but it is not configured to gzip content.

This is easily fixed by adding yet another config file.

Also, if you are versioning your static files, you may want to set strong cache headers. By default, Apache doesn’t set these headers. This configuration file sets Cache-Control headers on any static file that has a version number embedded in its name.

Gotcha #5: Elastic Beanstalk will delete RDS database when you terminate your environment

An RDS instance created as part of an Elastic Beanstalk environment is tied to that environment’s lifecycle. For this reason, always create the RDS instance independently. Set the database connection details as an environment variable. In your Django settings, use dj_database_url to parse the database credentials from that variable.
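To show what that parsing step involves, here is a standard-library-only sketch that turns a database URL into a Django DATABASES entry. It is an illustration under stated assumptions, not a replacement for dj_database_url, which handles more backends and options. The helpers parse_database_url and databases_from_env are hypothetical names, and DATABASE_URL is assumed as the name of the environment variable.

```python
import os
from urllib.parse import unquote, urlparse

# Map URL schemes to Django database backends (only two backends shown here).
ENGINES = {
    "postgres": "django.db.backends.postgresql",
    "postgresql": "django.db.backends.postgresql",
    "mysql": "django.db.backends.mysql",
}


def parse_database_url(url):
    """Turn a URL like postgres://user:pass@host:5432/name into a DATABASES entry."""
    parts = urlparse(url)
    if parts.scheme not in ENGINES:
        raise ValueError("Unsupported database scheme: %r" % parts.scheme)
    return {
        "ENGINE": ENGINES[parts.scheme],
        "NAME": unquote(parts.path.lstrip("/")),
        # Credentials may be percent-encoded inside a URL, so decode them.
        "USER": unquote(parts.username or ""),
        "PASSWORD": unquote(parts.password or ""),
        "HOST": parts.hostname or "",
        "PORT": str(parts.port or ""),
    }


def databases_from_env(environ=os.environ):
    """Build the DATABASES setting from the (assumed) DATABASE_URL variable."""
    # Fail loudly with a KeyError if the variable is missing.
    return {"default": parse_database_url(environ["DATABASE_URL"])}


# In settings.py this would be used as:
# DATABASES = databases_from_env()
```

Because the credentials live only in the environment, you can terminate and recreate the Elastic Beanstalk environment without touching the database, and point a new environment at the same RDS instance by setting one variable.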

Originally published at hashedin.com on July 25, 2017.