Blue Green Deployments with cookiecutter-django, AWS and Docker

So as promised, we're going to use some bash scripts to handle our deployments. First, though, we're going to change one small thing:

production.yml

services:
  django:
    build:
      context: .
      dockerfile: ./compose/production/django/Dockerfile
    depends_on:
      - redis
    env_file: .env
    environment:
      - DATABASE_URL=postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:5432/${POSTGRES_DB_NAME}
    restart: always
    volumes:
      - /app
    command: /gunicorn.sh

We want to use docker-compose exec to run commands in already-running containers, but cookiecutter-django exports DATABASE_URL in entrypoint.sh. That is fine when running Django management commands with docker-compose run, which goes through the entrypoint. docker-compose exec skips the entrypoint, though, so the command fails because it can't find the DATABASE_URL environment variable. The reason we set it in production.yml is that we can do variable substitution there, using variables from the .env file. I couldn't find a way of reusing variables like this inside the .env file itself. We may also want to set REDIS_URL in our .env file, as this is another variable that is set in entrypoint.sh.
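To make the substitution concrete, here is a minimal Bash sketch of how the URL is assembled from the individual variables. The values are made-up placeholders, and build_database_url is a hypothetical helper for illustration only, not part of cookiecutter-django. Compose performs the equivalent substitution itself when it reads production.yml.

```shell
#!/bin/bash
# Hypothetical helper: mirrors the ${VAR} substitution used in production.yml.
build_database_url() {
    local user=$1 password=$2 host=$3 db_name=$4
    printf 'postgres://%s:%s@%s:5432/%s\n' "$user" "$password" "$host" "$db_name"
}

# Placeholder values standing in for what would live in .env
POSTGRES_USER=demo_user
POSTGRES_PASSWORD=demo_password
POSTGRES_HOST=db.example.com
POSTGRES_DB_NAME=demo_db

DATABASE_URL=$(build_database_url "$POSTGRES_USER" "$POSTGRES_PASSWORD" "$POSTGRES_HOST" "$POSTGRES_DB_NAME")
echo "$DATABASE_URL"
```

To check the real value, docker-compose -f production.yml config prints the file with the substitutions applied.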

For the bash scripts we're going to need the AWS Command Line Interface, so head over there and grab the latest version!

aws-script-config.sh

#CONFIGURATION
#LOAD_BALANCER_ARN - ARN of your load balancer
#LABEL - Label to identify this project
#MACHINE_PREFIX - Prefix name to use for Docker machines
GREEN='\033[0;32m'
NO_COLOR='\033[0m'
LOAD_BALANCER_ARN=
LABEL=project=yourproject
MACHINE_PREFIX=yourproject-prod-


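#The target group attached to the load balancer is the active one
ACTIVE_TARGET_GROUP=$(aws elbv2 describe-target-groups --load-balancer-arn "$LOAD_BALANCER_ARN" --query 'TargetGroups[].TargetGroupArn' | grep arn | cut -f2 -d'"')
#Any other target group is treated as the inactive one, so this assumes there are only two
INACTIVE_TARGET_GROUP=$(aws elbv2 describe-target-groups --query 'TargetGroups[].TargetGroupArn' | grep arn | grep -v "$ACTIVE_TARGET_GROUP" | cut -f2 -d'"')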

function launch_machine {
    NEW_MACHINE_NAME=$MACHINE_PREFIX$(openssl rand -hex 4)
    echo -e "${GREEN}Launching $NEW_MACHINE_NAME${NO_COLOR}"
    docker-machine create -d amazonec2 --engine-label "$LABEL" "$NEW_MACHINE_NAME"
    eval "$(docker-machine env "$NEW_MACHINE_NAME")"
    docker-compose -f production.yml up --build -d
}

function register_machines {
    echo -e "${GREEN}Registering $NEW_TARGET${NO_COLOR}"
    aws elbv2 register-targets --target-group-arn $ACTIVE_TARGET_GROUP --targets $NEW_TARGET
}

function remove_machines {
    echo -e "${GREEN}Removing $OLD_RUNNING_MACHINES${NO_COLOR}"
    docker-machine rm $OLD_RUNNING_MACHINES
}

function switch_target_groups {
    LISTENERS=($(aws elbv2 describe-listeners --load-balancer-arn "$LOAD_BALANCER_ARN" --query 'Listeners[].ListenerArn' | grep arn | cut -f2 -d'"' | tr "\n" " "))

    #Point every listener on the load balancer at the inactive target group
    for listener in "${LISTENERS[@]}"
    do
        aws elbv2 modify-listener --listener-arn "$listener" --default-actions Type=forward,TargetGroupArn="$INACTIVE_TARGET_GROUP"
    done
}

This script holds the configuration variables and the functions shared between the scripts. We need to set the LOAD_BALANCER_ARN, LABEL, and MACHINE_PREFIX variables. MACHINE_PREFIX is the prefix we're going to use for our Docker machine names, and LABEL is an identifier we use to filter Docker machines. All of our scripts will source this script.
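The grep and cut pipeline in the script relies on the AWS CLI's JSON output, where each ARN sits on its own quoted line. A small sketch with a canned response shows what the pipeline does. The ARNs are invented, and extract_arns is a hypothetical name used only for this illustration.

```shell
#!/bin/bash
# Hypothetical helper: pull quoted ARNs out of a JSON list, one per line.
extract_arns() {
    grep arn | cut -f2 -d'"'
}

# Invented sample shaped like the JSON list returned by
# aws elbv2 describe-target-groups --query 'TargetGroups[].TargetGroupArn'
SAMPLE_OUTPUT='[
    "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/blue/1111111111111111",
    "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/green/2222222222222222"
]'

ACTIVE="arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/blue/1111111111111111"
# Everything that is not the active group is treated as inactive
INACTIVE=$(echo "$SAMPLE_OUTPUT" | extract_arns | grep -v "$ACTIVE")
echo "$INACTIVE"
```

If your CLI is configured for a different output format, such as text, the lines are no longer quoted and the pipeline would need adjusting.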

add-machine.sh

#!/bin/bash
#This script builds and registers a single new machine on AWS.

set -o errexit
set -o pipefail
set -o nounset

source aws-script-config.sh

launch_machine
NEW_TARGET=$(docker-machine inspect "$NEW_MACHINE_NAME" | grep InstanceId | cut -f4 -d'"' | awk '{print " Id=" $0}')
register_machines

This script will launch a new Docker machine and register it with our Elastic Load Balancer's active target group.
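register-targets takes its targets as space-separated Id=... pairs, which is why the scripts prefix each instance ID with " Id=". Here is a rough sketch of that formatting step, using made-up instance IDs and a hypothetical format_targets helper:

```shell
#!/bin/bash
# Hypothetical helper: turn instance IDs (one per line on stdin) into the
# " Id=..." form passed to aws elbv2 register-targets --targets.
format_targets() {
    awk '{printf " Id=%s", $0} END {print ""}'
}

# Made-up instance IDs for illustration
TARGETS=$(printf '%s\n' i-0aaa1111 i-0bbb2222 | format_targets)
echo "$TARGETS"
```

The resulting variable has to be passed to --targets unquoted, as the scripts do, so that the shell splits it into one argument per instance.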

remove-machine.sh

#!/bin/bash

#This script currently removes the first machine it sees.

#TODO: Don't execute if it's the last machine

set -o errexit
set -o pipefail
set -o nounset

source aws-script-config.sh

OLD_RUNNING_MACHINES=$(docker-machine ls --filter state=Running --filter driver=amazonec2 --filter "label=$LABEL" --format '{{.Name}}' | grep "$MACHINE_PREFIX" | head -n1)
remove_machines

This script will grab the first running machine it sees, regardless of whether it is in the ELB's active target group, and remove it. Make sure that the only running Docker machines are the ones in the active target group.
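The TODO in the script could be handled with a small guard. This is only a sketch: is_last_machine is a hypothetical helper that takes a newline-separated list of machine names (for example, the docker-machine ls output without head -n1) and succeeds when one or none are left.

```shell
#!/bin/bash
# Hypothetical guard for the TODO: detect when at most one machine is left.
# Takes a newline-separated list of machine names as its first argument.
is_last_machine() {
    local count
    # grep -c . counts non-empty lines; "|| true" ignores its exit status on zero matches
    count=$(printf '%s\n' "$1" | grep -c . || true)
    [ "$count" -le 1 ]
}

# Made-up machine names for illustration
RUNNING=$(printf '%s\n' yourproject-prod-1a2b3c4d yourproject-prod-5e6f7a8b)
if is_last_machine "$RUNNING"; then
    echo "Refusing to remove the last running machine"
else
    echo "Safe to remove one machine"
fi
```

In remove-machine.sh, a check like this would sit just before the call to remove_machines.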

deploy-blue-green.sh

#!/bin/bash


#This script will launch the same number of current running machines, and place them in the inactive target group
#It then puts the Django project in maintenance mode and runs migrations
#Then it switches the target groups in the elastic load balancer and turns off maintenance mode
#Initial machine should be created with add-machine.sh
#If everything seems okay, you should run remove-inactive-blue-green.sh
#If not, rollback-blue-green.sh


set -o errexit
set -o pipefail
set -o nounset

source aws-script-config.sh

OLD_RUNNING_MACHINES=($(docker-machine ls --filter state=Running --filter driver=amazonec2 --filter "label=$LABEL" --format '{{.Name}}' | grep "$MACHINE_PREFIX" | tr '\n' ' '))


NEW_TARGET=''

for i in "${OLD_RUNNING_MACHINES[@]}"
do
    launch_machine
    NEW_TARGET+=$(docker-machine inspect "$NEW_MACHINE_NAME" | grep InstanceId | cut -f4 -d'"' | awk '{print " Id=" $0}')
done


#REGISTER MACHINES IN INACTIVE TARGET GROUP
aws elbv2 register-targets --target-group-arn $INACTIVE_TARGET_GROUP --targets $NEW_TARGET
#ALL MACHINES ARE UP, RUN MIGRATION

docker-compose -f production.yml exec django python /app/manage.py setmaintenance on

docker-compose -f production.yml exec django python /app/manage.py migrate

#SWITCH TARGET GROUPS
switch_target_groups

docker-compose -f production.yml exec django python /app/manage.py setmaintenance off

#DONT STOP MACHINES IN CASE WE NEED TO ROLLBACK

This script counts the currently running machines and launches one new machine for each of them. It then registers the new machines in the ELB's inactive target group. Once they're up, maintenance mode is turned on, migrations are run, and the listeners on our Elastic Load Balancer are switched over to the inactive target group, which makes it the active one. Maintenance mode is then turned off. As with the remove-machine.sh script, make sure that the only running machines are the ones in the active target group before you start. Once it is done and everything is okay, we can run remove-inactive-blue-green.sh to remove the inactive running machines. If something is wrong and we need a rollback, we can run rollback-blue-green.sh.

remove-inactive-blue-green.sh

#!/bin/bash

#This script will remove the inactive, running machines
#Run it when you're sure you won't need a rollback

set -o errexit
set -o pipefail
set -o nounset

source aws-script-config.sh


INACTIVE_MACHINES=$(aws elbv2 describe-target-health --target-group-arn "$INACTIVE_TARGET_GROUP" --query 'TargetHealthDescriptions[].Target.Id' | grep i | cut -f2 -d'"' | tr "\n" " ")
INACTIVE_DOCKER_MACHINES=$(aws ec2 describe-instances --instance-ids $INACTIVE_MACHINES --query 'Reservations[].Instances[].KeyName' | grep "$MACHINE_PREFIX" | cut -f2 -d'"' | tr "\n" " ")
docker-machine rm $INACTIVE_DOCKER_MACHINES

A pretty simple script: it grabs all machines from the inactive target group, looks up their Docker machine names through each instance's EC2 key pair name, and deletes them.

rollback-blue-green.sh

#!/bin/bash

#Rollback our deployment

set -o errexit
#set -o pipefail
set -o nounset

source aws-script-config.sh

INACTIVE_MACHINE=$(aws elbv2 describe-target-health --target-group-arn "$INACTIVE_TARGET_GROUP" --query 'TargetHealthDescriptions[].Target.Id' | grep i | cut -f2 -d'"' | head -n1 | tr "\n" " ")
INACTIVE_DOCKER_MACHINE=$(aws ec2 describe-instances --instance-ids $INACTIVE_MACHINE --query 'Reservations[].Instances[].KeyName' | grep "$MACHINE_PREFIX" | cut -f2 -d'"' | tr "\n" " ")
eval $(docker-machine env $INACTIVE_DOCKER_MACHINE)
OLD_MIGRATIONS=$(docker-compose -f production.yml exec django python /app/manage.py showmigrations -p | grep '\[X\]' | cut -f3 -d" ")

ACTIVE_MACHINE=$(aws elbv2 describe-target-health --target-group-arn "$ACTIVE_TARGET_GROUP" --query 'TargetHealthDescriptions[].Target.Id' | grep i | cut -f2 -d'"' | head -n1 | tr "\n" " ")
ACTIVE_DOCKER_MACHINE=$(aws ec2 describe-instances --instance-ids $ACTIVE_MACHINE --query 'Reservations[].Instances[].KeyName' | grep "$MACHINE_PREFIX" | cut -f2 -d'"' | tr "\n" " ")
eval $(docker-machine env $ACTIVE_DOCKER_MACHINE)
NEW_MIGRATIONS=$(docker-compose -f production.yml exec django python /app/manage.py showmigrations -p | grep '\[X\]' | cut -f3 -d" ")
MIGRATION_DIFF=($(diff --suppress-common-lines -y <(echo "$NEW_MIGRATIONS") <(echo "$OLD_MIGRATIONS") | cut -f1 -d"<" | sed -e 's/[[:space:]]*$//' | tr "\n" " "))
#Check the array length; with nounset, testing an empty array by name would abort the script
if [ "${#MIGRATION_DIFF[@]}" -gt 0 ]
then
    docker-compose -f production.yml exec django python /app/manage.py setmaintenance on

    for migration in ${MIGRATION_DIFF[@]}
    do
        APP_NAME=$(echo $migration | cut -f1 -d".")
        MIGRATION_NUMBER=$(printf %04d $(($(echo $migration | cut -f2 -d"." | cut -f1 -d"_" | sed 's/^0*//')-1)))
        echo -e "${GREEN}Reverting to $APP_NAME.$MIGRATION_NUMBER${NO_COLOR}"
        read -p "Are you sure? " -n 1 -r
        echo
        if [[ $REPLY =~ ^[Yy]$ ]]
        then
            docker-compose -f production.yml exec django python /app/manage.py migrate $APP_NAME $MIGRATION_NUMBER
        fi
    done
fi
switch_target_groups
docker-compose -f production.yml exec django python /app/manage.py setmaintenance off

This one is a work in progress and needs more testing. It grabs the Django migration plan from an inactive machine and from an active machine, and stores the diff between the two plans in a list, that is, the new migrations that were applied by the deployment. If this list isn't empty, the script turns maintenance mode on and iterates over the list. For each of these migrations, it goes back one migration. For example, if blog.0003 was applied, it will go back to blog.0002. For this to work, it is important that all of the migrations are numbered correctly and were applied in order. Each app name and migration number is displayed and the script asks for confirmation. After the migrations are run, target groups are switched and maintenance mode is turned off.
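The "go back one migration" step can be sketched on its own. previous_migration is a hypothetical helper, and the migration names are invented. The zero case is an addition that the script does not have: migrate <app> zero is Django's way of unapplying every migration for an app, which is what rolling back past 0001 would need.

```shell
#!/bin/bash
# Hypothetical helper: given "app.NNNN_name", print the app and the
# migration number to roll back to.
previous_migration() {
    local migration=$1
    local app=${migration%%.*}     # text before the first dot
    local name=${migration#*.}     # text after the first dot
    local number=${name%%_*}       # leading numeric part, e.g. 0003
    local stripped previous
    # Strip leading zeros so the shell does not read the number as octal
    stripped=$(echo "$number" | sed 's/^0*//')
    previous=$(( ${stripped:-0} - 1 ))
    if [ "$previous" -le 0 ]; then
        echo "$app zero"
    else
        printf '%s %04d\n' "$app" "$previous"
    fi
}

previous_migration "blog.0003_add_tags"
previous_migration "users.0001_initial"
```

The two words it prints map directly onto the arguments of manage.py migrate.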

set -o pipefail is commented out because diff returns a non-zero exit status when it finds a difference between its inputs, and with pipefail enabled that would make our script abort.
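The effect is easy to reproduce in isolation. The sketch below runs a cut-down version of the diff pipeline with pipefail off and then on, and records the pipeline's exit status each time. The migration lists are invented, and pipeline_status is a hypothetical helper.

```shell
#!/bin/bash
# Invented migration lists standing in for the showmigrations output
OLD_MIGRATIONS=$(printf '%s\n' blog.0001_initial blog.0002_add_author)
NEW_MIGRATIONS=$(printf '%s\n' blog.0001_initial blog.0002_add_author blog.0003_add_tags)

# Hypothetical helper: run the diff pipeline and print its exit status
pipeline_status() {
    local status=0
    diff <(echo "$NEW_MIGRATIONS") <(echo "$OLD_MIGRATIONS") | cut -f1 -d"<" > /dev/null || status=$?
    echo "$status"
}

set +o pipefail
WITHOUT_PIPEFAIL=$(pipeline_status)   # status of the last command, cut
set -o pipefail
WITH_PIPEFAIL=$(pipeline_status)      # status of diff, which found a difference
set +o pipefail

echo "without pipefail: $WITHOUT_PIPEFAIL, with pipefail: $WITH_PIPEFAIL"
```

With errexit also set, the non-zero status in the pipefail case is what would stop the rollback script at the MIGRATION_DIFF line.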

The next post will be about setting up an ELK stack (Elasticsearch, Logstash and Kibana). Coming soon, sit tight!