I used to preach about Git Flow to keep my code releasable, rollback-able, and keep a clean history. But not anymore — now, bad code doesn’t make it into my codebase.
This is because I have a robust continuous deployment pipeline.
Having this pipeline allows me and my team members to commit directly to master.
I can hear the pitchforks being sharpened already. “NEVER commit to master!”
I know, I know, I used to say the same thing. “You need to have branches for keeping your code organized, and be able to review it!” But like, who wants to do that? Automate it!
Imagine it. Every commit that you make is linted, small errors are automatically fixed, your code is thoroughly tested, and your coverage thresholds are enforced. The process is completed in a brand new container, ensuring no global dependencies are forgotten. Then, tests are executed against the finished container to prove it integrates properly with dependencies such as databases, message queues, and any other services (integration tests). If all goes well, the code is pushed to GitHub, where my Continuous Deployment server picks it up and finishes the job.
If it makes it to deployment, I’m positive it works. It’s been well tested.
I’m sure you’re thinking at this point “sounds like a lot of work”.
Luckily for you, I’ve put in months of trial and error, and now I’d like to present the process and techniques I use day-to-day to confidently commit to master. By the end of this post you will have all of the skills required to build your own production-optimized, Dockerized Continuous Integration process, and run it on every commit by making use of Docker Compose and Docker multi-stage builds. `master` will always be safe to deploy.
NOTE: If you’re a Node.js engineer, you’re in luck! Examples are for Node. For everyone else, the concepts and techniques are the same. You should be able to extrapolate how the process should work for your language of choice.
But first, a little about the different stages of Continuous Deployment.
Often these terms are used interchangeably, but that is not correct. In the chart below I’ve defined which tasks fit into which areas.
For a while now, I’ve been advocating usage of the docker-compose builder pattern for continuous integration pipelines. With the advent of Docker multi-stage builds, however, it’s now easier to get smaller and more efficient containers.
Docker is, essentially, an isolated environment for your code to run in. Just like you would provision a server, you provision a Docker container. Most popular frameworks and software have builds available on Docker Hub. Since we are using Node, we need an environment that runs `node`. We’ll start our Dockerfile with that.
```dockerfile
# Dockerfile
FROM node:9-alpine AS build
```
Note the `AS` directive. This signals that this is not the final stage of the Dockerfile. Later on we can `COPY` artifacts out of this stage into our final container. Let’s move on.
For complication’s sake, let’s say we are using a library which requires node-gyp to install dependencies properly, because it needs to compile native C++ binaries for the OS you are running on. In most cases you won’t need this, but some popular libraries, like `redis`, require it.
```dockerfile
# Dockerfile continued

# optionally install gyp tools
RUN apk add --update --no-cache \
    python \
    make \
    g++
```
That’s probably about as complicated as a `node` environment will get.
Less than a year ago, I would have told you to build this image and push it to a Docker registry to use as a base image for other node-gyp-related builds, and in fact, I did. With the advent of Docker multi-stage builds, however, that extra step is no longer necessary. In fact, I would say it’s no longer recommended, because it requires keeping multiple images up to date. Instead, let’s just continue on with our pipeline, and make use of multi-stage builds to productionize our build later on. First, we need to define what our pipeline is in the context of our application.
Again, to have a realistic example, I’ll assume we are using Babel as a preprocessor, ESLint as a linter, and Jest as a testing tool. However, the pipeline runs just by calling npm scripts, so it should be pretty easy to substitute the tools you are using, such as TypeScript.
Here is a sample `scripts` section of a package.json file using those tools:
```json
// package.json
"scripts": {
  "start": "nodemon src/index.js --watch src node_modules --exec babel-node",
  "build": "babel src -d dist",
  "serve": "node dist/index.js",
  "lint": "eslint src __tests__",
  "lint:fix": "eslint --fix src __tests__",
  "test": "NODE_ENV=test jest --config jest.json --coverage",
  "test:staging": "jest --config jest.staging.json --runInBand",
  "test:watch": "NODE_ENV=test jest --config jest.json --watch --coverage"
}
```
In our CI process we want to cover linting, testing, and building our app, as well as building the container and testing the container using our staging tests. By continuing on in our Dockerfile, we can cover everything except the staging tests, which require the built container as input.
```dockerfile
# Dockerfile continued

ADD . /src
WORKDIR /src
RUN npm install
RUN npm run lint
RUN npm run test
RUN npm run build
RUN npm prune --production
```
So, we are `ADD`ing our source code into the container, to a folder called `/src`, and then changing our `WORKDIR` to that `/src` directory, which now contains our code. Next, we simply run the appropriate npm scripts: `install` dependencies, `lint` our code, `test` our code, compile our code with `build`, and then remove devDependencies with `npm prune --production`.
Before we continue, I want to talk about the test step a little more, as it is also set up to measure coverage, because we used the `--coverage` flag. We also passed in a `jest.json` file as a config. This is where coverage thresholds are defined.
```json
// jest.json
{
  "testEnvironment": "node",
  "modulePaths": ["src", "/node_modules/"],
  "coverageThreshold": {
    "global": {
      "branches": 100,
      "functions": 100,
      "lines": 100,
      "statements": 100
    }
  },
  "collectCoverageFrom": ["src/**/*.js"]
}
```
If you wanted to maintain 90% coverage, you would decrease each option marked as 100 to 90. The tests will fail if the coverage threshold is not met.
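To make the threshold concrete, here’s a hypothetical function (not from this project) with two branches. With `"branches": 100`, a suite that only ever calls `discount(true)` fails coverage, because the non-member path is never executed, and the commit is blocked:

```javascript
// Hypothetical example -- not part of the pipeline itself.
// With "branches": 100 in jest.json, tests must execute BOTH paths
// of this function, or the coverage check (and thus the build) fails.
function discount(isMember) {
  if (isMember) {
    return 0.9; // members pay 90% of the price
  }
  return 1.0; // everyone else pays full price
}

module.exports = { discount };
```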
If you want to format your code automatically the exact same way I do, here’s the .eslintrc file I use with ESLint. And, just to make your life easier, the .babelrc file I use with Babel.
Our Dockerfile up to this point starts with a fresh Node environment on Alpine Linux, optionally installs node-gyp tools, then adds, lints, tests, and compiles our code, and finally prunes away development dependencies. What we have left are all of the artifacts we need for a production build. Unfortunately, we are also left with additional bloat from the tools we needed to get this far. We will use a multi-stage build to copy only the artifacts we need into our final productionized container, using `COPY --from=build`.
```dockerfile
# Dockerfile continued

FROM node:9-alpine

# install curl for healthcheck
RUN apk add --update --no-cache curl

ENV PORT=3000
EXPOSE $PORT

ENV DIR=/usr/src/service
WORKDIR $DIR

# Copy files from build stage
COPY --from=build /src/package.json package.json
COPY --from=build /src/package-lock.json package-lock.json
COPY --from=build /src/node_modules node_modules
COPY --from=build /src/dist dist

HEALTHCHECK --interval=5s \
  --timeout=5s \
  --retries=6 \
  CMD curl -fs http://localhost:$PORT/_health || exit 1

CMD ["node", "dist/index.js"]
```
This completes our Dockerfile. The final size in my case is a 28MB container, which has my production `node_modules` and my `dist` folder of Babel-compiled JavaScript source code. To run it, simply use vanilla `node`. I’ve also defined a health check that a scheduler like Docker Swarm can use to ensure the container is healthy. `curl` may not be the most efficient healthcheck, but it’s a good starting point.
Here’s the full Dockerfile in one place for all of your copy-and-pasting needs!
```dockerfile
FROM node:9-alpine AS build

# install gyp tools
RUN apk add --update --no-cache \
    python \
    make \
    g++

ADD . /src
WORKDIR /src
RUN npm install
RUN npm run lint
RUN npm run test
RUN npm run build
RUN npm prune --production

FROM node:9-alpine
RUN apk add --update --no-cache curl

ENV PORT=3000
EXPOSE $PORT

ENV DIR=/usr/src/service
WORKDIR $DIR

COPY --from=build /src/package.json package.json
COPY --from=build /src/package-lock.json package-lock.json
COPY --from=build /src/node_modules node_modules
COPY --from=build /src/dist dist

HEALTHCHECK --interval=5s \
  --timeout=5s \
  --retries=6 \
  CMD curl -fs http://localhost:$PORT/_health || exit 1

CMD ["node", "dist/index.js"]
```
Now, simply running:

```shell
docker image build -t your-image-name .
```

will run 80% of our CI pipeline. The next step is testing the container with its integrations.
For integration tests, we need to run other software (databases, message queues, or other services within our system) and test that they work together, that they integrate. Because this task requires running multiple images together, we will instead use `docker-compose`, which is suited to this type of task.
Again, trying to stick with realistic, instead of oversimplified, examples, here is the docker-compose file I use for testing a more complicated microservice in an architecture based on CQRS and Event Sourcing, which has dependencies on `redis`, `mongodb`, and `rabbitmq`.
```yaml
version: '2'

services:
  staging-deps:
    image: your-image-name
    environment:
      - NODE_ENV=production
      - PORT=3000
      - RABBITMQ_URL=amqp://rabbitmq:5672
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - MONGO_URL=mongodb://mongo:27017/inventory
      - DEBUG=servicebus*
    networks:
      - default
    depends_on:
      - redis
      - rabbitmq
      - mongo

  rabbitmq:
    image: rabbitmq:3.6-management
    ports:
      - 15672:15672
    hostname: rabbitmq
    networks:
      - default

  redis:
    image: redis
    networks:
      - default

  mongo:
    image: mongo
    ports:
      - 27017:27017
    networks:
      - default

  staging:
    image: node:8-alpine
    volumes:
      - .:/usr/src/service
    working_dir: /usr/src/service
    networks:
      - default
    environment:
      - apiUrl=http://staging-deps:3000
      - RABBITMQ_URL=amqp://rabbitmq:5672
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - MONGO_URL=mongodb://mongo:27017/inventory
      - DEBUG=$DEBUG
    command: npm run test:staging

  clean:
    extends:
      service: staging
    command: rm -rf node_modules

  install:
    extends:
      service: staging
    command: npm install
```
Pay particular attention to the `staging-deps` and `staging` services.

`staging-deps` runs the image produced by the `docker image build` command we ran earlier. This happens by setting `image:` to the tag we set in the build command with `-t`. We pass it a bunch of environment variables to let our service know how to connect to the different containers running in our network, which is the `default` network. Each docker-compose file can define networks, and has a `default` network by default. Docker also handles service discovery through its Software Defined Networks (SDNs), so a hostname will resolve to the IP address of the container with the same name on the network. For example, in `MONGO_URL=mongodb://mongo:27017/inventory`, `mongo` will resolve to the mongo container in the SDN. Lastly, `depends_on` tells Docker to start the depended-on containers first when bringing them up.
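On the service side, those environment variables are all the code needs to find its dependencies. Here’s a sketch of how they might be read; the variable names match the compose file above, while the localhost fallbacks are my own assumption for local development:

```javascript
// Build connection settings from the environment injected by docker-compose.
// Hostnames like "mongo" and "rabbitmq" resolve via Docker's SDN.
function connectionConfig(env) {
  return {
    mongoUrl: env.MONGO_URL || 'mongodb://localhost:27017/inventory',
    rabbitmqUrl: env.RABBITMQ_URL || 'amqp://localhost:5672',
    redis: {
      host: env.REDIS_HOST || 'localhost',
      port: parseInt(env.REDIS_PORT || '6379', 10),
    },
  };
}

const config = connectionConfig(process.env);
```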
`staging` is a container based on `node`, with our source code mounted into it using the `volumes` directive. The integration tests are run from this container.
Additionally, there are two more “services” that we will run as scripts: `clean` and `install`. `clean` ensures that we are using the correct versions of `node_modules`, since we may be switching between developing on our host OS, such as macOS, and Alpine Linux, which our containers are based on.
First, we will bring up the dependencies:

```shell
docker-compose -f docker-compose.staging.yml up -d staging-deps
```
Depending on whether or not your service accounts for retrying connections, you may want to start certain containers first and sleep while they become ready. Alternatively, scripts like `wait-for-it.sh` can be used to accomplish this from the shell.
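If you’d rather handle retries in the service itself than in a shell script, a generic retry wrapper is enough. This is a hypothetical helper (the name and defaults are mine, not from the project) that you’d wrap around your initial `mongodb`/`rabbitmq` connection calls:

```javascript
// Retry an async operation up to `retries` times, pausing `delayMs`
// between attempts; rethrows the last error if every attempt fails.
async function withRetries(fn, retries = 5, delayMs = 1000) {
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

module.exports = { withRetries };
```

Used like `withRetries(() => connectToMongo(config.mongoUrl))`, the service keeps retrying while its dependencies finish starting, instead of crashing on the first refused connection.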
Once your dependencies are up and running, you’ll need to run the staging tests. We’ve mounted the code into the container, but haven’t installed npm dependencies on it. Run the following to install dependencies for the linux container, and then run staging tests.
```shell
docker-compose -f docker-compose.staging.yml run --rm install
docker-compose -f docker-compose.staging.yml run --rm staging
```
If your integration/staging tests pass, you’ll be confident you can continue on with the delivery and deployment stages! But first, let’s automate running our pipeline, so we can run it on each commit.
First, let’s create a `Makefile`. Our goal is to be able to run `make ci` to run the whole CI process.
```makefile
ci:
	make docker-build \
	clean \
	install \
	staging \
	staging-down

docker-build:
	docker build -t your-image-name .

clean:
	docker-compose -f docker-compose.staging.yml run --rm clean

install:
	docker-compose -f docker-compose.staging.yml run --rm install

staging:
	docker-compose -f docker-compose.staging.yml up -d staging-deps
	docker-compose -f docker-compose.staging.yml run --rm staging

staging-down:
	docker-compose -f docker-compose.staging.yml down
```
I run this on every commit locally, using an npm package called `husky`, and again on my CI server, which is typically Jenkins 2.0 Pipelines.

To run it locally, let’s start by installing `husky` to our devDependencies:

```shell
npm i --save-dev husky
```
Husky makes it super simple to install and run git hooks. Simply open up your `package.json` and add the following two scripts:

```json
"scripts": {
  // ...
  "precommit": "npm run lint:fix && npm run test",
  "prepush": "make ci"
}
```
Now on every commit, on every developer’s machine, husky will fix all linting errors that can be automatically fixed, and run unit tests. Our tests are configured to have 100% coverage, so if any tests do not pass, or do not meet coverage thresholds, the commit will be blocked. Additionally, on every attempt to push, the entire CI process we defined will be run. This is useful for catching any deps that may have been installed locally but not saved to package.json, and ensures that everything passes when running in the OS it will be hosted on in production, as well as testing the container that is built using staging tests.
At this point you have everything you need to run an entire Dockerized Continuous Integration process on every commit, and prevent bad code from entering your code base. But this is far from the end of the journey. Follow me for more posts about similar topics!
As for next steps:
**Improving Continuous Integration** Build from this point to customize for your needs. If you are making an app, it might make sense to also run end-to-end tests with Selenium, or one of its flavors. If it’s a high-load site, add stress testing to the pipeline. Any tests like these belong in the integration phase.
**Continuous Delivery** Once you are building productionized Docker containers, it’d be a shame not to put them in a registry. There are paid solutions for this, as well as self-hosted registries. What works best for you is up to your situation. I typically just pay for private repos on Docker Hub and call it a day.
**Continuous Deployment** This is where the magic truly begins to happen. When you have fully tested, production-optimized containers deploying themselves on every commit to master, it’s truly zen, especially given modern microservice architectures. Typically, we add new services to add new features, rather than branches, making you miss branches even less. For Continuous Deployment servers, I prefer Jenkins 2.0 Pipelines with Blue Ocean. `Jenkinsfile`s allow the engineers of each product to define their own CD pipelines, and Docker Stack files allow developers to define how their services will run in production. A CD process might also involve updating a proxy, such as HAProxy or nginx.
**Running in Production** This is what orchestrators are for. These are tools like Docker Swarm, Kubernetes, and Mesos. They work very similarly to the scheduler on the computer you are using now, which schedules processes and allocates resources on your machine. Instead, they schedule and reserve resources across a cluster of machines, letting you control the cluster as if it were one machine.
Interested in hearing MY DevOps Journey, WITHOUT useless AWS Certifications? Read it now on HackerNoon.
Thanks so much for reading! Please smash that clap button and share if you found this post helpful!
Make sure to follow me for more posts on related topics! :)
—
Changelog

11/21/2017, 9:11 PM: “Refactored” the introduction and conclusion to address comments and questions, and improve clarity.