Imagine that you’ve got your shiny new CouchDB cluster working in production and then a node goes down. If your design is solid and you have enough CouchDB nodes still running, your app should keep chugging along. You, on the other hand, will want to know why the node went down so that you can determine whether you need to tweak things like the memory or CPU on your nodes. You may also want to receive notifications via email or Slack so that you can diagnose things in real time.
Prometheus is an open source monitoring system that allows you to treat monitoring metrics as a data source for generating alerts and reports. It was originally built by SoundCloud and has since joined the Cloud Native Computing Foundation. There is a thriving community around Prometheus and a lot of developers have contributed plugins (known as exporters) for many web services, including CouchDB.
At first glance, it can be a little overwhelming trying to set up all the pieces of a Prometheus environment, but with Docker we can easily spin them up on the same box and then later move them to other boxes to scale our environment.
The following tutorial will cover the setup on Ubuntu, but it can easily be adapted to work on any other OS that runs Docker.
$ git clone https://github.com/redgeoff/docker-ce-vagrant
$ cd docker-ce-vagrant
$ sudo ./docker.sh
The AlertManager is a service that determines what to do when alerts are generated.
Let’s create a directory to house our Prometheus configuration files:
$ mkdir /home/ubuntu/prometheus
Now, edit /home/ubuntu/prometheus/alertmanager.conf with the following configuration. Our configuration instructs the AlertManager to send an email via a Gmail account and send a message to our #alerts Slack channel. You’ll need to enable Incoming Webhooks and replace WEBHOOK-URL with your Slack webhook URL. You’ll also need to replace GMAIL-PASSWORD. For increased security, you may want to create a new Gmail account so that you don’t have to use your primary Gmail password.
global:
  slack_api_url: 'WEBHOOK-URL'

route:
  receiver: 'all-alerts'
  group_by: ['CouchDBDownAlert', datacenter, app]

receivers:
  - name: 'all-alerts'
    email_configs:
      - to: '[email protected]'
        from: '[email protected]'
        smarthost: smtp.gmail.com:587
        auth_username: "[email protected]"
        auth_identity: "[email protected]"
        auth_password: "GMAIL-PASSWORD"
    slack_configs:
      - channel: '#alerts'
        send_resolved: true
You can of course comment out either the email_configs or slack_configs if you don’t want one or the other. There are also many other supported configs.
Then, run the alert manager with the following command. This command will automatically restart the manager on a server reboot or if the manager crashes.
sudo docker run -d --name alertmanager \
  --restart always \
  -p 9093:9093 \
  -v /home/ubuntu/prometheus:/alertmanager \
  prom/alertmanager \
  --config.file=/alertmanager/alertmanager.conf
If you ever make a change to alertmanager.conf, you can restart the AlertManager with sudo docker restart alertmanager. (You’ll also have to restart the Prometheus Server.)
Prometheus uses rules to define alerts, which are triggered when a certain condition becomes true.
Edit /home/ubuntu/prometheus/prometheus.rules and create a CouchDBDownAlert, which will be triggered whenever a CouchDB node is not reporting its status or is reported to be down.
groups:
  - name: example.rules
    rules:
      - alert: CouchDBDownAlert
        expr: absent(couchdb_httpd_up) or couchdb_httpd_up < 1
        for: 1m
        annotations:
          summary: CouchDB Node Down
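The expr above is true when the metric is missing entirely or any node reports down, and for: 1m means the condition must hold continuously for a minute before the alert actually fires — until then it sits in a "pending" state. Here's a rough Python sketch of that pending/firing behavior, with the duration counted in scrape intervals rather than wall-clock time:

```python
# Toy model of Prometheus's "for:" clause: an alert becomes "pending"
# when its expression turns true and only "firing" once the expression
# has stayed true for the whole duration (here, N scrape intervals).
def alert_state(samples, for_intervals):
    """samples: expression result (True/False) at each scrape."""
    streak = 0
    for true_now in samples:
        streak = streak + 1 if true_now else 0
    if streak == 0:
        return "inactive"
    return "firing" if streak > for_intervals else "pending"

# With a 15s scrape interval, "for: 1m" is roughly 4 intervals:
alert_state([True] * 5, 4)    # firing
alert_state([True, True], 4)  # pending
```

This is why a brief blip (one bad scrape) won't page you, but a node that stays down for the full minute will.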
Prometheus gathers metrics by polling different targets. By default, CouchDB doesn’t provide metrics in the format needed by Prometheus, but fortunately, Tobias Gesellchen has developed the awesome couchdb-prometheus-exporter!
Only a single instance of the exporter will need to be run to cover your entire CouchDB cluster as the exporter automatically reports metrics for each node.
Let’s assume that your cluster is listening on the default port of 5984. Run the exporter with:
sudo docker run -d --name couchdb-prometheus-exporter \
  --restart always \
  -p 9984:9984 \
  gesellix/couchdb-prometheus-exporter \
  -couchdb.uri=http://IP_OR_DNS_TO_ANY_COUCH_NODE:5984 \
  -couchdb.username=COUCH_USER \
  -couchdb.password=COUCH_PASSWORD
You’ll need the IP address of the server running the exporter. On Linux, you can normally get this with:
/sbin/ip route | awk '/eth0 proto/ { print $9 }'
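That one-liner assumes the interface is named eth0 and that awk’s field positions line up with your distro’s output. If it comes back empty on your box, the same extraction can be sketched in Python against captured `ip route` output (the sample line below is illustrative):

```python
import re

# Pull the address after "src" from a line of `ip route` output; this
# avoids depending on the interface name or on awk field positions.
def src_ip(route_output):
    m = re.search(r"\bsrc (\d{1,3}(?:\.\d{1,3}){3})\b", route_output)
    return m.group(1) if m else None

sample = "172.31.0.0/20 dev eth0 proto kernel scope link src 172.31.5.10"
src_ip(sample)  # "172.31.5.10"
```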
Let’s configure Prometheus to poll itself and our couchdb-prometheus-exporter. Edit /home/ubuntu/prometheus/prometheus.conf:
global:
  scrape_interval: 15s
  external_labels:
    monitor: exporter-metrics

rule_files:
  - /prometheus.rules

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - IP_OR_DNS_TO_ALERTMANAGER:9093

scrape_configs:
  - job_name: prometheus
    scrape_interval: 5s
    static_configs:
      - targets:
          - localhost:9090
          - IP_OR_DNS_TO_EXPORTER:9984
Then run Prometheus with:
sudo docker run -d --name prometheus-server -p 9090:9090 \
  --restart always \
  -v /home/ubuntu/prometheus/prometheus.conf:/prometheus.conf \
  -v /home/ubuntu/prometheus/prometheus.rules:/prometheus.rules \
  prom/prometheus \
  --config.file=/prometheus.conf
This will automatically restart Prometheus on a server reboot or if it crashes.
At this point, you can visit http://IP_OR_DNS_TO_PROMETHEUS_SERVER:9090 to view the Prometheus UI. The Graph tab gives you a quick way of viewing a graph for a particular metric. In the next step, however, we’ll go a step further and install Grafana so that we can easily build beautiful custom dashboards.
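The Graph tab is backed by Prometheus’s HTTP API, which you can also hit directly from scripts. A small sketch (the server address is a placeholder, and the commented-out request assumes the server is reachable from where you run it):

```python
import urllib.parse

# Build an instant-query URL for Prometheus's /api/v1/query endpoint,
# the same API the Graph tab uses under the hood.
def query_url(base, promql):
    return base + "/api/v1/query?" + urllib.parse.urlencode({"query": promql})

url = query_url("http://IP_OR_DNS_TO_PROMETHEUS_SERVER:9090", "couchdb_httpd_up")
# import json, urllib.request
# with urllib.request.urlopen(url) as resp:
#     print(json.load(resp)["data"]["result"])
```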
To test your alerts, stop one of your CouchDB nodes and you should get an email and a Slack notification. Then, restart the node and you should again be notified. Pretty cool!
Grafana is an open platform for beautiful analytics and monitoring. Grafana gives us the ability to create custom dashboards and persists our data. We can compare our reports with historic data to determine when things are breaking or even starting to break.
Create a volume for Grafana so that our data persists across restarts:
sudo docker volume create grafana-storage
Run Grafana:
sudo docker run -d --name grafana -p 3000:3000 \--restart always \-v grafana-storage:/var/lib/grafana \grafana/grafana
You can then access the web UI via http://IP_OR_DNS_TO_GRAFANA_SERVER:3000 and log in with admin/admin. Once you have logged in, click on the gear icon in the sidebar, select Data Sources, click the Add data source button and then point Grafana at your Prometheus server, i.e. with the type set to Prometheus and the URL set to http://IP_OR_DNS_TO_PROMETHEUS_SERVER:9090.
You can then create a new Dashboard and add a graph for a metric like couchdb_httpd_up or couchdb_httpd_request_time. So cool!
You may be wondering how to prevent others from accessing your reports. One of the easiest ways is to put everything behind a firewall, make sure all the containers are connected via private IP addresses and then use a VPN server to tunnel traffic into your virtual cloud.
Another thing that we can do to tighten our security is to move our passwords, e.g. the couchdb.password, to a Docker .env file. And, if you happen to be using Docker Swarm, an even better option is to use Docker Swarm Secrets.
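If you do adopt Swarm, a secret can be wired into the exporter service through a stack file. Here’s a minimal sketch — the secret name, and the assumption that you create it beforehand (e.g. with docker secret create), are illustrative:

```yaml
version: "3.1"
services:
  couchdb-exporter:
    image: gesellix/couchdb-prometheus-exporter
    secrets:
      - couchdb_password
secrets:
  couchdb_password:
    external: true
```

Swarm mounts the secret at /run/secrets/couchdb_password inside the container; how you point the exporter at that file depends on the options the image supports, so check its docs before relying on this.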
Geoff Cox is the Co-Founder of Quizster, a digital dropbox and grading system. Quizster uses a full stack of JS and runs CouchDB and PouchDB at the data layer.