bazel

Bazel – For Fast, Correct, Reproducible Builds

In my previous blog, I showed Bazel in action by building a Solr Cloud package. In this blog I'm going to explain a bit more about Bazel.

Bazel is the open-source version of Google's internal build tool Blaze. Bazel is currently in beta, but it is already used by a number of companies in production, and it has some quite interesting features. Bazel has a good caching mechanism: it caches all input files, all external dependencies, and so on. Before running the actual build, Bazel first checks whether the existing cache is valid. If it is, Bazel checks whether any of the input files or dependencies have changed, and only if it detects changes does it rebuild the package. We can also use Bazel to build test targets and have it run our unit/integration tests against them. Bazel can also detect cyclic dependencies within the code. Another important feature is sandboxing: on Linux, Bazel can run builds and tests inside a sandboxed environment and detect file leaks or broken dependencies, because in sandbox mode Bazel mounts only the specified input files and data dependencies into the sandbox.
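Conceptually, that cache check boils down to hashing every declared input and rebuilding only when the combined digest changes. Here is a minimal sketch in Python, purely an illustration of the idea, not Bazel's actual implementation:

```python
import hashlib

# Illustrative sketch of digest-based caching: a target is rebuilt
# only when the combined hash of its declared inputs changes.
_cache = {}

def digest(inputs):
    """Hash all input names and contents into a single digest."""
    h = hashlib.sha256()
    for name in sorted(inputs):
        h.update(name.encode())
        h.update(inputs[name].encode())  # file contents
    return h.hexdigest()

def build(target, inputs):
    key = digest(inputs)
    if _cache.get(target) == key:
        return "up-to-date"   # cache hit: skip the build entirely
    # ... run the real build actions here ...
    _cache[target] = key
    return "rebuilt"
```

Calling build twice with identical inputs performs the action only once; changing any input's contents invalidates the digest and triggers a rebuild.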

Bazel Build Flow

Let's see how the Bazel build flow works. The first thing we need is a WORKSPACE file. A Bazel workspace is a directory that contains the source files for one or more software projects, as well as a WORKSPACE file and BUILD files containing the instructions Bazel uses to build the software. It also contains symbolic links to output directories in the Bazel home directory.

Let’s create a simple workspace for testing

$ mkdir bazel-test && cd bazel-test

$ touch WORKSPACE

Now I'm going to build a simple Python package. hello.py is a simple Python script that imports a hello function from dep.py. So our primary script is hello.py, which has a dependency on dep.py.

vagrant@trusty-docker:~/bazel-test$ cat hello.py
from dep import hello
print hello("Building a simple python package with Bazel")

vagrant@trusty-docker:~/bazel-test$ cat dep.py
def hello(msg):
    return msg

Bazel's build command basically looks for a BUILD file in the target location. This file should contain the necessary Bazel build rules. Bazel's Python rule documentation lists the supported rules. Applying this to our test scripts, we are going to build a py_binary for hello.py, and this binary has a py_library dependency on dep.py. So our final BUILD file is:

py_library(
  name = 'dep',
  srcs = ['dep.py'],
)

py_binary(
  name = 'hello',
  srcs = ['hello.py'],
  deps = [':dep'],    # our dependency towards `dep.py`
)

So we have the BUILD file now; let's kick off a build.

vagrant@trusty-docker:~/bazel-test$ bazel build hello
............
INFO: Found 1 target...
Target //:hello up-to-date:
  bazel-bin/hello
INFO: Elapsed time: 4.564s, Critical Path: 0.03s

Woohoo, Bazel has built the package for us. If we check our workspace now, we will see a bunch of bazel-* symlinks. These point to the Bazel home directory, where our final build output lives.

vagrant@trusty-docker:~/bazel-test$ tree -d
.
├── bazel-bazel-test -> /home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/__main__
├── bazel-bin -> /home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/bazel-test/bazel-out/local-fastbuild/bin
├── bazel-genfiles -> /home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/bazel-test/bazel-out/local-fastbuild/genfiles
├── bazel-out -> /home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/__main__/bazel-out
└── bazel-testlogs -> /home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/bazel-test/bazel-out/local-fastbuild/testlogs

So our new Python binary is available at bazel-bin/hello. Bazel also creates a runfiles directory that lives next to the binary; Bazel copies our dependencies (input files and data dependencies) into this runfiles folder.

vagrant@trusty-docker:~/bazel-test$ ls -l bazel-bin/hello
-r-xr-xr-x 1 vagrant vagrant 4364 Feb 19 20:13 bazel-bin/hello
vagrant@trusty-docker:~/bazel-test$ ls bazel-bin/
hello                    hello.runfiles/          hello.runfiles_manifest
vagrant@trusty-docker:~/bazel-test$ ls -l bazel-bin/hello.runfiles/__main__/
total 4
lrwxrwxrwx 1 vagrant vagrant  31 Feb 19 20:13 dep.py -> /home/vagrant/bazel-test/dep.py
lrwxrwxrwx 1 vagrant vagrant 130 Feb 19 20:13 hello -> /home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/bazel-test/bazel-out/local-fastbuild/bin/hello
lrwxrwxrwx 1 vagrant vagrant  33 Feb 19 20:13 hello.py -> /home/vagrant/bazel-test/hello.py

If we look at our Python binary bazel-bin/hello, it's nothing but a wrapper script: it identifies the runfiles directory path, adds that path to the PYTHONPATH environment variable, and then invokes our hello.py. At the beginning, I mentioned that Bazel has a good caching mechanism. Let's re-run the build command and look at the output, especially the time taken to complete the build.
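That wrapper logic can be approximated in a few lines (a hypothetical sketch; the real generated stub is more elaborate, and the helper name here is made up):

```python
import os

def runfiles_pythonpath(stub_path, existing=""):
    """Build the PYTHONPATH a py_binary-style stub would export.
    The runfiles tree is assumed to sit next to the stub itself."""
    runfiles = os.path.abspath(stub_path) + ".runfiles"
    parts = [os.path.join(runfiles, "__main__"), runfiles]
    if existing:
        parts.append(existing)
    return os.pathsep.join(parts)

# The real stub would then exec the interpreter on the main script, e.g.:
#   os.environ["PYTHONPATH"] = runfiles_pythonpath(__file__)
#   os.execv(sys.executable, [sys.executable, main_script])
```

The key point is that the binary runs against the runfiles tree, not the source tree, which is what makes the package self-contained.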

vagrant@trusty-docker:~/bazel-test$ bazel build hello
INFO: Found 1 target...
Target //:hello up-to-date:
  bazel-bin/hello
INFO: Elapsed time: 0.247s, Critical Path: 0.00s

Let's compare the two build times. The first build took ~4.5 sec, but the second took only ~0.2 sec. This is because Bazel didn't run the real build process during the second run; it verified the input files against its cache and found no changes.

Now let’s add a simple unit test and see how bazel can run the same.

vagrant@trusty-docker:~/bazel-test$ cat hello_test.py
import unittest

from dep import hello

class TestHello(unittest.TestCase):

  def test_hello(self):
    self.assertEquals(hello("test message"), "test message")

if __name__ == '__main__':
  unittest.main()

Now let’s add a py_test rule to our BUILD file so that bazel can use it with bazel test.

py_test(
    name = "hello_test",
    srcs = ["hello_test.py"],
    deps = [
        ':dep',
    ],
)

We have the py_test rule, now let’s run the bazel test command and verify.

vagrant@trusty-docker:~/bazel-test$ bazel test hello_test
INFO: Found 1 test target...
Target //:hello_test up-to-date:
  bazel-bin/hello_test
INFO: Elapsed time: 2.255s, Critical Path: 0.06s
//:hello_test                                                            PASSED in 0.0s

Executed 1 out of 1 test: 1 test passes.

Woohoo, the test runs fine. Now let's manually break the test and see if Bazel picks up the failure as well.

vagrant@trusty-docker:~/bazel-test$ bazel test hello_test
INFO: Found 1 test target...
FAIL: //:hello_test (see /home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/bazel-test/bazel-out/local-fastbuild/testlogs/hello_test/test.log).
Target //:hello_test up-to-date:
  bazel-bin/hello_test
INFO: Elapsed time: 0.199s, Critical Path: 0.05s
//:hello_test                                                            FAILED in 1 out of 2 in 0.0s
  /home/vagrant/.cache/bazel/_bazel_vagrant/9dedbe0729180ec68a026adfb67cba5d/execroot/bazel-test/bazel-out/local-fastbuild/testlogs/hello_test/test.log

Executed 1 out of 1 test: 1 fails locally.

Bingo, Bazel detects the test failure too. During the build process we saw that Bazel caches the build and doesn't re-run it unless it detects changes to the dependencies. Now let's see whether Bazel does the same with tests.

vagrant@trusty-docker:~/bazel-test$ bazel test hello_test
INFO: Found 1 test target...
Target //:hello_test up-to-date:
  bazel-bin/hello_test
INFO: Elapsed time: 0.169s, Critical Path: 0.04s
//:hello_test                                                            PASSED in 0.0s

Executed 1 out of 1 test: 1 test passes.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.
vagrant@trusty-docker:~/bazel-test$ bazel test hello_test
INFO: Found 1 test target...
Target //:hello_test up-to-date:
  bazel-bin/hello_test
INFO: Elapsed time: 0.087s, Critical Path: 0.00s
//:hello_test                                                   (cached) PASSED in 0.0s

Executed 0 out of 1 test: 1 test passes.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.

Bingo, we can see the (cached) marker in the output of the second test run. So, like builds, Bazel caches test results too.

Customizing Bazel Rules

py_binary, py_library, etc. are the default Python rules that ship with Bazel. As with any tool, we might end up needing custom rules for our specific needs. The good news is that Bazel comes with an extension language called Skylark. With Skylark, we can create custom build rules matching our requirements. Skylark's syntax is pretty similar to Python. I'll be writing a more detailed blog on Skylark soon 🙂
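To give a flavour, a minimal custom rule might look something like the sketch below. Everything here is illustrative (the rule name say_hello and its attribute are made up), and the ctx.file_action API shown is from the beta-era Skylark docs; consult the Skylark documentation for the real, current APIs.

```
# hello.bzl -- hypothetical custom rule sketch
def _say_hello_impl(ctx):
    # Write a greeting into the declared output file
    ctx.file_action(
        output = ctx.outputs.out,
        content = "Hello, %s!\n" % ctx.attr.username,
    )

say_hello = rule(
    implementation = _say_hello_impl,
    attrs = {"username": attr.string(default = "world")},
    outputs = {"out": "%{name}.txt"},
)
```

A BUILD file would then load this rule and instantiate it like any built-in rule.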

Conclusion

Though Bazel is still in beta, it is a really interesting tool for building hermetic packages. Bazel can detect cyclic dependencies and dependency leaks, which is really important. And its caching lets us build packages much faster than with traditional build tools.


Bazeling Solr Cloud

It's been a long time since I wrote my last blog, and this time I decided to write about something I've been working on for the last few months: Bazel. We have been using Bazel heavily in production for the last couple of months, and the results seem to be pretty good. Leonid from our Infra team recently gave a talk about how we use Bazel to build hermetic packages. I'll be writing detailed blogs on how to play with Bazel, but in this post we will just see Bazel in action.

This time I'm going to build a Bazel package for Solr Cloud v6.3.0 and use that package to spin up a Solr Cloud service.

Requirements

Basically, there are three main dependencies:

  1. Bazel, for building our solr package
  2. Java 8 (the latest versions of both Bazel and Solr require Java 8)
  3. ZooKeeper Cluster

Solr uses ZooKeeper as a repository for cluster configuration and coordination. There are tons of blogs on how to set up a simple ZK cluster, so I'm going to skip that part. My test setup has a single-node ZK cluster.

Setting up Bazel Solr Package

Bazel's install page documents well how to set up Bazel locally. If we go through the Solr documentation, there are a bunch of variables that the bin/solr wrapper script looks for, especially when we want to customize our Solr settings. On a local test setup we don't care about such customization, but in a live environment we definitely need to tweak things like JAVA_HOME, the SOLR_HOME directory where Solr stores its data, or even the ZK host list.

My Bazel package is going to be pretty straightforward. It has a shell binary, which is basically a wrapper script with two Bazel data dependencies: 1) the Solr source archive and 2) the Solr config files. This wrapper binary makes sure that all the necessary runtime variables are set and that SOLR_HOME contains all the necessary config files, including the various configs for the collections. The wrapper uses the Solr source embedded within Bazel's runfiles folder (where Bazel keeps all the data dependencies for a specific build rule).
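To make this concrete, the core of such a wrapper might look roughly like the sketch below. Everything here is illustrative: the paths, the port, and the ZK address are assumptions based on the runfiles layout shown later, not the actual script.

```
#!/bin/bash
# Hypothetical start_solr.sh sketch: resolve the runfiles dir,
# prepare SOLR_HOME, and start Solr in cloud mode from the embedded source.
RUNFILES="$0.runfiles"
SOLR_DIR="$RUNFILES/solr_ver630"            # embedded Solr source (data dep)
export SOLR_HOME="${SOLR_HOME:-/tmp/solr-home}"
mkdir -p "$SOLR_HOME"
cp -r "$RUNFILES/__main__/configs/solr/." "$SOLR_HOME/"
exec "$SOLR_DIR/bin/solr" start -c -f \
    -p "${SOLR_PORT:-9301}" -z "${ZK_HOST:-127.0.0.1:2181/solr}"
```

The -c, -f, -p and -z flags are standard bin/solr options (cloud mode, foreground, port, ZooKeeper host).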

We need to teach Bazel where to fetch the Solr source. So let's create a workspace and tell Bazel where to look for it. Add the lines below to the WORKSPACE file:

# Custom Solr version to use
new_http_archive(
      name = 'solr_ver630',
      url = 'http://apache.mirror.anlx.net/lucene/solr/6.3.0/solr-6.3.0.tgz',
      sha256 = '07692257575fe54ddb8a8f64e96d3d352f2f533aa91b5752be1869d2acf2f544',
      build_file = 'tools/BUILD.extract',
      strip_prefix = 'solr-6.3.0',
)

My BUILD.extract is pretty simple; it just exposes all the files inside the extracted tarball:

package(default_visibility = ["//visibility:public"])

# Exposes all files in a package as `@workspace_name//:files`
filegroup(
  name = 'files',
  srcs = glob(['**/*']),
)

Let's create a folder for our various Solr config files (solr.xml, solrconfig.xml, etc.). Copy the necessary config files and expose them via a BUILD file. We can either use glob to expose everything blindly, or list only the specific files we want to expose; if we list specific file names, Bazel will expose only those files. In my case, I'm going to use glob, similar to what I used in BUILD.extract.
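Such a configs/solr/BUILD file might look like this, mirroring BUILD.extract (the target name files matches what the sh_binary rule depends on):

```
package(default_visibility = ["//visibility:public"])

# Expose every config file under configs/solr as //configs/solr:files
filegroup(
  name = 'files',
  srcs = glob(['**/*']),
)
```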

Now let's add our shell binary rule. I'm going to keep the source file that the rule uses inside a separate directory instead of polluting the workspace root. The sh_binary build rule is:

package(default_visibility = ["//visibility:public"])

sh_binary(
  name = 'start_solr',
  srcs = ['start_solr.sh'],      # This is the src of the wrapper script
  data = [
      '@solr_ver630//:files',    # -> Adding solr source as a data dep
      '//configs/solr:files',    # -> Adding our configs folder as a data dep
  ],
)

My final directory structure looks like this:

├── configs
│   └── solr
│       ├── BUILD
│       ├── collections
│       │   └── vader          #  Follow @depresseddarth :)
│       │       ├── protwords.txt
│       │       ├── schema.xml
│       │       ├── solrconfig.xml
│       │       └── stopwords.txt
│       ├── log4j.properties
│       ├── solr.in.sh
│       └── solr.xml
├── services
│   └── solr
│       ├── BUILD
│       └── start_solr.sh
├── tools
│   └── BUILD.extract
└── WORKSPACE

We have all the rules ready, so let's kick off the Bazel build process.

bazel build //services/solr:start_solr
INFO: Found 1 target...
Target //services/solr:start_solr up-to-date:
  bazel-bin/services/solr/start_solr
INFO: Elapsed time: 17.465s, Critical Path: 0.82s

Bazel has successfully completed the build, and the final package files are available inside bazel-bin/ within the workspace. Below is the final layout of the bazel-bin directory.

bazel-bin
 └── services
    └── solr
        ├── start_solr
        └── start_solr.runfiles   # bazel runfiles folder where all data deps are present
            ├── __main__
            │   ├── configs       # `config` folder which was added as a data dep
            │   │   └── solr
            │   │       └── collections
            │   │           └── vader
            │   ├── external
            │   ├── external
            │   │   └── solr_ver630   # embedded solr source binary. The wrapper script will be using the solr binary from this directory
            │   └── services
            │       └── solr
            |           ├── start_solr
            |           └── start_solr.sh
            └── solr_ver630

Spin up Solr Cloud Service

First, let’s spin up our ZK node

$ bin/start-zk-server

Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
    2017-02-16 17:34:25,126 [myid:] - INFO  [main:QuorumPeerConfig@103] - Reading configuration from: /usr/local/zookeeper/bin/../conf/zoo.cfg
    2017-02-16 17:34:25,130 [myid:] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 20
    2017-02-16 17:34:25,130 [myid:] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 1
    2017-02-16 17:34:25,131 [myid:] - WARN  [main:QuorumPeerMain@113] - Either no config or no quorum defined in config, running  in standalone mode
    2017-02-16 17:34:25,132 [myid:] - INFO  [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
    2017-02-16 17:34:25,146 [myid:] - INFO  [main:QuorumPeerConfig@103] - Reading configuration from: /usr/local/zookeeper/bin/../conf/zoo.cfg
    2017-02-16 17:34:25,146 [myid:] - INFO  [main:ZooKeeperServerMain@95] - Starting server
    2017-02-16 17:34:25,151 [myid:] - INFO  [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
    2017-02-16 17:34:25,155 [myid:] - INFO  [main:Environment@100] - Server environment:zookeeper.version=3.4.6--1, built on 05/31/2016 17:14 GMT
    2017-02-16 17:34:25,155 [myid:] - INFO  [main:Environment@100] - Server environment:host.name=localhost
    2017-02-16 17:34:25,155 [myid:] - INFO  [main:Environment@100] - Server environment:java.version=1.8.0_112
    2017-02-16 17:34:25,155 [myid:] - INFO  [main:Environment@100] - Server environment:java.vendor=Oracle Corporation
    2017-02-16 17:34:25,155 [myid:] - INFO  [main:Environment@100] - Server environment:java.home=/usr/lib/jvm/jdk-8-oracle-x64/jre
    2017-02-16 17:34:25,155 [myid:] - INFO  [main:Environment@100] - Server environment:java.class.path=/usr/local/zookeeper/bin/../build/classes:/usr/local/zookeeper/bin/../build/lib/*.jar:/usr/local/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/local/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/local/zookeeper/bin/../lib/netty-3.7.0.Final.jar:/usr/local/zookeeper/bin/../lib/log4j-1.2.16.jar:/usr/local/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/local/zookeeper/bin/../zookeeper-3.4.6.jar:/usr/local/zookeeper/bin/../src/java/lib/*.jar:/usr/local/zookeeper/bin/../conf:
    2017-02-16 17:34:25,155 [myid:] - INFO  [main:Environment@100] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
    2017-02-16 17:34:25,156 [myid:] - INFO  [main:Environment@100] - Server environment:java.io.tmpdir=/tmp
    2017-02-16 17:34:25,157 [myid:] - INFO  [main:Environment@100] - Server environment:java.compiler=<NA>
    2017-02-16 17:34:25,157 [myid:] - INFO  [main:Environment@100] - Server environment:os.name=Linux
    2017-02-16 17:34:25,157 [myid:] - INFO  [main:Environment@100] - Server environment:os.arch=amd64
    2017-02-16 17:34:25,157 [myid:] - INFO  [main:Environment@100] - Server environment:os.version=3.16.0-33-generic
    2017-02-16 17:34:25,157 [myid:] - INFO  [main:Environment@100] - Server environment:user.name=root
    2017-02-16 17:34:25,158 [myid:] - INFO  [main:Environment@100] - Server environment:user.home=/root
    2017-02-16 17:34:25,158 [myid:] - INFO  [main:Environment@100] - Server environment:user.dir=/home/sentinelleader
    2017-02-16 17:34:25,159 [myid:] - INFO  [main:ZooKeeperServer@823] - tickTime set to 2000
    2017-02-16 17:34:25,159 [myid:] - INFO  [main:ZooKeeperServer@832] - minSessionTimeout set to -1
    2017-02-16 17:34:25,159 [myid:] - INFO  [main:ZooKeeperServer@841] - maxSessionTimeout set to -1
    2017-02-16 17:34:25,170 [myid:] - INFO  [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:2181
    2017-02-16 17:34:26,144 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /127.0.0.1:26110
    2017-02-16 17:34:26,205 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@929] - Client attempting to renew session 0x15a4747c2ba0037 at /127.0.0.1:26110
    2017-02-16 17:34:26,208 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@674] - Established session 0x15a4747c2ba0037 with negotiated timeout 15000 for client /127.

Once ZK is up, let's try using our newly built wrapper binary to start the Solr service. But first we need to create a ZK node; this node will be used in the ZK_HOST parameter.

$ solr-6.3.0/server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd makepath /solr

# let's verify that the node was created successfully

$ solr-6.3.0/server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd get /solr/clusterstate.json
{}             # the output should be an empty JSON object

Let’s start our solr node

$ bazel-bin/services/solr/start_solr

Rotating solr logs, keeping a max of 9 generations
2017-02-16 19:41:39.222 INFO  (main) [   ] o.e.j.s.Server jetty-9.3.8.v20160314
2017-02-16 19:41:39.540 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter  ___      _       Welcome to Apache Solr™ version 6.3.0
2017-02-16 19:41:39.544 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter / __| ___| |_ _   Starting in cloud mode on port 9301
2017-02-16 19:41:39.544 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter \__ \/ _ \ | '_|  Install dir: /home/sll/.cache/bazel/_bazel_sll/8c5b18e68ee8d852703298c6bc6863a4/external/solr_ver630
2017-02-16 19:41:39.616 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter |___/\___/_|_|    Start time: 2017-02-16T19:41:39.546Z
2017-02-16 19:41:39.638 INFO  (main) [   ] o.a.s.c.SolrResourceLoader Using system property solr.solr.home: /tmp/solr-home
2017-02-16 19:41:39.698 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter Loading solr.xml from SolrHome (not found in ZooKeeper)
2017-02-16 19:41:39.700 INFO  (main) [   ] o.a.s.c.SolrXmlConfig Loading container configuration from /tmp/solr-home/solr.xml
2017-02-16 19:41:40.040 INFO  (main) [   ] o.a.s.u.UpdateShardHandler Creating UpdateShardHandler HTTP client with params: socketTimeout=600000&connTimeout=60000&retry=true
2017-02-16 19:41:40.045 INFO  (main) [   ] o.a.s.c.ZkContainer Zookeeper client=127.0.0.1:2181/solr
2017-02-16 19:41:40.139 INFO  (main) [   ] o.a.s.c.OverseerElectionContext I am going to be the leader localhost:9301_solr
2017-02-16 19:41:40.143 INFO  (main) [   ] o.a.s.c.Overseer Overseer (id=97469495097163793-localhost:9301_solr-n_0000000001) starting
2017-02-16 19:41:40.238 INFO  (main) [   ] o.a.s.c.ZkController Register node as live in ZooKeeper:/live_nodes/localhost:9301_solr
2017-02-16 19:41:40.243 INFO  (zkCallback-5-thread-1-processing-n:localhost:9301_solr) [   ] o.a.s.c.c.ZkStateReader Updated live nodes from ZooKeeper... (0) -> (1)
2017-02-16 19:41:40.321 INFO  (main) [   ] o.a.s.c.CorePropertiesLocator Found 0 core definitions underneath /tmp/solr-home
2017-02-16 19:41:40.380 INFO  (main) [   ] o.e.j.s.ServerConnector Started ServerConnector@5386659f{HTTP/1.1,[http/1.1]}{0.0.0.0:9301}
2017-02-16 19:41:40.380 INFO  (main) [   ] o.e.j.s.Server Started @1749m

Woohoo, the Solr node has started successfully. Let's manually check the status of the running service.

$ bin/solr status

Found 1 Solr nodes:

Solr process 15066 running on port 9301
{
  "solr_home":"/tmp/solr-home",
  "version":"6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:52:42",
  "startTime":"2017-02-16T19:42:36.830Z",
  "uptime":"0 days, 0 hours, 0 minutes, 36 seconds",
  "memory":"17.4 MB (%3.6) of 122.7 MB",
  "cloud":{
    "ZooKeeper":"127.0.0.1:2181/solr",
    "liveNodes":"1",
    "collections":"0"}}

Woohoo, the service is up and running. Now let's create our vader collection.

$ solr-6.3.0/bin/solr create -c vader -d /tmp/solr-home/configsets/vader/conf/ -shards 1 -replicationFactor 1

Connecting to ZooKeeper at 127.0.0.1:2181/solr ...
Uploading /tmp/solr-home/configsets/vader/conf for config vader to ZooKeeper at 127.0.0.1:2181/solr

Creating new collection 'vader' using command:
http://localhost:9301/solr/admin/collections?action=CREATE&name=vader&numShards=1&replicationFactor=1&maxShardsPerNode=1&collection.configName=vader

{
  "responseHeader":{
    "status":0,
    "QTime":3334},
  "success":{"localhost:9301_solr":{
      "responseHeader":{
        "status":0,
        "QTime":2039},
      "core":"vader_shard1_replica1"}}}

Let's also verify that our configs were uploaded to ZooKeeper:

$ /usr/local/zookeeper/bin/zkCli.sh

Connecting to localhost:2181

WatchedEvent state:SyncConnected type:None path:null

[zk: localhost:2181(CONNECTED) 0] ls /solr/
configs             overseer            aliases.json        live_nodes          collections
overseer_elect      security.json       clusterstate.json

[zk: localhost:2181(CONNECTED) 0] ls /solr/configs/vader/

schema.xml       protwords.txt    solrconfig.xml   synonyms.txt     stopwords.txt

We can also verify our collection's state from ZooKeeper:

[zk: localhost:2181(CONNECTED) 0] get /solr/collections/vader/state.json
{"vader":{
    "replicationFactor":"1",
    "shards":{"shard1":{
        "range":"80000000-7fffffff",
        "state":"active",
        "replicas":{"core_node1":{
            "core":"vader_shard1_replica1",
            "base_url":"http://localhost:9301/solr",
            "node_name":"localhost:9301_solr",
            "state":"active",
            "leader":"true"}}}},
    "router":{"name":"compositeId"},
    "maxShardsPerNode":"1",
    "autoAddReplicas":"false"}}

Woohoo, our Solr Cloud service is running with the newly created collection, and we now have a fully hermetic package with all its dependencies embedded within it. Bazel has a pretty good caching mechanism, so it will neither rebuild the package every time nor re-download the external dependencies when we run the same build command again. We can also use Bazel to bundle our packages: currently Bazel can create tar/deb packages and even Docker images. Bazel has a lot of interesting features which I'll explain in detail in my upcoming posts 🙂


Scaling HA-Proxy with Docker, ConfD, Serf, ETCD

Containers and microservices are among the hottest topics in the IT world right now. Container technologies like LXC, Docker, and rkt are used heavily in production. It's easy to apply these technologies to stateless services: even if a container crashes, a new one can be launched within seconds and start serving traffic, so containers are widely used for serving web traffic. Tools like Marathon provide HA-Proxy-based service discovery, reloading the HA-Proxy service as containers start and stop in the Mesos cluster. A simple, similar method is to use etcd with Docker and HA-Proxy.

We can apply the same logic here: when a container is launched, we save the host IP and the port mapped from the host machine, update the HA-Proxy config file dynamically using ConfD, and reload the HA-Proxy service. Now imagine a container is terminated for some reason: the HA-Proxy host check will fail, but the entry will still be in haproxy.cfg. Since we will be launching new containers either manually or with frameworks like Marathon or Helios, there is no need to keep the old entries in haproxy.cfg or in etcd.

But we need a way to keep track of the containers that get terminated, so that we can remove those entries. Enter Serf. Serf is a decentralized solution for cluster membership, failure detection, and orchestration. Serf relies on an efficient and lightweight gossip protocol to communicate with nodes: the Serf agents periodically exchange messages with each other in much the same way a zombie apocalypse would occur, starting with one zombie but soon infecting everyone. Serf is able to quickly detect failed members and notify the rest of the cluster; this failure detection is built into the heart of the gossip protocol used by Serf.

We will be running a Serf agent in each container. These Serf clusters are isolated, i.e. we control the joining members, thereby maintaining separate Serf clusters. So if a container goes down, the other agents receive a member-failed event, and we can define an event handler that removes the corresponding key from etcd, which in turn triggers ConfD. ConfD then removes the server entry from the HA-Proxy config file.

Now let's see this in action ;) I'm using Ubuntu 14.04 as the base host.

Setting up HA-Proxy and Docker Web Containers

Let's install HA-Proxy on the base host:

$ apt-get install haproxy

Enable the HA-Proxy service in /etc/default/haproxy and start it:

$ /etc/init.d/haproxy start

Now install Docker:

$ curl -sSL https://get.docker.com/ | sh

Once Docker is installed, let's pull an Ubuntu image from the Docker registry:

$ docker pull ubuntu:14.04

Setting up ETCD

Download the latest version of etcd:

$ curl -L  https://github.com/coreos/etcd/releases/download/v2.1.3/etcd-v2.1.3-linux-amd64.tar.gz -o etcd-v2.1.3-linux-amd64.tar.gz

$ tar xvzf etcd-v2.1.3-linux-amd64.tar.gz

Let's start the etcd service:

$ ./etcd -name default --listen-peer-urls "http://172.17.42.1:2380,http://172.17.42.1:7001" --listen-client-urls "http://172.17.42.1:2379,http://172.17.42.1:4001" --advertise-client-urls "http://172.17.42.1:2379,http://172.17.42.1:4001" --initial-advertise-peer-urls "http://172.17.42.1:2380,http://172.17.42.1:7001" --initial-cluster 'default=http://172.17.42.1:2380,default=http://172.17.42.1:7001'

where 172.17.42.1 is my host machine's IP.

Setting up ConfD

Let's set up the ConfD service. For ConfD we basically need two files: 1) a toml file, which contains info such as which keys to look up in etcd, which etcd cluster to query, the path to the template file, etc., and 2) a tmpl (template) file.

Download the latest version of confd

$ wget https://github.com/kelseyhightower/confd/releases/download/v0.10.0/confd-0.10.0-linux-amd64

$ cp -rvf confd-0.10.0-linux-amd64 /usr/local/bin/

$ mkdir -p /etc/confd/{conf.d,templates}

Now let's create the toml and tmpl files.

toml: /etc/confd/conf.d/myconfig.toml

[template]
src = "haproxy.cfg.tmpl"
dest = "/etc/haproxy/haproxy.cfg"
keys = [
  "/proxy/frontend",
]
reload_cmd = "/etc/init.d/haproxy reload"

tmpl: /etc/confd/templates/haproxy.cfg.tmpl

global
    log /dev/log    local0
    log /dev/log    local1 notice
    chroot /var/lib/haproxy
    user haproxy
    group haproxy
    daemon

defaults
    log    global
    mode    http
    option    httplog
    option    dontlognull
        contimeout 5000
        clitimeout 50000
        srvtimeout 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

frontend localnodes
    bind *:80
    mode http
    default_backend nodes

backend nodes
    mode http
    balance roundrobin
    option forwardfor
{{range gets "/proxy/frontend/*"}}
    server {{base .Key}} {{.Value}} check
{{end}}

Start the service in foreground by running the below command,

$ confd -backend etcd -node 172.17.42.1:4001 --log-level="debug" -interval=10  # confd will perform a lookup every 10 sec

For now there is no key present in etcd. ConfD compares the md5sum of the current file with that of the expected state of the file, and if they differ it updates the file. Below are the debug logs from the ConfD service:

2015-09-07T07:09:48Z vagrant-ubuntu-trusty-64 ./confd[14378]: INFO Backend set to etcd
2015-09-07T07:09:48Z vagrant-ubuntu-trusty-64 ./confd[14378]: INFO Starting confd
2015-09-07T07:09:48Z vagrant-ubuntu-trusty-64 ./confd[14378]: INFO Backend nodes set to 172.17.42.1:4001
2015-09-07T07:09:48Z vagrant-ubuntu-trusty-64 ./confd[14378]: DEBUG Loading template resources from confdir /etc/confd
2015-09-07T07:09:48Z vagrant-ubuntu-trusty-64 ./confd[14378]: DEBUG Loading template resource from /etc/confd/conf.d/myconfig.toml
2015-09-07T07:09:48Z vagrant-ubuntu-trusty-64 ./confd[14378]: DEBUG Retrieving keys from store
2015-09-07T07:09:48Z vagrant-ubuntu-trusty-64 ./confd[14378]: DEBUG Key prefix set to /
2015-09-07T07:09:48Z vagrant-ubuntu-trusty-64 ./confd[14378]: DEBUG Using source template /etc/confd/templates/haproxy.cfg.tmpl
2015-09-07T07:09:48Z vagrant-ubuntu-trusty-64 ./confd[14378]: DEBUG Compiling source template /etc/confd/templates/haproxy.cfg.tmpl
2015-09-07T07:09:48Z vagrant-ubuntu-trusty-64 ./confd[14378]: DEBUG Comparing candidate config to /etc/haproxy/haproxy.cfg
2015-09-07T07:09:48Z vagrant-ubuntu-trusty-64 ./confd[14378]: INFO /etc/haproxy/haproxy.cfg has md5sum aa00603556cb147c532d5d97f90aaa17 should be fadab72f5cef00c13855a27893d6e39c
2015-09-07T07:09:48Z vagrant-ubuntu-trusty-64 ./confd[14378]: INFO Target config /etc/haproxy/haproxy.cfg out of sync
2015-09-07T07:09:48Z vagrant-ubuntu-trusty-64 ./confd[14378]: DEBUG Overwriting target config /etc/haproxy/haproxy.cfg
2015-09-07T07:09:48Z vagrant-ubuntu-trusty-64 ./confd[14378]: DEBUG Running /etc/init.d/haproxy reload
2015-09-07T07:09:48Z vagrant-ubuntu-trusty-64 ./confd[14378]: DEBUG " * Reloading haproxy haproxy\n   ...done.\n"
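The sync check in those logs boils down to comparing the checksum of the rendered candidate config against the file currently on disk. A minimal sketch of the idea:

```python
import hashlib

def out_of_sync(candidate_text, current_text):
    # confd-style check: compare the md5 of the rendered candidate config
    # against the md5 of the target file's current contents
    md5 = lambda s: hashlib.md5(s.encode("utf-8")).hexdigest()
    return md5(candidate_text) != md5(current_text)
```

Only when `out_of_sync` is true does confd overwrite the target file and run the reload command.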

Configure the Docker Web Containers

Since our containers will eventually run on different hosts, HAProxy needs to know the host IP on which each container resides. If the host had a public IP we could simply look it up, but in my current scenario I'm running in a Vagrant VM, so I'll pass the host IP to the container via a Docker ENV variable at startup. If we do strict port mapping like 80:80 or 443:443, we can't run more than one web container on a single host, so we go with random ports instead, and HAProxy needs to know those ports. This port info is stored in the ETCD key store, so ConfD can make use of it.

Let's start our first container:

  $ docker run --rm=true -it -p 9000:80 -p 8010:7373 -p 8011:5000 -h=web1 -e "FD_IP=192.168.33.102" -e "FD_PORT=9000" -e "SERF_RPC_PORT=8010" -e "SERF_PORT=8011" ubuntu:apache

Here, host port 9000 is mapped to the container's port 80, ports 8010 and 8011 are for the Serf clients, and FD_IP is the host IP.

Now, let's set up the services inside the container:

  $ apt-get update && apt-get install apache2

  $ wget https://dl.bintray.com/mitchellh/serf/0.6.4_linux_amd64.zip

  $ unzip 0.6.4_linux_amd64.zip

Let's run Serf in the foreground in a screen/tmux session:

$ ./serf agent -node=web1 -bind=0.0.0.0:5000 -rpc-addr=0.0.0.0:7373 -log-level=debug

If the Serf agent starts fine, let's go ahead and create an event handler for the member-failed event, which fires when a container crashes. Below is a simple event handler script:

#serf_handler.py
#!/usr/bin/python
import sys
import os
import requests
import json

base_url = "http://172.17.42.1:4001"   # Use any config mgmt tool or user-data script to populate this value for each host
base_key = "/v2/keys"
cluster_key = ["/proxy/frontend/", "/proxy/serf/", "/proxy/serf-rpc/"]    # EtcD keys that has to be removed

for line in sys.stdin:
    event_type = os.environ['SERF_EVENT']
    if event_type == 'member-failed':
        address = line.split('\t')[0]    # node name, which is what the keys are named after
        for key in cluster_key:
            full_url = base_url + base_key + key + address
            print full_url
            r = requests.get(full_url)
            if r.status_code == 200:
                r = requests.delete(full_url)
                print "Key Successfully removed from ETCD"
            elif r.status_code == 404:
                print "Key already removed by another Serf Agent"
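Serf hands the affected members to the handler on stdin, one line per member, tab-separated with the node name first and the address second. Since our ETCD keys are named after the container hostname, the handler takes field `[0]`. A quick illustration with a hypothetical payload line:

```python
# A member-failed line roughly as Serf would write it to the handler's
# stdin: "<name>\t<address>\t<role>\t<tags>" (sample values are made up)
line = "web2\t172.17.0.35\t\t"
name = line.split('\t')[0]      # node name -> suffix of the etcd keys
addr = line.split('\t')[1]      # node address
print(name, addr)
```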

We also need a bootstrapping script that adds the necessary keys to ETCD when the container starts for the first time. This script has to be invoked every time the container starts, so ConfD is aware of newly started containers. Below is a simple shell script for the same:

#etcd_bootstrap.sh
#!/bin/bash
HOST=$HOSTNAME
ETCD_VAL="$FD_IP:$FD_PORT"
ETCD_SERF_PORT="$FD_IP:$SERF_PORT"
ETCD_SERF_RPC_PORT="$FD_IP:$SERF_RPC_PORT"
ETCD_BASE="http://172.17.42.1:4001/v2/keys/proxy"

notify() {  # post the result of each key operation to Slack
  curl -s -X POST --data-urlencode 'payload={"channel": "#docker", "username": "etcdbot", "text": "`Host: '"$HOST"'`, '"$1"'", "icon_emoji": ":ghost:"}' https://hooks.slack.com/services/xxxxxxx/yyyyyyy/xyxyxyxyxyxyxyxyxyxyxyx
}

if curl -s -X PUT "$ETCD_BASE/frontend/$HOST" -d value=$ETCD_VAL > /dev/null; then
  notify "FD key has been added successfully to ETCD cluster"
else
  notify "Failed to add FD key to the ETCD cluster"
fi

if curl -s -X PUT "$ETCD_BASE/serf/$HOST" -d value=$ETCD_SERF_PORT > /dev/null; then
  notify "SERF PORT key has been added successfully to ETCD cluster"
else
  notify "Failed to add SERF PORT key to the ETCD cluster"
fi

if curl -s -X PUT "$ETCD_BASE/serf-rpc/$HOST" -d value=$ETCD_SERF_RPC_PORT > /dev/null; then
  notify "SERF RPC PORT key has been added successfully to ETCD cluster"
else
  notify "Failed to add SERF RPC PORT key to the ETCD cluster"
fi

Now let's add the necessary keys to ETCD:

$ bash /opt/scripts/etcd_bootstrap.sh

Once the keys are added, let's start the Serf agent with the event handler script:

$ ./serf agent -node=web1 -bind=0.0.0.0:5000 -rpc-addr=0.0.0.0:7373 -log-level=debug -event-handler=/opt/scripts/serf_handler.py

# Script output
==> Starting Serf agent...
==> Starting Serf agent RPC...
==> Serf agent running!
         Node name: 'web1'
         Bind addr: '0.0.0.0:5000'
          RPC addr: '0.0.0.0:7373'
         Encrypted: false
          Snapshot: false
           Profile: lan

==> Log data will now stream in as it occurs:

  2015/09/07 07:37:12 [INFO] agent: Serf agent starting
  2015/09/07 07:37:12 [INFO] serf: EventMemberJoin: web1 172.17.0.20
  2015/09/07 07:37:13 [INFO] agent: Received event: member-join

Now we have a running web container, and since we have added the keys to ETCD, let's verify that ConfD has detected the changes. Below are the logs from ConfD:

2015-09-07T07:40:17Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG Retrieving keys from store
2015-09-07T07:40:17Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG Key prefix set to /
2015-09-07T07:40:17Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG Using source template /etc/confd/templates/haproxy.cfg.tmpl
2015-09-07T07:40:17Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG Compiling source template /etc/confd/templates/haproxy.cfg.tmpl
2015-09-07T07:40:17Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG Comparing candidate config to /etc/haproxy/haproxy.cfg
2015-09-07T07:40:17Z vagrant-ubuntu-trusty-64 ./confd[14970]: INFO /etc/haproxy/haproxy.cfg has md5sum fe4fbaa3f5782c7738a365010a5f6d48 should be fadab72f5cef00c13855a27893d6e39c
2015-09-07T07:40:17Z vagrant-ubuntu-trusty-64 ./confd[14970]: INFO Target config /etc/haproxy/haproxy.cfg out of sync
2015-09-07T07:40:17Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG Overwriting target config /etc/haproxy/haproxy.cfg
2015-09-07T07:40:17Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG Running /etc/init.d/haproxy reload
2015-09-07T07:40:17Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG " * Reloading haproxy haproxy\n   ...done.\n"
2015-09-07T07:40:17Z vagrant-ubuntu-trusty-64 ./confd[14970]: INFO Target config /etc/haproxy/haproxy.cfg has been updated

Also, let's confirm that the haproxy.cfg file has been updated:

# Updated part from the file
backend nodes
    mode http
    balance roundrobin
    option forwardfor

    server web1 192.168.33.102:9000 check    # ConfD has updated the server entry based on the key present in ETCD

Now let's add the second web container using the same steps we followed for the first one; make sure the ports are updated in the run command (for the second container, I'm using 9001 as the web port and 8020/8021 as the Serf ports). Once the bootstrapping script has executed successfully, let's verify that the haproxy.cfg file is updated with the second entry:

backend nodes
    mode http
    balance roundrobin
    option forwardfor

    server web1 192.168.33.102:9000 check

    server web2 192.168.33.102:9001 check

For now, the Serf agents on both containers are running as standalone services and are not aware of each other. Let's execute the serf join command to form the cluster. Below is a simple Python script for the same.

# serf_starter.py
#!/usr/bin/python
import os
import etcd           # python-etcd client
import random
import subprocess

node_keys = []
client = etcd.Client(host='172.17.42.1', port=4001)

r = client.read('/proxy/serf-rpc', recursive = True)
for child in r.children:
    node_keys.append(child.key)
node_key = random.choice(node_keys)     # Pick one existing node at random
remote_serf = client.read(node_key)
print remote_serf.value
local_serf = os.environ['FD_IP'] + ":" + os.environ['SERF_PORT']
# Ask the remote agent (via its RPC address) to join our local Serf address
print subprocess.call(["/serf", "join", "-rpc-addr=%s" % remote_serf.value, local_serf])

Run the join script on the second container and watch for events in the first container's Serf agent logs:

$ /opt/scripts/serf_starter.py

Below are the event logs on the first container,

2015/09/07 07:53:02 [INFO] agent.ipc: Accepted client: 172.17.42.1:37326         # Connection from Web2
2015/09/07 07:53:02 [INFO] agent: joining: [192.168.33.102:8021] replay: false
2015/09/07 07:53:02 [DEBUG] memberlist: Initiating push/pull sync with: 192.168.33.102:8021
2015/09/07 07:53:02 [INFO] serf: EventMemberJoin: web2 172.17.0.35
2015/09/07 07:53:02 [INFO] agent: joined: 1 nodes
2015/09/07 07:53:02 [DEBUG] serf: messageJoinType: web2

Now both agents are connected; let's verify the same:

$ ./serf members     # we can run this command on any one of the containers

web2  172.17.0.35:5000  alive
web1  172.17.0.20:5000  alive

Our Serf cluster is up and running. Now let's terminate one container and see if Serf detects the event. Once Serf detects it, it calls the event handler script, which should remove the keys from ETCD. In this case, I'm terminating the second container. We can see the event popping up in the first container's Serf agent logs:

2015/09/07 07:57:37 [INFO] memberlist: Suspect web2 has failed, no acks received
2015/09/07 07:57:39 [INFO] memberlist: Suspect web2 has failed, no acks received
2015/09/07 07:57:41 [INFO] memberlist: Suspect web2 has failed, no acks received
2015/09/07 07:57:42 [INFO] memberlist: Suspect web2 has failed, no acks received
2015/09/07 07:57:42 [INFO] memberlist: Marking web2 as failed, suspect timeout reached
2015/09/07 07:57:42 [INFO] serf: EventMemberFailed: web2 172.17.0.35
2015/09/07 07:57:42 [INFO] serf: attempting reconnect to web2 172.17.0.35:5000
2015/09/07 07:57:43 [INFO] agent: Received event: member-failed
2015/09/07 07:57:43 [DEBUG] agent: Event 'member-failed' script output: http://172.17.42.1:4001/v2/keys/proxy/frontend/web2
Key Successfully removed from ETCD
http://172.17.42.1:4001/v2/keys/proxy/serf/web2
Key Successfully removed from ETCD
http://172.17.42.1:4001/v2/keys/proxy/serf-rpc/web2
Key Successfully removed from ETCD

As per the event handler script output, the keys have been successfully removed from ETCD. Let's check the ConfD logs:

# Confd logs
2015-09-07T07:57:47Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG Retrieving keys from store
2015-09-07T07:57:47Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG Key prefix set to /
2015-09-07T07:57:47Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG Using source template /etc/confd/templates/haproxy.cfg.tmpl
2015-09-07T07:57:47Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG Compiling source template /etc/confd/templates/haproxy.cfg.tmpl
2015-09-07T07:57:47Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG Comparing candidate config to /etc/haproxy/haproxy.cfg
2015-09-07T07:57:47Z vagrant-ubuntu-trusty-64 ./confd[14970]: INFO /etc/haproxy/haproxy.cfg has md5sum fadab72f5cef00c13855a27893d6e39c should be df9f09eb110366f2ecfa43964a3c862d
2015-09-07T07:57:47Z vagrant-ubuntu-trusty-64 ./confd[14970]: INFO Target config /etc/haproxy/haproxy.cfg out of sync
2015-09-07T07:57:47Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG Overwriting target config /etc/haproxy/haproxy.cfg
2015-09-07T07:57:47Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG Running /etc/init.d/haproxy reload
2015-09-07T07:57:47Z vagrant-ubuntu-trusty-64 ./confd[14970]: DEBUG " * Reloading haproxy haproxy\n   ...done.\n"
2015-09-07T07:57:47Z vagrant-ubuntu-trusty-64 ./confd[14970]: INFO Target config /etc/haproxy/haproxy.cfg has been updated

# haproxy.cfg file

backend nodes
    mode http
    balance roundrobin
    option forwardfor

    server web1 192.168.33.102:9000 check

Bingo 🙂 ConfD has detected the key removal and updated the haproxy.cfg file. We could even drop the bootstrap script and use a Serf event handler to populate the keys when the member-join event is triggered.

With ConfD, Serf, and ETCD we can keep our service config files up to date without any human intervention. Whether we scale our Docker containers up or down, our system gets updated automatically :). Add an option to log the events to Slack, and that's it: we won't miss any events, and the whole team can keep track of what's happening under the hood.


Ping Your Ansible From Slack

We have been using Ansible for all our deployments, code updates, and inventory management. We have also built a higher-level API for Ansible called bootstrapper. Bootstrapper provides a REST API for executing various complex tasks via Ansible. There is also a simple Python CLI for bootstrapper, bootstrappercli, which talks to the bootstrapper servers in our prod/staging environments and performs the tasks. We have been using Slack for our team interactions, and it has become a permanent fixture among my Chrome tabs. We also have a custom Slack callback plugin that gives us a real-time callback for each step of a playbook execution.

Since bootstrapper provides a sweet API, we decided to make Slack talk to Ansible directly. After googling for some time, I came across a simple slackbot called limbo, written in Python (of course), and the best part is its plugin system: it's super easy to write custom plugins. As an initial step, I wrote a couple of plugins for performing code pushes, ad hoc command execution, and EC2 management. Below is some sample plugin code.

Code-Update Plugin
"""Only direct messages are accepted, format is @user codeupdate <env> <tag>"""

import requests
import json
import unicodedata

ALLOWED_CHANNELS = []
ALLOWED_USERS = ['xxxxx']
ALLOWED_ENV = ['xxx', 'xxx']
ALLOWED_TAG = ['xxx', 'xxx']

BOOTSTRAPPER__PROD_URL = "http://xxx.xxx.xxx.xxx:yyyy"
BOOTSTRAPPER_STAGING_URL = "http://xxx.xxx.xxx.xxx:yyyy"

def get_url(infra_env):
    if infra_env == "prod":
        return BOOTSTRAPPER__PROD_URL
    else:
        return BOOTSTRAPPER_STAGING_URL

def codeupdate(ans_env, ans_tag):
    base_url = get_url(ans_env)
    req_url = base_url + '/ansible/update-code/'
    resp = requests.post(req_url, data={'tags': ans_tag, 'env': ans_env})
    return [resp.status_code, resp.text]

def on_message(msg, server):
    init_msg =  unicodedata.normalize('NFKD', msg['text']).encode('ascii','ignore').split(' ')[0]
    cmd = unicodedata.normalize('NFKD', msg['text']).encode('ascii','ignore').split(' ')[1]
    if cmd != 'codeupdate':
        return ""
    ### slight hackish method to check if the message is a direct message or not
    elif init_msg == '<@xxxxxx>:':    # Message is actually a direct message to the Bot
        orig_msg =  unicodedata.normalize('NFKD', msg['text']).encode('ascii','ignore').split(' ')[2:]
        msg_user = msg['user']
        msg_channel = msg['channel']
        if msg_user in ALLOWED_USERS and msg_channel in ALLOWED_CHANNELS:
            env = orig_msg[0]
            if env not in ALLOWED_ENV:
                return "Please Pass a Valid Env (xxxx/yyyy)"
            tag = orig_msg[1]
            if tag not in ALLOWED_TAG:
                return "Please Pass a Valid Tag (xxxx/yyyy)"
            return codeupdate(env, tag)
        else:
            return ("!!! You are not Authorized !!!")
    else:
        return "Invalid Message Format, Format: <user> codeupdate <env> <tag>"

Adhoc Execution plugin
"""Only direct messages are accepted, format is <user> adhoc <env> <host> <mod> <args_if_any>"""

import requests
import json
from random import shuffle
import unicodedata

ALLOWED_CHANNELS = []
ALLOWED_USERS = ['xxxxx']
ALLOWED_ENV = ['xxx', 'xxx']
ALLOWED_TAG = ['xxx', 'xxx']

BOOTSTRAPPER__PROD_URL = "http://xxx.xxx.xxx.xxx:yyyy"
BOOTSTRAPPER_STAGING_URL = "http://xxx.xxx.xxx.xxx:yyyy"

def get_url(infra_env):
    if infra_env == "prod":
        return BOOTSTRAPPER__PROD_URL
    else:
        return BOOTSTRAPPER_STAGING_URL

def adhoc(ans_host, ans_mod, ans_arg, ans_env):
    base_url = get_url(ans_env)
    req_url = base_url + '/ansible/adhoc/'
    if ans_arg is None:
        resp = requests.get(req_url, params={'host': ans_host, 'mod': ans_mod})
    else:
        resp = requests.get(req_url, params={'host': ans_host, 'mod': ans_mod, 'args': ans_arg})
    return resp.text

def on_message(msg, server):
### slight hackish method to check if the message is a direct message or not
    init_msg =  unicodedata.normalize('NFKD', msg['text']).encode('ascii','ignore').split(' ')[0]
    cmd = unicodedata.normalize('NFKD', msg['text']).encode('ascii','ignore').split(' ')[1]
    if cmd != 'adhoc':
        return ""
    elif init_msg == '<@xxxxxxx>:':    # Message is actually a direct message to the Bot
        orig_msg =  unicodedata.normalize('NFKD', msg['text']).encode('ascii','ignore').split(' ')[2:]
        msg_user = msg['user']
        msg_channel = msg['channel']
        if msg_user in ALLOWED_USERS and msg_channel in ALLOWED_CHANNELS:
            env = orig_msg[0]
            if env not in ALLOWED_ENV:
                return "Only Dev Environment is Supported Currently"
            host = orig_msg[1]
            mod = orig_msg[2]
            arg = ' '.join(orig_msg[3:]) or None    # module args are optional
            return adhoc(host, mod, arg, env)
        else:
            return ("!!! You are not Authorized !!!")
    else:
        return "Invalid Message, Format: <user> adhoc <env> <host> <mod> <args_if_any>"

Configure Limbo as per the Readme present in the Repo and start the limbo binary. Now let’s try executing some Ansible operations from Slack.

Let’s try a simple ping module

Let’s try updating a staging cluster,

WooHoo, it worked 😉 This opens up a new dawn for us in managing our infrastructure from Slack. But of course ACL is an important factor, as we should restrict access to specific operations. Currently this ACL logic is written within the plugin itself: it checks which user is executing the command and from which channel; only if both match will the bot start talking to bootstrapper, otherwise it throws an error to the user. I'm pretty excited to play with this, and I'm looking forward to automating as many Ansible tasks as possible from Slack 🙂
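That per-plugin ACL check can be factored into a tiny helper. Here's a minimal sketch of the idea; the user and channel IDs are hypothetical placeholders:

```python
ALLOWED_USERS = ['U024BE7LH']      # hypothetical Slack user IDs
ALLOWED_CHANNELS = ['C024BE91L']   # hypothetical Slack channel IDs

def is_authorized(user, channel):
    # The bot talks to bootstrapper only when BOTH the requesting user
    # and the channel the command came from are whitelisted
    return user in ALLOWED_USERS and channel in ALLOWED_CHANNELS
```

Keeping this in one place would let every plugin share the same whitelist instead of duplicating the check.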


Building Self Discovering Clusters With ETCD and SERF

Nowadays it's very difficult to contain all services in a single box, and companies have started adopting microservice-based architectures. Docker users can achieve a highly scalable system like Amazon ASG using Apache Mesos and supporting frameworks like Marathon. But there are a lot of services where a new node needs to know the existing members of a cluster in order to join it successfully. The common method is to give the machines static IPs. But we have slowly started moving all our clusters to immutable infrastructure via ASG. We could go for Elastic IPs and use those as static addresses, but then what if we want to scale the cluster up? Also, if the instances are inside a private network, assigning static local IPs is difficult too. So when a new machine is launched, or ASG relaunches a faulty machine, that machine needs to know the existing cluster nodes, and not all applications provide service-discovery features. I was testing a new Erlang-based MQTT broker called Verne; for a new Verne node to join the cluster, we need to pass the IPs of all existing nodes to the JOIN command.

A quick solution that came to mind was using a distributed K-V store like etcd: keep a key per node so we can tell which nodes are present in a cluster. But I quickly hit another hurdle. Imagine a node belonging to the cluster goes down; its key will still be present in etcd. With something like ASG, a new machine is launched, and this new machine's IP will not be in etcd. We could delete the stale key manually and add the new machine's IP, but our ultimate goal is to automate all of this; otherwise our clusters can never become fully immutable and self-healing.

So we need a system that can at least keep track of our nodes' activity when they join/leave the cluster.

SERF

Enter SERF. Serf is a decentralized solution for cluster membership, failure detection, and orchestration. Serf relies on an efficient and lightweight gossip protocol to communicate with nodes. The Serf agents periodically exchange messages with each other in much the same way that a zombie apocalypse would occur: it starts with one zombie but soon infects everyone. Serf is able to quickly detect failed members and notify the rest of the cluster. This failure detection is built into the heart of the gossip protocol used by Serf.
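To make the "infection" analogy concrete, here is a toy simulation of gossip-style dissemination. This is not Serf's actual protocol (there is no failure detection or anti-entropy here), just the core idea: each round, every node that has the message pushes it to one random peer, so coverage grows roughly exponentially.

```python
import random

def gossip_rounds(n_nodes, seed=42):
    # Toy push-gossip: one node starts with the message; each round,
    # every informed node pushes it to one randomly chosen peer.
    random.seed(seed)
    informed = {0}
    rounds = 0
    while len(informed) < n_nodes:
        for node in list(informed):
            informed.add(random.randrange(n_nodes))
        rounds += 1
    return rounds

print(gossip_rounds(100))  # the message reaches all 100 nodes in a handful of rounds
```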

Serf also needs the IPs of the other Serf agents in the cluster. Initially, when we bootstrap the cluster, we know the IPs of the other Serf agents, and the agents then populate our cluster keys in ETCD. Now, if a new machine is launched or relaunched, etcd uses the discovery URL to connect to the ETCD cluster. Once it has joined the existing cluster, our Serf agent starts, talks to the etcd agent running locally, and creates the necessary key for the node. Once the key is created, the Serf agent joins the Serf cluster using one of the existing node IPs present in ETCD.

ETCD Discovery URL

A bit about the etcd discovery URL. It's a built-in discovery method in etcd for identifying the dynamic members of an etcd cluster. First we need an etcd server; this server won't be part of our cluster, it's used only for providing the discovery URL. We can use this server for multiple etcd clusters, provided each cluster has a unique URL. All the new nodes talk to their cluster's unique URL and join the rest of the etcd nodes.

# For starting the ETCD Discovery Server

$ ./etcd -name test --listen-peer-urls "http://172.16.16.99:2380,http://172.16.16.99:7001" --listen-client-urls "http://172.16.16.99:2379,http://172.16.16.99:4001" --advertise-client-urls "http://172.16.16.99:2379,http://172.16.16.99:4001" --initial-advertise-peer-urls "http://172.16.16.99:2380,http://172.16.16.99:7001" --initial-cluster 'test=http://172.16.16.99:2380,test=http://172.16.16.99:7001'

Now we need to create a unique Discovery URL for our clusters.

$ curl -X PUT 172.16.16.99:4001/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83/_config/size -d value=3

 where, 6c007a14875d53d9bf0ef5a6fc0257c817f0fb83 => a unique UUID
        value => size of the cluster; if the number of etcd nodes that try to join with this URL exceeds the value, the extra etcd processes fall back to being proxies by default.
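In other words, the discovery URL admits members first-come-first-served up to the configured size, and late arrivals become proxies. A sketch of that admission rule (the node names here are just examples):

```python
def assign_roles(nodes, size):
    # etcd discovery-URL admission rule (sketch): the first `size` nodes
    # to register become cluster members; later arrivals fall back to
    # being proxies.
    return {node: ("member" if i < size else "proxy")
            for i, node in enumerate(nodes)}

roles = assign_roles(["test1", "test2", "test3", "test4"], size=3)
```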

SERF Events

Serf comes with a built-in event handler system, including handlers for member events like member-join and member-leave. We are going to concentrate on these two events. Below is a simple handler script that adds/removes keys in ETCD whenever a node joins/leaves the cluster.

# handler.py
#!/usr/bin/python
import sys
import os
import requests
import json

base_url = "http://<ip-of-the-node>:4001"   # Use any config mgmt tool or user-data script to populate this value for each host
base_key = "/v2/keys"
cluster_key = "/cluster/<cluster-name>/"    # Use any config mgmt tool or user-data script to populate this value for each host

for line in sys.stdin:
    event_type = os.environ['SERF_EVENT']
    if event_type == 'member-join':
        address = line.split('\t')[1]
        full_url = base_url + base_key + cluster_key + address
        r = requests.get(full_url)
        if r.status_code == 404:
            r = requests.put(full_url, data="member")     # If the key doesn't exists, it will create a new key
            if r.status_code == 201:
                print "Key Successfully created in ETCD"
            else:
                print "Failed to create the Key in ETCD"
        else:
            print "Key Already exists in ETCD"
    if event_type == 'member-leave':
        address = line.split('\t')[1]
        full_url = base_url + base_key + cluster_key + address
        r = requests.get(full_url)
        if r.status_code == 200:             # The node left the cluster but its key still exists, remove it
            r = requests.delete(full_url)
            if r.status_code == 404:
                print "Key already removed by another Serf Agent"
            else:
                print "Key Successfully removed from ETCD"
        else:
            print "Key already removed from ETCD"

Design

Now that we have some idea of Serf and etcd, let's start building our self-discovering cluster. Make sure the etcd discovery node is up and a unique URL has been generated for it. Once the URL is ready, let's start our ETCD cluster.

# Node1

$ /etcd -name test1 --listen-peer-urls "http://172.16.16.102:2380,http://172.16.16.102:7001" --listen-client-urls "http://172.16.16.102:2379,http://172.16.16.102:4001" --advertise-client-urls "http://172.16.16.102:2379,http://172.16.16.102:4001" --initial-advertise-peer-urls "http://172.16.16.102:2380,http://172.16.16.102:7001" --discovery http://172.16.16.99:4001/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83/

# Sample Output

2015/06/01 19:51:10 etcd: listening for peers on http://172.16.16.102:2380
2015/06/01 19:51:10 etcd: listening for peers on http://172.16.16.102:7001
2015/06/01 19:51:10 etcd: listening for client requests on http://172.16.16.102:2379
2015/06/01 19:51:10 etcd: listening for client requests on http://172.16.16.102:4001
2015/06/01 19:51:10 etcdserver: datadir is valid for the 2.0.1 format
2015/06/01 19:51:10 discovery: found self a9e216fb7b8f3283 in the cluster
2015/06/01 19:51:10 discovery: found 1 peer(s), waiting for 2 more      # Since our cluster Size is 3, and this is the first node, its waiting for other 2 nodes to join

Let's start the other two nodes:

./etcd -name test2 --listen-peer-urls "http://172.16.16.101:2380,http://172.16.16.101:7001" --listen-client-urls "http://172.16.16.101:2379,http://172.16.16.101:4001" --advertise-client-urls "http://172.16.16.101:2379,http://172.16.16.101:4001" --initial-advertise-peer-urls "http://172.16.16.101:2380,http://172.16.16.101:7001" --discovery http://172.16.16.99:4001/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83/

./etcd -name test3 --listen-peer-urls "http://172.16.16.100:2380,http://172.16.16.100:7001" --listen-client-urls "http://172.16.16.100:2379,http://172.16.16.100:4001" --advertise-client-urls "http://172.16.16.100:2379,http://172.16.16.100:4001" --initial-advertise-peer-urls "http://172.16.16.100:2380,http://172.16.16.100:7001" --discovery http://172.16.16.99:4001/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83/

Now that the ETCD cluster is up and running, let's configure Serf.

$ ./serf agent -node=test1 -bind=172.16.16.102:5000 -rpc-addr=172.16.16.102:7373 -log-level=debug -event-handler=/root/handler.py

$ ./serf agent -node=test2 -bind=172.16.16.101:5000 -rpc-addr=172.16.16.101:7373 -log-level=debug -event-handler=/root/handler.py

$ ./serf agent -node=test3 -bind=172.16.16.100:5000 -rpc-addr=172.16.16.100:7373 -log-level=debug -event-handler=/root/handler.py

Now these three Serf agents are running in standalone mode and are not aware of each other. Let's make the agents join and form the Serf cluster.

$ ./serf join -rpc-addr=172.16.16.100:7373 172.16.16.101:5000

$ ./serf join -rpc-addr=172.16.16.102:7373 172.16.16.100:5000

Let’s query the serf agents and see if the cluster has been created.

$ root@sentinel:~# ./serf members -rpc-addr=172.16.16.100:7373
  test3  172.16.16.100:5000  alive
  test2  172.16.16.101:5000  alive
  test1  172.16.16.102:5000  alive

$ root@sentinel:~# ./serf members -rpc-addr=172.16.16.101:7373
  test3  172.16.16.100:5000  alive
  test2  172.16.16.101:5000  alive
  test1  172.16.16.102:5000  alive

$ root@sentinel:~# ./serf members -rpc-addr=172.16.16.102:7373
  test3  172.16.16.100:5000  alive
  test2  172.16.16.101:5000  alive
  test1  172.16.16.102:5000  alive

Now our serf nodes are up, let’s check our etcd cluster to see if serf has created the keys.

$ ./etcdctl -C 172.16.16.100:4001 ls /cluster/test/
  /cluster/test/172.16.16.100
  /cluster/test/172.16.16.101
  /cluster/test/172.16.16.102

Serf has successfully created the keys. Now let's terminate one of the nodes. One of our Serf agents immediately receives a member-leave event and starts the handler. Below is the log of the first agent that received the event:

2015/06/01 22:56:43 [INFO] serf: EventMemberLeave: test1 172.16.16.102
2015/06/01 22:56:44 [INFO] agent: Received event: member-leave
2015/06/01 22:56:45 [DEBUG] agent: Event 'member-leave' script output: member-leave => 172.16.16.102
Key Successfully removed from ETCD                                
2015/06/01 22:56:45 [DEBUG] memberlist: Initiating push/pull sync with: 172.16.16.100:5000

This event then gets propagated to the other agents:

2015/06/01 22:56:43 [INFO] serf: EventMemberLeave: test1 172.16.16.102
2015/06/01 22:56:44 [INFO] agent: Received event: member-leave
2015/06/01 22:56:45 [DEBUG] agent: Event 'member-leave' script output: member-leave => 172.16.16.102
Key already removed by another Serf Agent

Let’s make sure that the KEY was successfully removed from ETCD.

$ ./etcdctl -C 172.16.16.100:4001 ls /cluster/test/
  /cluster/test/172.16.16.100
  /cluster/test/172.16.16.101

Now let’s bring the old node back online,

$ ./serf agent -node=test1 -bind=172.16.16.102:5000 -rpc-addr=172.16.16.102:7373 -log-level=debug -event-handler=/root/handler.py
  ==> Starting Serf agent...
  ==> Starting Serf agent RPC...
  ==> Serf agent running!
           Node name: 'test1'
           Bind addr: '172.16.16.102:5000'
            RPC addr: '172.16.16.102:7373'
           Encrypted: false
            Snapshot: false
             Profile: lan

  ==> Log data will now stream in as it occurs:

      2015/06/01 23:01:55 [INFO] agent: Serf agent starting
      2015/06/01 23:01:55 [INFO] serf: EventMemberJoin: test1 172.16.16.102
      2015/06/01 23:01:56 [INFO] agent: Received event: member-join
      2015/06/01 23:01:56 [DEBUG] agent: Event 'member-join' script output: member-join => 172.16.16.102
  Key Successfully created in ETCD

The standalone agent has already created its key. Now let's make the Serf agent join the cluster. It can use any one of the IPs from the K-V store apart from its own:

$ ./serf join -rpc-addr=172.16.16.102:7373 172.16.16.100:5000
  Successfully joined cluster by contacting 1 nodes.

Once the node joins the cluster, the other existing Serf nodes receive the member-join event and also try to execute the handler. But since the key has already been created, they won't recreate it.

    2015/06/01 23:04:56 [INFO] agent: Received event: member-join
    2015/06/01 23:04:57 [DEBUG] agent: Event 'member-join' script output: member-join => 172.16.16.102
Key Already exists in ETCD

Let’s check our ETCD cluster to see if the keys are recreated.

$ ./etcdctl -C 172.16.16.100:4001 ls /cluster/test/
  /cluster/test/172.16.16.100
  /cluster/test/172.16.16.101
  /cluster/test/172.16.16.102

So now we have the IPs of our cluster’s active nodes in etcd. Next we need to extract these IPs from etcd and use them with our applications. Below is a simple Python script that returns the IPs as a Python list.

import requests
import json

base_url = "http://<local_ip_of_server>:4001"
base_key = "/v2/keys"
cluster_key = "/cluster/<cluster_name>/"
full_url = base_url + base_key + cluster_key

r = requests.get(full_url)
json_data = json.loads(r.text)
node_list = json_data['node']['nodes']               # all keys under the cluster directory
nodes = []
for i in node_list:
    nodes.append(i['key'].split(cluster_key)[1])     # strip the directory prefix, keep the IP
print(nodes)

We can use ETCD and confd together to generate the configs for our applications dynamically.
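As a rough sketch of how that might look (the file paths, app name, and reuse of the `/cluster/test` keyspace from above are my assumptions, not a tested setup), a confd template resource could be:

```toml
# /etc/confd/conf.d/cluster-nodes.toml  (hypothetical)
[template]
src  = "cluster-nodes.tmpl"
dest = "/etc/myapp/nodes.conf"
keys = ["/cluster/test"]
```

with a matching template that writes one node IP per line:

```
# /etc/confd/templates/cluster-nodes.tmpl  (hypothetical)
{{range gets "/cluster/test/*"}}{{base .Key}}
{{end}}
```

confd would then rewrite `/etc/myapp/nodes.conf` (and optionally reload the service) whenever keys under the cluster directory change.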

Conclusion

ETCD and SERF provide a powerful combination for adding a self-discovery feature to our clusters. But it still needs heavy QA testing before being used in production. In particular, we need to test the confd automation part more, since we are not using a single key-value pair: SERF creates a separate key for each node, and the IP discovery from ETCD is done by iterating over the key directory. But these tools are backed by an awesome community, so I’m sure they will soon help us take clustering our applications to a whole new level.


Building Auto Healing Clusters With AWS and Ansible

As we all know, this is the era of cloud servers. With the emergence of the cloud, we no longer need to worry about the difficulties of hosting servers on premises. But if you are a cloud engineer, you definitely know that anything can happen to your machines. Unlike on premises, where we can walk over and fix things ourselves, in the cloud that is difficult. I’ve faced a lot of such weird issues myself, including my cloud service provider terminating my servers, one of which was my Postgres DB master. So having a self-healing cluster helps a lot, especially when a server goes down in the middle of our sleep. Stateless services are the easiest candidates for self-healing compared to DBs, especially if we are using a Master-Slave DB architecture.

For those who are using Docker and Mesos, Marathon provides scaling features similar to Amazon ASG. We define the number of instances that have to be running, and Marathon makes sure that number always exists. Like Amazon ASG, it will launch a new container if any container accidentally terminates. I’ve personally tested this feature of Marathon a while back, and it’s really a promising one. There are indeed other automated container management systems, but Marathon is quite flexible for me.
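For illustration, a Marathon app definition pins the desired task count with the `instances` field (the app id, command, and resource numbers below are made-up values):

```json
{
  "id": "/myapp",
  "cmd": "python -m SimpleHTTPServer 8080",
  "instances": 3,
  "cpus": 0.25,
  "mem": 128
}
```

POSTing this to Marathon’s `/v2/apps` endpoint makes Marathon keep three tasks running, relaunching any that die.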

But at Clementine, in our current architecture, we are not yet using Docker in production, and we heavily use AWS for all our clusters. With more features like secure messaging, VOIP etc. added to our product, we are expanding tremendously, and so is our infrastructure. Being a DevOps engineer, I need to keep up the uptime. So this time I decided to prototype a self-healing cluster using Amazon ASG and Ansible.

Design

For the auto-healing I’m going to use Amazon ASG and Ansible. Since Ansible is an agentless application, we need to either use Ansible in stand-alone mode and provision the machine via a cloud-init script, or use ansible-pull. Or, as the company recommends, use Ansible Tower, which is a paid solution. But we have built our own higher-level API solution over Ansible called Bootstrapper. Bootstrapper exposes a higher-level REST API which we can invoke for all our Ansible management. Our in-house version of Bootstrapper can perform various actions like EC2 instance launch with/without EIP, ad-hoc command execution, server bootstrapping, code updates etc.

But again, if we use a plain AMI and try to bootstrap the server completely during startup, it adds a heavy delay, especially when PyPI times out while installing pip packages. So we decided to use a custom AMI which has our latest build in it. Jenkins takes care of this part. Our build flow is like this:

Dev pushes code to Master/Dev => Jenkins runs the build tests => if the build succeeds, Jenkins builds our master/dev packages => uploads the packages to our APT repo => Packer builds the latest image via packer-aws-chroot

While building the image, we add two custom scripts to our images: 1) set_eip.sh (manages the EIP for the instance via Bootstrapper), 2) ans_bootstrap.sh (manages server bootstrapping via Ansible).

# set_eip.sh
inst_id=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
role=$(grep role /etc/ansible/facts.d/clem.fact | cut -d '=' -f2)   # our custom ansible facts
env=$(grep env /etc/ansible/facts.d/clem.fact | cut -d '=' -f2)     # our custom ansible facts
if [ "$env" == "staging" ]; then
  bootstrapper_url="xxx.xxx.xxx.xxx:yyyy/ansible/set_eip/"
else
  bootstrapper_url="xxx.xxx.xxx.xxx:yyyy/ansible/set_eip/"     # our bootstrapper api for EIP management
fi
curl -X POST -s -d "instance_id=$inst_id&role=$role&env=$env" "$bootstrapper_url"

The above POST request makes Bootstrapper perform the EIP management, and Ansible will assign the proper EIP to the machine without any collision. We keep an EIP mapping for each cluster, which makes sure that we are not assigning a wrong EIP to a machine. If no EIP is available, we raise an exception and email/Slack the infra team about the instance and cluster.
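The mapping lookup on the Bootstrapper side could be sketched roughly like this (the map contents, role names, and addresses are all made up; the real service also talks to the EC2 API to attach the address):

```python
# Hypothetical sketch of the per-cluster EIP-mapping lookup inside Bootstrapper.
EIP_MAP = {
    ("voip", "staging"): ["52.0.0.10", "52.0.0.11"],   # made-up addresses
}

def pick_free_eip(role, env, in_use):
    """Return an EIP from the cluster's pool that is not attached yet."""
    pool = EIP_MAP.get((role, env), [])
    free = [ip for ip in pool if ip not in in_use]
    if not free:
        # the real service would email/slack the infra team here
        raise RuntimeError("no free EIP for cluster %s/%s" % (role, env))
    return free[0]

print(pick_free_eip("voip", "staging", {"52.0.0.10"}))  # -> 52.0.0.11
```

Keeping the pool per (role, env) pair is what prevents a staging node from grabbing a production address.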

# ans_bootstrap.sh
local_ip=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
role=$(grep role /etc/ansible/facts.d/clem.fact | cut -d '=' -f2)   # our custom ansible facts
env=$(grep env /etc/ansible/facts.d/clem.fact | cut -d '=' -f2)     # our custom ansible facts
if [ "$env" == "staging" ]; then
  bootstrapper_url="xxx.xxx.xxx.xxx:yyyy/ansible/role/"
else
  bootstrapper_url="xxx.xxx.xxx.xxx:yyyy/ansible/role/"     # our bootstrapper api for role-based playbook execution; multiple roles can be passed to the API
fi
curl -X POST -d "host=$local_ip&role=$role&env=$env" "$bootstrapper_url"

These two scripts are executed via the cloud-init script during machine boot-up. Once we have the image ready, we need to create a launch configuration for the ASG. Below is a sample userdata script:

#!/bin/bash

echo "Starting EIP management via Bootstrapper"
/usr/local/src/bootstrap_scripts/set_eip.sh
echo "Starting server bootstrap"
/usr/local/src/bootstrap_scripts/ans_bootstrap.sh

Now create an Auto Scaling group with the required number of nodes. In the scaling policies, select “Keep this group at its initial size”. Once the ASG is up, it will start the nodes based on the AMI and subnet mentioned. As each machine boots, the cloud-init script executes our userdata scripts, which in turn talk to our Bootstrapper, which assigns the EIP and executes the playbooks on the host. Below is a sample log from our Bootstrapper for EIP management, invoked by an ASG node while it was booting up.

01:33:11 default: bootstrap.ansble_set_eip(u'staging', u'<ansible_role>', u'i-xxxxx', '<remote_user>', '<remote_key>') (4df8aee9-ab0e-4152-973a-b227ddac91a1)
EIP xxx.xxx.xxx.xxx is attached to instance i-xyxyxyxy
EIP xxx.xxx.xxx.xxx is attached to instance i-xyxyxyxy
EIP xxx.xxx.xxx.xxx is attached to instance i-xyxyxyxy
EIP xxx.xxx.xxx.xxx is attached to instance i-xyxyxyxy
EIP xxx.xxx.xxx.xxx is attached to instance i-xyxyxyxy
Free EIP available for <our-cluster-name> is xxx.xxx.xxx.xxx   # Bootstrapper found the free ip that is allowed to be assigned for this particular node

PLAY [localhost] **************************************************************

TASK: [adding EIP to the instance] ********************************************
changed: [127.0.0.1]
01:33:12 Job OK, result = {'127.0.0.1': {'unreachable': 0, 'skipped': 0, 'ok': 2, 'changed': 1, 'failures': 0}}

I’ve tested this prototype with one of our VOIP clusters, and the cluster is working perfectly with the corresponding EIPs as mapped. We terminated the machines multiple times to make sure that the EIP management is working properly and the servers are getting bootstrapped. The results are promising, and this motivates us to migrate all of our stateless clusters to self-healing, so that our clusters auto-heal whenever a machine becomes unhealthy. No need for any human intervention, unless Amazon really screws up their ASG :p


Building an Automated Config Management Server using Ansible+Flask+Redis

It’s almost 2 months since I started playing full time with Ansible. Like most sysadmins, I’ve been using Ansible via the CLI most of the time. Unlike Salt/Puppet, Ansible is agentless, so we need to invoke things from the box which has Ansible and the respective playbooks installed. Also, if you want to use Ansible with EC2 features like auto-scaling, you need to either buy Ansible Tower or use ansible-pull along with a userdata script. I’ve also seen people who use custom scripts that fetch their repo and execute the playbook locally to bootstrap.

Being a big fan of Flask, I’ve used Flask to create many backend APIs to automate a bunch of my tasks. So this time I decided to write a simple Flask API for executing Ansible playbooks and ad-hoc commands. Ansible also provides a Python API, which made my work easier. Like most Ansible users, I use roles for all my playbooks. We can directly expose an API over Ansible and execute playbooks, but there are cases where the playbook execution takes more than 5 minutes, and of course any network latency will affect our package downloads etc. I don’t want to force my HTTP clients to wait for the final output of the playbook execution to get a response back.

So I decided to go ahead with a job-queue feature. Each time a request comes to my API, the job is queued in Redis and the job ID is returned to the client. My job workers then pick the jobs from the Redis queue, perform the execution on the backend, and keep updating the job status. So I need to expose two APIs: one for receiving jobs and one for job status. For the Redis queue there is an awesome library called rq, which I’ve been using for all my queuing tasks.

Flask API

The job accepts a bunch of parameters like host, role, and env via an HTTP POST. Since the role/host etc. have to be retrieved from the HTTP request, my playbook yml file has to be dynamic. So I’ve decided to use Jinja templating to dynamically create the playbook yml file. Below is my sample API for role-based playbook execution.

@app.route('/ansible/role/', methods=['POST'])
def role():
  inst_ip = request.form['host']                    # Host on which the playbook has to be executed
  inst_role = request.form['role']                  # Role(s) to be applied in the playbook
  env = request.form['env']                         # Extra env variables to be passed while executing the playbook
  ans_remote_user = "ubuntu"                        # Default remote user
  ans_private_key = "/home/ubuntu/.ssh/id_rsa"      # Default ssh private key
  job = q.enqueue_call(                             # Queueing the job on to Redis
            func=ansble_run, args=(inst_ip, inst_role, env, ans_remote_user, ans_private_key,), result_ttl=5000, timeout=2000
        )
  return job.get_id()                               # Returns the job id if the job is successfully queued to Redis

Below is a sample function that generates the playbook yml file from a Jinja2 template.

def gen_pbook_yml(ip, role):
  templateLoader = jinja2.FileSystemLoader( searchpath="/" )
  templateEnv = jinja2.Environment( loader=templateLoader )
  TEMPLATE_FILE = "/opt/ansible/playbook.jinja"             # Jinja template file location
  template = templateEnv.get_template( TEMPLATE_FILE )
  role = role.split(',')                                    # Make role a list if multiple roles are mentioned in the POST request
  r_text = ''.join([random.choice(string.ascii_letters + string.digits) for n in xrange(32)])
  temp_file = "/tmp/" + "ans-" + r_text + ".yml"            # Creating a unique playbook yml file
  templateVars = { "hst": ip,
                   "roles": role
                 }
  outputText = template.render( templateVars )              # Rendering the template
  text_file = open(temp_file, "w")
  text_file.write(outputText)                               # Saving the rendered output to the temp file
  text_file.close()
  return temp_file
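The playbook.jinja template itself isn’t shown in the post; given the `hst` and `roles` variables passed above, a minimal version might look like this (my guess at the template, not the author’s actual file):

```
---
- hosts: {{ hst }}
  roles:
{% for r in roles %}
    - {{ r }}
{% endfor %}
```

Rendering it with `hst="10.0.0.5"` and `roles=["common", "webserver"]` would yield a plain Ansible playbook targeting that host with those roles.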

Once the playbook file is ready, we need to invoke Ansible’s API to perform the bootstrapping. This is actually done by the job workers. Below is a sample function which invokes the playbook API from Ansible core.

def ansble_run(ans_inst_ip, ans_inst_role, ans_env, ans_user, ans_key_file):
  yml_pbook = gen_pbook_yml(ans_inst_ip, ans_inst_role)   # Generating the playbook yml file
  run_pbook = ansible.playbook.PlayBook(                  # Invoking Ansible's playbook API
                 playbook=yml_pbook,
                 callbacks=playbook_cb,
                 runner_callbacks=runner_cb,
                 stats=stats,
                 remote_user=ans_user,
                 private_key_file=ans_key_file,
                 host_list="/etc/ansible/hosts",          # use either host_list or inventory
#                inventory='path/to/inventory/file',
                 extra_vars={
                    'env': ans_env
                 }
                 ).run()
  return run_pbook                                        # We can tune the output that has to be returned

Now the job workers execute the jobs and update the status in Redis. Next we need to expose our job-status API. Below is a sample Flask API for the same.

@app.route("/ansible/results/<job_key>", methods=['GET'])
def get_results(job_key):

    job = Job.fetch(job_key, connection=conn)
    if job.is_finished:
        ret = job.return_value
    elif job.is_queued:
        ret = {'status': 'in-queue'}
    elif job.is_started:
        ret = {'status': 'waiting'}
    elif job.is_failed:
        ret = {'status': 'failed'}
    else:
        ret = {'status': 'unknown'}   # fallback so ret is always defined

    return json.dumps(ret), 200

Now we have a fully fledged API server for executing role-based playbooks. This API can also be used with userdata scripts in autoscaling, where we just need to perform an HTTP POST request to the API server, and the API server will start the bootstrapping. I’ve tested this app locally with various scenarios, and the results are promising. As a next step, I’m planning to extend the API to do more jobs like automating code pushes, running ad-hoc commands via the API, etc. With applications like Ansible, Redis, and Flask, I’m sure sysadmins can attain DevOps nirvana :). I’ll be pushing the latest working code to my GitHub account soon.
