Docker, Marathon, Mesos

Running Mesos With Native Docker Support

In my previous blog, i’ve explained how to manage Docker cluster using Mesos and Marathon. Now Mesos has released 0.20 with Native Docker support. Till Mesos 0.19, we had to use mesosphere’s Deimos for running Docker containers on Mesos slaves. Now from Mesos 0.20, Mesos has the ability to run Docker containers directly. As per Mesos Documentation we can run containers in two ways. 1) as a Task and 2) as an Executor. Currently Mesos 0.20 supports only the host (–net=host) Docker networking mode.

Install the latest Mesos 0.20 using the mesosphere’s Mesos Debian package. Once we have setup the Zookeeper and Docker, we need to make a few changes to enable the Mesos Native Docker support. We need to start all the Mesos-Slave with --containerizers=docker,mesos flag. For enabling this flag, we need to create a file containerizers in /etc/mesos-slave/containerizers with content “docker,mesos”. Mesos slave process, when starting up, will read the folder and enable this flag.

$ echo 'docker,mesos' > /etc/mesos-slave/containerizers

$ service mesos-slave restart

I was trying to start a container with Port 9999 via Marathon REST API. But the task was keep on failing. Up on investigating the logs, i found that the slave when sends the allocatable resource details to mesos master was total allocatable: cpus():0.8; mem():801; disk():35164; ports():[31000-31099, 31101-32000]. So the allowed port range was [31000-31099, 31101-32000]. So if we wnat to use any other custom port, we need to define the same in the /etc/mesos-slave folder by creating a config file.

    $ echo "ports(*):[31000-31099, 31101-32000, 9998-9999]" > "/etc/mesos-slave/resources"

Now, my new allocatable resource details sent to the Mesos master became total allocatable: ports():[31000-31099, 31101-32000, 9998-9998]; cpus():0.7; mem():801; disk():35164. The above config is crucial if we want to allocate our own custom range of ports. So this gave a good idea on how to control my resource allocation by creating custom configuration files. Now let’s restart Mesos-slave process.

    $ service mesos-slave restart

    $ ps axf | grep mesos-slave | grep -v grep
     1889 ?        Ssl    0:26 /usr/local/sbin/mesos-slave --master=zk://localhost:2181,localhost:2182,localhost:2183/mesos --log_dir=/var/log/mesos --containerizers=docker,mesos --resources=ports(*):[31000-31099, 31101-32000, 9998-9999]
     1900 ?        S      0:00  \_ logger -p user.info -t mesos-slave[1889]
     1901 ?        S      0:00  \_ logger -p user.err -t mesos-slave[1889]

Now we have the Mesos slave running with Native Docker support. But the current stable version of Marathon still don’t support the native Docker feature of Mesos. So we need to setup Marathon 0.7 from scratch. I’ve written a blog on upgrading Marathon from 0.6.x to 0.7. Once we have setup the Marathon 0.7, we can start using Mesos with the Native Docker support. Let’s create a new App in Marathon via its REST API.

create a JSON file in the new container format. say ubuntu.json

{
    "id": "jackex",
    "container": {
        "docker": {
            "image": "ubuntu:14.04"
        },
        "type": "DOCKER",
        "volumes": []
    },
    "cmd": "while sleep 10; do date -u +%T; done",
    "cpus": 0.2,
    "mem": 200,
    "ports": [9999],
    "requirePorts": true,
    "instances": 1
}

Now, we can use the Marathon API to launch the Docker task,

$ curl -X POST -H "Content-Type: application/json" localhost:8080/v2/apps -d@ubuntu.json

{"id":"/jackex","cmd":"while sleep 10; do date -u +%T; done","args":null,"user":null,"env":{},"instances":1,"cpus":0.2,"mem":200.0,"disk":0.0,"executor":"","constraints":[],"uris":[],"storeUrls":[],"ports":[9999],"requirePorts":true,"backoffSeconds":1,"backoffFactor":1.15,"container":{"type":"DOCKER","volumes":[],"docker":{"image":"ubuntu:14.04"}},"healthChecks":[],"dependencies":[],"upgradeStrategy":{"minimumHealthCapacity":1.0},"version":"2014-08-31T16:03:50.593Z"}

$ docker ps
CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS              PORTS               NAMES
25ba3950a9e1        ubuntu:14.04        /bin/sh -c 'while sl   2 minutes ago       Up 2 minutes                            mesos-a19af637-ffb9-4d3c-8b61-778d47087ace

Mesos UI

Marathon UI

With the new Native Docker support, Mesos is becoming more friendly with Docker. And this is a great achievement for people like me who are trying to use Mesos and Docker in our Infrastructure. A big kudos to Mesos and Mesosphere team for their great work 🙂

Standard
Docker, Marathon, Mesos

Upgrading Marthon for Mesos Native Docker Support

A few days ago Mesos 0.20 version was released with native Docker support. Till that, Mesos Docker integration was performed using Mesosphere’s Deimos application. But with the new Mesos, there is no need for any external application for launching Docker containers in Mesos Slaves. As per Mesos Documentation we can run containers in two ways. 1) as a Task and 2) as an Executor. Currently Mesos 0.20 supports only the host (--net=host) Docker networking mode.

The current stable release of Marathon is 0.6X. The debian package provided by the Mesosphere also provides the 0.6x version. The container format has been completely changed in 0.7.x version. So we cannot use the 0.6.x version along with Mesos 0.20. More details are available in the Marathon Documentation

The current Mesos Master branch is version := "0.7.0-SNAPSHOT". So we need to compile Marathon from scratch. The only dependency for compiling Marathon is scala-sbt. Get the latest Debian package for sbt from here. The current sbt version is 0.13.5.

$ wget http://dl.bintray.com/sbt/debian/sbt-0.13.5.deb

$ dpkg -i sbt-0.13.5.deb

Now let’s clone the Marathon githib repo.

$ git clone https://github.com/mesosphere/marathon.git && cd marathon

$ sbt assembly

$ ./bin/build-distribution  # for building jar file

The above build command will create an executable jar file “marathon-runnable.jar” under the “target” folder. We can use this jar file along with Marathon 0.6.x start script. If we check the upstart script of Marathon 0.6.x, it uses /usr/local/bin/marathon binary for starting the service. So we need to edit two lines in this file to use our latest version 0.7’s jar file.

First let’s move our jar file to say “/opt/ folder.

$ cp marathon-runable.jar /opt/marathon.jar

Now let’s edit the Marathon binary. Make the below changes in the binary file,

marathon_jar="/opt/marathon.jar" # Line number 21

and

exec java "${vm_opts[@]}" -jar "$marathon_jar" "$@"  # -jar option added to use the jar file instead if the default -cp option. Line number 62

Now let’s restart the Marathon service.

$ service marathon restart

Now let’s use the ps command and verify the service status.

$ ps axf | grep marathon | grep -v grep

22653 ?        Ssl    6:09 java -Xmx512m -Djava.library.path=/usr/local/lib -Djava.util.logging.SimpleFormatter.format=%2$s %5$s%6$s%n -jar /opt/marathon.jar --zk zk://localhost:2181,localhost:2182,localhost:2183/marathon --master zk://localhost:2181,localhost:2182,localhost:2183/mesos
22663 ?        S      0:00  \_ logger -p user.info -t marathon[22653]
22664 ?        S      0:00  \_ logger -p user.notice -t marathon[22653]

Now Marathon is running with the version 0.7 jar file. Let’s verify by creating a new Docker task. Make sure that the Mesos version 0.20 is running on the host and not the older versions of Mesos. More detail about the new container format is available in the Marathon Documentation page.

create a JSON file in the new container format. say ubuntu.json

{
    "id": "mesos-docker-test",
    "container": {
        "docker": {
            "image": "ubuntu:14.04"
        },
        "type": "DOCKER",
        "volumes": []
    },
    "cmd": "while sleep 10; do date -u +%T; done",
    "cpus": 0.2,
    "mem": 200,
    "ports": [9999],
    "requirePorts": false,
    "instances": 1
}

Now, we can use the Marathon API to launch the Docker task,

$ curl -X POST -H "Content-Type: application/json" localhost:8080/v2/apps -d@ubuntu.json

{"id":"/mesos-docker-test","cmd":"while sleep 10; do date -u +%T; done","args":null,"user":null,"env":{},"instances":1,"cpus":0.2,"mem":200.0,"disk":0.0,"executor":"","constraints":[],"uris":[],"storeUrls":[],"ports":[9999],"requirePorts":false,"backoffSeconds":1,"backoffFactor":1.15,"container":{"type":"DOCKER","volumes":[],"docker":{"image":"ubuntu:14.04"}},"healthChecks":[],"dependencies":[],"upgradeStrategy":{"minimumHealthCapacity":1.0},"version":"2014-08-31T16:03:50.593Z"}

$ docker ps
CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS              PORTS               NAMES
25ba3950a9e1        ubuntu:14.04        /bin/sh -c 'while sl   2 minutes ago       Up 2 minutes                            mesos-a19af637-ffb9-4d3c-8b61-778d47087ace

Here i’ve used Port number 9999. But by default, in Mesos 0.20, default port range resourced by Mesos Slave is [31000-32000]. Please refer my next Blog post on how to modify and set a custom port range.

With the new Native Docker support, Mesos is becoming more friendly with Docker. And this is a great achievement for people like me who are trying to use Mesos and Docker in our Infrastructure. A big kudos to Mesos and Mesosphere team for their great work 🙂

Standard
Marathon, Mesos

Marathon Event Bus

It’s almost 3 months since i’ve started with Mesos + Marathon. Yesterday, i was going through the Marathon Doc Site, and found that Marathon has a cool internal Event Bus. Currently Marathon has an ibuilt HTTP callback subscriber that POSTs events in JSON format to one or more endpoints. So i decided to give it a try to my Mesos test environment. More documentations of the Event Bus are available in the Documentation page.

Currently, in my test environment all the Mesos and Marathon are installed from the Mesosphere Debian packages. Whether we use the packaged init or upstart script, both of them are directly calling the /usr/local/bin/marathon binary. For the Event Bus with http callback, we need to enable two flags (–event_subscriber http_callback –http_endpoints http://host1/foo,http://host2/bar) while starting the Marathon service ie, –event_subscriber, the type of subscriber that we are going to use and –http_endpoints, endpoints corresponding to the subscriber.

From the marathon binary, i found the it looks for files under the /etc/marathon/conf/ folder, where each file is the name of the flag to be enabled and the content of the file is the flag values. So in our case, we need two files inside the “/etc/marathon/conf/”, 1) event_subscriber and 2) http_endpoints

root@vagrant-ubuntu-trusty-64:/etc/marathon/conf# ls /etc/marathon/conf/
event_subscriber  http_endpoints

And the content of these files are,

root@vagrant-ubuntu-trusty-64:/etc/marathon/conf# cat /etc/marathon/conf/event_subscriber
http_callback

root@vagrant-ubuntu-trusty-64:/etc/marathon/conf# cat /etc/marathon/conf/http_endpoints
http://localhost:1234/

Now for test purpose, lets start a minimal webserver using netcat listening to port 1234

$ nc -l -p 1234

Now netcat is listening on port 1234. Lets restart Marathon with our new flags.

$ service marathon restart

Now lets check if the flags are enabled properly,

root@vagrant-ubuntu-trusty-64:/etc/marathon/conf# ps axf | grep marathon | grep -v grep
2780 ?        Sl     0:36 java -Xmx512m -Djava.library.path=/usr/local/lib -Djava.util.logging.SimpleFormatter.format=%2$s %5$s%6$s%n -cp /usr/local/bin/marathon mesosphere.marathon.Main --zk zk://localhost:2181,localhost:2182,localhost:2183/marathon --master zk://localhost:2181,localhost:2182,localhost:2183/mesos --http_endpoints http://localhost:1234/ --event_subscriber http_callback
2791 ?        S      0:00  \_ logger -p user.info -t marathon[2780]
2792 ?        S      0:00  \_ logger -p user.notice -t marathon[2780]

From the above ps command, it can be seen that the flags are enabled properly. Now i’m going to start a docker container. So as per the Event Bus documentation, Callbacks are Fired every time Marathon receives an API request that modifies an app (create, update, delete)

$ curl -X POST -H "Content-Type: application/json" localhost:8080/v2/apps -d@ubuntu.json

where ubuntu.json is,

{
    "container": {
    "image": "docker:///libmesos/ubuntu",
    "options" : []
  },
  "id": "ubuntu2",
  "instances": "1",
  "cpus": ".3",
  "mem": "200",
  "uris": [ ],
  "ports": [9999],
  "cmd": "while sleep 10; do date -u +%T; done"
}

Once the Marathon API is fired, we will get a POST request on our netcat with the Event details. Below is the Event received on my netcat server.

root@vagrant-ubuntu-trusty-64:/var/tmp# nc -l -p 1234
POST / HTTP/1.1
Host: localhost:1234
Accept: application/json
User-Agent: spray-can/1.2.1
Content-Type: application/json; charset=UTF-8
Content-Length: 427

{"clientIp":"127.0.0.1","uri":"/v2/apps","appDefinition":{"id":"ubuntu2","cmd":"while sleep 10; do date -u +%T; done","env":{},"instances":1,"cpus":0.3,"mem":200.0,"disk":0.0,"executor":"","constraints":[],"uris":[],"ports":[9999],"taskRateLimit":1.0,"container":{"image":"docker:///libmesos/ubuntu","options":[]},"healthChecks":[],"version":{"dateTime":{}}},"eventType":"api_post_event","timestamp":"2014-08-25T13:23:27.059Z"}

This Event bus is really usefull as it notifies us on all the Event changes happening on our Mesos cluster. We can also buit an Event notification system based on these callbacks. Though the current Events has very minimal details, i’m sure that more Event types will get added soon into Marathon.

Standard
Docker, Marathon, Mesos, Zookeeper

Managing HA Docker Cluster Using Multiple Mesos Masters

In my previous blog, i’ve described how to manage Docker cluster using Mesos and Marathon. It was using a single Mesos Master. But for production, we cannot go with a single Mesos master, as it will result in a sinlge point of failure. Mesos supports Multi master via Zookeeper. So in this blog i’m going to explain how to setup a Multi Master Mesos Cluster using Zookeper for Automatic promotion of Mesos master when the current Active Mesos Master fails. This time i’m going to setup 3 Zookeeper services and 2 Mesos Master on a single Vagrant box.

Setting up ZooKeeper Cluster

First lets download the the latest stable Zookeeper source code.

$ wget http://mirrors.ukfast.co.uk/sites/ftp.apache.org/zookeeper/stable/zookeeper-3.4.6.tar.gz

Now extract the tar file and create 3 copies of the same, say zookeeper1, zookeeper2 and zookeeper3. Also Zookeeper needs Java on the machine, so let’s install java dependencies.

$ apt-get install openjdk-7-jdk openjdk-7-jre

Now inside, the extracted zookeeper source folder, we need to create a config file. So in our case, inside each Zookeeper folder, we need a zoo.cfg file.

For zookeeper1 folder,

# content of zoo.cfg
tickTime=2000
dataDir=/var/lib/zookeeper1/
clientPort=2181
initLimit=5
syncLimit=2
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

For zookeeper2 folder,

# content of zoo.cfg
tickTime=2000
dataDir=/var/lib/zookeeper2/
clientPort=2182
initLimit=5
syncLimit=2
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

For zookeeper3 folder,

    # content of zoo.cfg
tickTime=2000
dataDir=/var/lib/zookeeper3/
clientPort=2183
initLimit=5
syncLimit=2
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

Here, since the 3 zookeeper services are running on the same host, i’ve assigned separate ports. If the zookeeper are running on separate instances, then we can have the same ports for all of the zookeeper nodes. There are two port numbers. The first followers use to connect to the leader, and the second is for leader election.

Now we can start the Zookeeper in Foreground using the ./bin/zkServer.sh start-foreground from each of the 3 zookeeper folders. netstat -nltp | grep 288 will display the port of the zookeeper service which is the current master among the cluster. We can also check the connectivity using the zookeepr client binary available in the zookeeper source folder. Once zookeeper cluster is UP, we can go ahead setting up Mesos.

Setting up Multiple Mesos Masters

Mesossphere team has already built packages for Mesos,Marathon and Deimos. So one of my Mesos master will be from the package. But i also wanted to play with the Mesos Source, so i decided to build Mesos from the source. So my second Mesos Master will be from the scratch.

First Master from the Mesosphere package.

$ apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF

$ DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')

$ CODENAME=$(lsb_release -cs)

# Add the repository
$ echo "deb http://repos.mesosphere.io/${DISTRO} ${CODENAME} main" | sudo tee /etc/apt/sources.list.d/mesosphere.list

$ apt-get -y update

$ apt-get -y install mesos marathon deimos

Now we need to define the Zookeeper cluster details so that Mesos master and slave can connect. Edit /etc/mesos/zk and add zk://localhost:2181,localhost:2182,localhost:2183/mesos. More details about Zookeeper URL is available here. Also More details about installing Mesos from Mesosphere’s Debian package is available in their website.

So now we have one master ready, we can start the Mesos Master, Mesos Slave and Marathon. ANd esnure that they can connect to our Zookeeper cluster properly. We can also see the connection status in the stdout of the zookeeper process as they are running in foreground.

Now building the second Mesos master from scratch. First let’s download the Mesos Source Code.

$ wget http://archive.apache.org/dist/mesos/0.19.0/mesos-0.19.0.tar.gz

$ tar xvzf mesos-0.19.0.tar.gz && cd mesos-0.19.0

$ ./bootstrap && ./configure --prefix=/usr/local/mesos

$ make && make install

Once the Compilation is succeeded, we can start the new Mesos Master service. The default port 5050 is used by the existing master, so we need to run this new service on a different port.

$ /usr/local/mesos/sbin/mesos-master --zk=zk://localhost:2181,localhost:2182,localhost:2183/mesos --port=5054 --quorum=1 --registry=in_memory --work_dir=/var/lib/mesos/

Once the new Mesos Master is up, we can create a test container via Marathon Rest API. Create a simple json file called ubuntu.json

{
    "container": {
    "image": "docker:///libmesos/ubuntu",
    "options" : []
  },
  "id": "ubuntu",
  "instances": "1",
  "cpus": ".3",
  "mem": "200",
  "uris": [ ],
  "cmd": "while sleep 10; do date -u +%T; done"
}


$ curl -X POST -H "Content-Type: application/json" localhost:8080/v2/apps -d@ubuntu.json

This will launch a single container. We can check the status of the container via the Marathon UI as well as via RestAPI also. Now comes the critical part for Production, What happens when the master fails. Mesos Documentation says, if the current master fails, in a Multi master Mesos Cluster, Zookeeper will elect a new Master, in our case the second Mesos master service. And Mesos Claims that the running services will not get crashed and the status will be taken by the newly promoted master. In our case, we have a Docker container running as a long running service.

First i’ll stop one of the Mesos master service,

$ service mesos-master stop

Now immidiately on the stdout of the Zookeeper, we can see the logs corresponding to the Election process. Below is the same,

I0814 19:41:07.889044  9093 contender.cpp:243] New candidate (id='7') has entered the contest for leadership
2014-08-14 19:41:07,896:9088(0x7ff8e2ffd700):ZOO_INFO@check_events@1750: session establishment complete on server [127.0.0.1:2183], sessionId=0x347d606d0d90003, negotiated timeout=10000
I0814 19:41:07.899509  9092 group.cpp:310] Group process ((10)@10.0.2.15:5054) connected to ZooKeeper
I0814 19:41:07.900049  9092 group.cpp:784] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I0814 19:41:07.900069  9092 group.cpp:382] Trying to create path '/mesos' in ZooKeeper
I0814 19:41:07.915621  9092 detector.cpp:135] Detected a new leader: (id='6')
I0814 19:41:07.915699  9092 group.cpp:655] Trying to get '/mesos/info_0000000006' in ZooKeeper
I0814 19:41:07.920266  9094 network.hpp:423] ZooKeeper group memberships changed
I0814 19:41:07.920910  9094 group.cpp:655] Trying to get '/mesos/log_replicas/0000000003' in ZooKeeper
I0814 19:41:07.924144  9095 network.hpp:461] ZooKeeper group PIDs: { log-replica(1)@10.0.2.15:5054 }
I0814 19:41:07.926575  9092 detector.cpp:377] A new leading master (UPID=master@10.0.2.15:5054) is detected
I0814 19:41:07.926679  9092 master.cpp:957] The newly elected leader is master@10.0.2.15:5054 with id 20140814-194033-251789322-5050-8808

As per the above logs, the second Mesos Master running at port 5054 was promoted as New Mesos Master. This will be reflected to Marathon also. We can go to the Marthon UI and make sure that the Docker process that we have started using the old Mesos master is still running fine. We can even try making some scaling changes and can make sure that the new master can alter the process.

In my testing, i’ve tried stoping each of the service and ensured that only one Mesos master is running and also tried scaling the Apps from one Master service to make sure that the new Master was able to keep track of all these changes. The results were quite promising. Mesos + Marrthon + Docker indeed is killer combo. We can really built a cross vendor independent cluster with scaling capabilites.

Standard