Ansible, Ubuntu

Stepping into Ansible

For the past 2 year’s, i played with config management tools like Puppet and Salt. But all these tools were mostly Client-Server Model, except Salt where it supports Push model also. But for the last 6 months, Ansible is gaining more popularity. Ansible is a Push model system which relies on SSH. So before i adopt Ansible completely, i decided to have a try. I need to make sure that the Ansible supports all basic features what other competitors supports. Which is really helpful in migration also.

Installation

Ansible is pretty easy to install. We can install it from source or via package managers or even via PIP.We can use the official ubuntu ppa for installing Ansible.

apt-get install software-properties-common
apt-add-repository ppa:ansible/ansible
apt-get update
apt-get install ansible

Since Ansible relies on SSH, things like Host Key verification errors will prevent the SSH connections resulting in failures. We can disable the Host Key Verfication check in the ansible.cfg file

host_key_checking = False      # add this option to the config file

or we can set an env variable export ANSIBLE_HOST_KEY_CHECKING=False for the current session. By default ansible uses the hosts file present in the ansible home directory. So we can define the static machines there. We can add either the IP or DNS resolvable FQDN. Once the IP/FQDN is added, we can test the connectivity via ping module. Make sure that the Ansible server’s SSH key is added to the authorized_keys on the remote machines.

ansible all -m ping

# Sample output

ansible-ubuntu | success >> {
    "changed": false,
    "ping": "pong"
  }

Managing Custom Facts

Config management tools like puppet/Salt supports custom facts to be defined on the remote machines. We can define the custom facts and the config management server can use these facts. Even though Ansible is an agentless server, we can define the custom facts on the remote systems. Whenever we query for facts, ansible connects to the remote machines and fetches the facts using its default library. But it also looks for custom facts in /etc/ansible/facts.d/. We need to put our custom facts file in this directory. The file has to be of .fact extension,must be executable and should return a valid JSON. This is in the case of a script. If we just want to define some facts directly, we can simple create a file like below

[myfact]
role=test
profile=staging

The above fact file will add two fact variables called role and profile with the value as mentioned in the file. Now let’s use the system module and see if we are able to retrieve the new custom facts.

ansible <remote_host_name> -m setup

Below is the part of the output showing the custom facts

"ansible_local": {
       "myfacts": {
        "myfact": {
           "profile": "staging",
           "role": "test"
           }
       }
    },

Managing Dynamic Inventory

In the Cloud environment, it’s difficult to maintain a static inventory. Ansible does supports Dynamic inventory for vendors including AWS EC2. Ansible provides us an Inventory script. We can also use this script directly and query EC2 to get the list of all instances. To successfully make an API call to AWS, we will need to configure Boto. The simplest is just to export two environment variables:

export AWS_ACCESS_KEY_ID='AK123'
export AWS_SECRET_ACCESS_KEY='abc123'

./ec2.py --list   # Displays the list of all instances

Now we have the inventory script ready. Let’s make some fact’s query to the ec2 instances. We can use many filters here like, Tags etc…

ansible tag_Name_test -i /etc/ansible/plugins/inventory/ec2.py -m ping # Querying all instances with tag "Name=test"

We can also use regex with these say like tag_Name_test*. For rackspace user’s there is an official module called rax that works perfectly with ansible

Enrcypting YAML Data files

This is an important feature that most of the config management system lacks. In most of the current systems, we need to define the sensitive data like say ssh-keys, API’s AuthID/Token etc… in plain text which increases the security risk. Ansible Vault comes for rescue here. Vault feature can encrypt any structured data file used by Ansible. This can include “group_vars/” or “host_vars/” inventory variables, variables loaded by “include_vars” or “vars_files”, or variable files passed on the ansible-playbook command line with “-e @file.yml” or “-e @file.json”. Role variables and defaults are also included!. While invoking any playbook, we can pass the --ask-vault-pass along the vault password, so Ansible can can decrypt the file and use its contents while performing any execution.

ansible-vault encrypt foo.yml    # Encrypting a file

ansible-vault edit foo.yml       # Editing encrpypted file

ansible-vault decrypt foo.yml    # Decrypting a file

Ansible indeed is truly an awesome product. It does have many new features like vault compared to its competitors. It’s backed by an awesome community. So we can expect more exciting features in future.

Standard
Ubuntu, FreeSwitch, Docker

Load Test on Docker Freeswitch – Part 1

Docker is a very powerfull tool for managing Linux containers. In my previous blog i’ve explaind on how to setup a Docker Freeswitch. Docker is very mature now, version 1.0 has already been released. Docker is now supported by all major cloud vendors. Docker was showing promising results when i was performing my initial testing. So this time i decided to perform a heavy load test on the Freeswitch container to ensure that Docker can really enter Telephony. Like any normal sys admin, i was googling for Freeswitch load test, and most of the results were pointing to Sipp, an Open Source test tool / traffic generator for the SIP protocol. For me Sipp didnt helped me as it started throwing errors beyond 320 simultaneous calls. The UDP connections were timing out. I tried increasing the timeout, which didn’t helped much.

So next choice is to use a Freeswitch itself, to generate calls. Using the FreeSwitch’s originate command to generate simultaneous calls and hit the Docker Freeswitch container. I also decided to collect all system metrics, so that i knows how the machine behaves under various load tests conditions. For this i deciced to use CollectD and Graphite combo. Collectd 5+ has an inbuild graphite plugin which can send the collectd metrics to a graphite server.

I’ve already setup an Ubuntu-Freeswitch Docker image. First we need to pull the images from the Docker hub.

$ docker pull deepakmdass88/fs-ubuntu

Now i’m going to start the Docker FreeSwitch container in foreground.

$ docker run --rm --privilieged -i -t -p 5060:5060/tcp -p 5060:5060/udp -p 16384:16384/udp -p 16385:16385/udp -p 16386:16386/udp -p 16387:16387/udp -p 16388:16388/udp -p 16389:16389/udp -p 16390:16390/udp -p 16391:16391/udp -p 16392:16392/udp -p 16393:16393/udp -p 5080:5080/tcp -p 5080:5080/udp deepakmdass88/fs-ubuntu /bin/bash

The privilieged option was enabled because, the FreeSwitch init script sets some custom ulimit values, so the container has to be given special privileges. Corresponding SIP and RTP ports are forwarded from the host to the container.

Now before starting the Freeswitch service, we can set up the CollectD agent. By default, the Ubuntu repostiry contains CollectD versio 4.10, but the Graphite plugin is available from version 5.0+ onwards. So we can use somne PPA which has the corresponding version available.

$ apt-get install python-software-properties

$ add-apt-repository ppa:joey-imbasciano/collectd5

$ apt-get update && apt-get install collectd

Now in the /etc/collectd.conf, uncomment LoadPlugin write_graphite. Also, in the same file and uncomment the plugin definition and fill in the server details.

<Plugin write_graphite>
  <Carbon>
        Host "dockergraphite.example.com"
        Port "2003"
        Protocol "tcp"
        LogSendErrors true
        Prefix "collectd."
        StoreRates true
        AlwaysAppendDS false
        EscapeCharacter "_"
  </Carbon>
</Plugin>

I’ve enabled a custom freeswitch plugin, which will extract the current ongoing calls count from freeswitch and sends it to the graphite server. Once the config changes are done we can restart the CollectD service. Now we can check our graphite UI to see if the default metrics like memory, load, cpu etc. are reaching the graphite server. Once CollectD-Graphite setup is ready, we can go ahead with our load test. So, once the call has reached the server, we need some Dialplan to continue the calls. So the simplest method is to create an infinite loop of playing some file, or some conference. Below are some dialplans that i’ve created in the public.xml

# Infinite Play Loop

 <extension name="111222333">
       <condition field="destination_number" expression="^111222333$">
         <action application="answer"/>
         <action application="playback" data="sounds/music/8000/got.wav"/>
         <action application="transfer" data="111222333 XML public"/>
       </condition>
    </extension>

# Test conference

  <extension name="docker-fs-test-conf">
    <condition field="destination_number" expression="^112233">
      <action application="answer"/>
      <action application="sleep" data="500"/>
      <action application="conference" data="docker-test@public"/>
    </condition>
  </extension>


# Default IVR menu

    <extension name="ivr_demo">
      <condition field="destination_number" expression="^5000$">
        <action application="answer"/>
        <action application="sleep" data="2000"/>
        <action application="ivr" data="demo_ivr"/>
      </condition>
    </extension>

Now, we have the dialplans ready, next is authentication. By default there are two ways, Digest auth and IP Whitelist. Here i’m going to use IP whitelist, so we need to whitelist our IP in the acl.conf file.

 <list name="domains" default="deny">
      <!-- domain= is special it scans the domain from the directory to build the ACL -->
      <node type="allow" domain="$${domain}"/>
      <node type="allow" cidr="xxx.xxx.xxx.xxx/32"/>                 # IP of FS from which we are going to send the calls
      <!-- use cidr= if you wish to allow ip ranges to this domains acl. -->
      <!-- <node type="allow" cidr="192.168.0.0/24"/> -->
 </list>

Now we can start the Freeswitch service.

$ /etc/init.d/freeswitch start

We can check the freeswich service using the fs_cli command.

$ /usr/local/freeswitch/bin/fs_cli -x "show status"

UP 0 years, 0 days, 6 hours, 34 minutes, 59 seconds, 648 milliseconds, 56 microseconds
FreeSWITCH (Version 1.5.13b git 39200cd 2014-07-02 21:55:21Z 64bit) is ready
1068 session(s) since startup
0 session(s) - peak 299, last 5min 0
0 session(s) per Sec out of max 30, peak 29, last 5min 0
1000 session(s) max
min idle cpu 0.00/100.00
Current Stack Size/Max 240K/8192K

Now freeswitch is ready to accept the connection. We can start sending the calls from our Load test freeswitch. Below is the script that was used to originate the calls from the load test Freeswitch machine. This will create simultaneous calls towards the Docker FS.

IP_URI="sip:111222333@<docker-fs-ip>:5060"
MAX_CALLS=$1

while [ 1 ]; do

set -i req
req=$(/usr/local/freeswitch/bin/fs_cli -q -b -x "show channels count" | awk '{print $1}')
if [ $req -lt $MAX_CALLS ]; then
    /usr/local/freeswitch/bin/fs_cli -q -b -x "bgapi originate sofia/external/$SIP_URI loadtest"
else
    echo "sleep a bit ..."
    sleep 10s

fi

While bulk calls are being made from the Load test freeswitch machines, to test the Quality in real time, it’s better to dial to the extension directly from a Sip Phone/Client and ensure that voice quality is good. Below is my Graphite dashboard for the load test.

Default Graphite UI

Tessera UI

The FS was stable till 500 simultaneous calls, after that there was a sudden drop in calls and also the voice quality started dropping and in a minute the Freeswitch crashed due to Segmentation fault. I’m going to analyze the core dump file to understand more about the crash. The other smaller drops that we see in the graph was caused by the Load test Freeswitch machine, as the load was getting high when the number of calls was increased. But 500 simultaneous calls are pretty decent and the there was no issue in voice quality till the number of calls crossed 500. Though it’s very difficult to make a final confirmation, i decided to go ahead with phase 2 load test.

In the phase 2 test, i’m planning to use multiple FS load test machines to generate large simultaneous calls + running 2 separate FS containers on the same host and split the incoming calls to both these containers. Once the phase 2 test is completed, ill share the test results in an another blog post. Docker is still under heavy development, and i’m sure Docker will be entering Telephony soon.

Standard
Docker, Marathon, Mesos, Ubuntu

Managing Docker Clusters Using Mesos and Marathon

Docker has became one of my favourite tool. It’s super cool and super easy tool to manage linux containers. LXC’s are around in IT world for some time, but by the entry of Docker last year, the wave started rising. Thanks to Docker team and Solomon Hykes for open sourcing such a wonderfull project. I’ve already mentioned a lot of stuffs about Docker in my previos blogs, so today im going explain how Docker can be used as a Cluster. There are some interesting tools like CoreOS, Helios etc for managing Docker as a cluster. But today i’m going to explain on how to set up a Docker cluster using Apache Mesos. CoreOS is a custom linux os which comes with SystemD. But the restriction is, we have to use that custom images of coreos.. Indeed CoreOS team open sourced some exciting tools like etcd fleet which works with CoreOS for managing Docker clusters. But Mesos is quite simple, we can install it via package, or even using tar balls available in thier Github repo onto most of the Linux Distro’s and it’s quite easy to configure also. Mesos is heavily used by Twitter to manage their data center’s. And now Mesosphere has opensourced a new tool called Mararthon which now provides a UI and a Rest API for maaging and scheduling Mesos Frameworks aka jobs, in this case containers as a service.

A few weeks ago, Mesos 0.19 was released which comes with an official support for Docker coantiners by intergrating Deimos into it. And a few days ago Marathon has released their new version 0.6.0 supports launching any task in a Docker container via Mesos 0.19+

Setting up Mesos Cluster

In this test setup, i’m going to setup both Mesos master/slave and Zookeeper on the same Ubuntu 14.04 vagrant node. First we can install the dependencies,

$ apt-get install curl python-setuptools python-pip python-dev python-protobuf

Now we can install Zookeeper

$ apt-get install zookeeperd

After the installation, ZooKeeper has 1 configuration. Each Zookeeper needs to know its position in the quorum.

$ echo 1 | sudo dd of=/var/lib/zookeeper/myid

Now we can setup Docker

$ echo "deb http://get.docker.io/ubuntu docker main" > /etc/apt/sources.list.d/docker.list

$ apt-get update && apt-get install lxc-docker

$ docker version

   Client version: 1.0.0
   Client API version: 1.12
   Go version (client): go1.2.1
   Git commit (client): 63fe64c
   Server version: 1.0.0
   Server API version: 1.12
   Go version (server): go1.2.1
   Git commit (server): 63fe64c

Let’s pull some basic ubuntu images from Docker Hub so that we can use the same for testing.

$ docker pull libmesos/ubuntu

Now we can configure Mesos

$ curl -fL http://downloads.mesosphere.io/master/ubuntu/14.04/mesos_0.19.0~ubuntu14.04%2B1_amd64.deb -o /tmp/mesos.deb

$ dpkg -i /tmp/mesos.deb

$ mkdir -p /etc/mesos-master

$ echo in_memory | sudo dd of=/etc/mesos-master/registry

## Mesos Python egg for use in authoring frameworks

$ curl -fL http://downloads.mesosphere.io/master/ubuntu/14.04/mesos-0.19.0_rc2-py2.7-linux-x86_64.egg -o /tmp/mesos.egg

$ easy_install /tmp/mesos.egg

We can download the latest Marathon 0.6 from here

$ tar xvzf marathon-0.6.0.tgz

Mesos uses Deimos for managing dockers, Deimos can installed via pip

$ pip install deimos

Also, we need to configure mesos to use Deimos,

$ mkdir -p /etc/mesos-slave

$ echo /usr/local/bin/deimos | sudo dd of=/etc/mesos-slave/containerizer_path

$ echo external | sudo dd of=/etc/mesos-slave/isolation

Now we can start all the services.

$ initctl reload-configuration

$ service docker start

$ service zookeeper start

$ service mesos-master start

$ service mesos-slave start

##### Starting Marathon #####

$ cd marathon-0.6.0

$ ./bin/start --master zk://localhost:2181/mesos --zk_hosts localhost:2181

Marathon will now start listening to port 8080, We can access the UI from the browser via this port, also via rest API using the same port.

curl localhost:8080/help   # gives us some details about the API's

I just went through the Deimos code, so under the hood they are using docker run with some default parameters like --sig-proxy, --rm, --cidfile, -v, -w and extra parameters that we are passing while creating the task via Marathon.

As of now, we still can’t pass details like Container image, Docker options via Marathon GUI. So we can use the Rest API for the time being. Below is a sample curl request for launcing a single container,

curl -X POST -H "Accept: application/json" -H "Content-Type: application/json" \
    localhost:8080/v2/apps -d '{
        "container": {"image": "docker:///libmesos/ubuntu", "options": ["--privileged"]},
        "cpus": 0.5,
        "cmd": "sleep 500",
        "id": "docker-tester",
        "instances": 1,
        "mem": 300
    }'

We can pass custom options to the docker run command via “options”. After making the curl request, we can check the syslog, as mesos will be logging into syslog by default. We can even see the Docker run command on the same.

Jun 27 07:24:58 vagrant-ubuntu-trusty-64 deimos[19227]: deimos.containerizer.docker.launch() exit 0 // docker run --sig-proxy --rm --cidfile /tmp/deimos/mesos/00d459fb-22ca-4af7-9a97-ef8a510905f2/cid -w /tmp/mesos-sandbox -v /tmp/deimos/mesos/00d459fb-22ca-4af7-9a97-ef8a510905f2/fs:/tmp/mesos-sandbox --privileged -p 31498:31498 -c 512 -m 300m -e PORT=31498 -e PORT0=31498 -e PORTS=31498 libmesos/ubuntu sh -c 'sleep 500'

We can also use the Marathon Rest API to check the status of the job which we started.

curl -X GET -H "Content-Type: application/json" localhost:8080/v2/apps

Below is the screenshort for the same from the Marathon UI.

We can also check if the container is launched via docker ps command.

A more detailed report about the Docker job which we have launched can be viewed via the default Mesos GUI listening on port 5050 on the Mesos master. Now we can test the scalability of the Job. Currently we have only one container running. So now we can try scaling say adding one more node. We can do it in two ways, like via PUT request using curl or using GUI

curl -X PUT -H "Content-Type: application/json" localhost:8080/v2/apps/docker-tester \
    "container": {"image": "docker:///libmesos/ubuntu", "options": ["--privileged"]},
            "cpus": 0.5,
            "cmd": "sleep 500",
            "id": "docker-tester",
            "instances": 2,     # increasing the instance count to 2
            "mem": 300
            }'

Now we can use the docker ps command to see if the new container is launched or not. Also we can see that status in UI also.

Similarly, we can scale down also. I’ve tested the same and all seems to be good. Marathon ensures that the docker process will be running. So incase if the process crashes Marathon will restart the same and ensures that the instances are up and running as per our configuration. There are a few other Open Sourced Mesos Scheduler’s like Apache Aurora, Airbnb’s Chronos. But for my requirement marathon is pretty straight and simple and also provides a very good Rest API layer for managing containers. Mesos, Marathon and Docker are still young, but provides a killer combination for managing clusters built over Docker containers.

Standard
Ubuntu, FreeSwitch, Voip, Docker

Dockerizing FreeSwitch – Docker Enters Telephony World

Docker has became one of the hottest topics in IT now a days. Docker is an open-source project that automates the deployment of applications inside software containers. Docker extends a common container format called Linux Containers (LXC), with a high-level API providing lightweight virtualization that runs processes in isolation.Docker uses LXC, cgroups, and the Linux kernel itself. Though i coudn’t make out to the DockerCon 2014 in SF, a lot new developments were announced on the DockerCon. Especially three new Opensource Projects libcontainer, libchan and libswarn. Docker is indeed creating a revolution in the container space, creating a next generation of scalable platform management. There are a lot PAAS services like Deis, resin.io, Dokku which are already using Docker in production. Another important and exciting project is CoreOS. CoreOS uses tools like SystemD, Fleet, EtcD to build a fully scalabale docker based cluster management system. I definitely need a separate blog to write about CoreOS, it’s really a super exciting project to play with.

Last week Docker Team released Version 1.0 of Docker. So i’ll be using the same in this new set up. It’s been almost 6 Month’s since i’ve been working @ Plivo as a DevOps Engineer. Telephony was really a very new platform for me. And my first companion was offcourse FreeSwitch,a scalable open source cross-platform telephony platform designed to route and interconnect popular communication protocols using audio, video, text or any other form of media. I was heavily using Vagrant for all my experiments in my mac. But after started using Docker, it really made me crazy. I’ve played for some time wiht LXC’s long back. So this was like a leap back to the container world.

There are a lot of concerns on using Virtual Machines in Telephony world. Especially for the server’s that handles the Real Time voice packets, as voice quality is pretty important in Telephony. Docker’s again more light weight isolated environment, and i decide to see how Docker can perform with such issues. If Docker handle Freeswitch smoothly, then i’m sure that we can use Docker for other telephony app’s like OpenSIPS/Kamailio etc, as they handle only sessions not the Media traffic. I know there are a lot of concerns like CPU load, Network etc, but this is like an initial move to test Docker into Telephony.

Setting Up Docker

Docker 1.0 is available from the Official Docker repo.

$ echo "deb http://get.docker.io/ubuntu docker main" > /etc/apt/sources.list.d/docker.list

$ apt-get update && apt-get install lxc-docker

Now we can check the Docker version using the docker binary itself.

$ docker version

Client version: 1.0.0
Client API version: 1.12
Go version (client): go1.2.1
Git commit (client): 63fe64c
Server version: 1.0.0
Server API version: 1.12
Go version (server): go1.2.1
Git commit (server): 63fe64c

Now Docker is installed, but we need some OS images to use with docker. We can build custom images using debootstrap etc. But there are official minimal images available in Docker HUB. We can search for the repositories and can pull those images via docker binary itself.

For example to pull the entire Ubuntu images, we can just do,

$ docker pull ubuntu

But this will download all the ubuntu images available in the repo. We can also do selective download by using the tag.

$ docker pull ubuntu:14:04

Once the images are downloaded, we can use images option in docker binary to see all the downloaded images.

$ docker images

REPOSITORY                      TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
ubuntu                          14.04               ad892dd21d60        10 days ago         275.5 MB

Here i’m not going to daemonize the container, i’ll be using the interactive option. But first, let’s start a new container.

$ docker run -t -i ubuntu:14.04 /bin/bash

This command will start a conatiner and will open up a bash session for us and we will be inside the bash session. Now to use an application we need to open up corresponding ports to outside world. We can use the “-p” option while starting a docker container to enable port forwarding. Under the hood, docker is using IPtables for the same. In the case of Freeswitch, we need to open 5060,5080 for the default Sofia profiles (Internal and External). Also we need to open the RTP ports. In this test i’ll be opening a predefined set of ports ie from “16384” to “16394”. (As my Docker host resides on Azure, creating an Endpoint for each port forward is really a pain, so i decided to open only a few). And also i’ll be opening port 22, so that we can have an ssh server inside the container.

$ docker run -t -i -p 2223:22 -p 5060:5060/tcp -p 5060:5060/udp -p 16384:16384/udp -p 16385:16385/udp -p 16386:16386/udp -p 16387:16387/udp -p 16388:16388/udp -p 16389:16389/udp -p 16390:16390/udp -p 16391:16391/udp -p 16392:16392/udp -p 16393:16393/udp -p 5080:5080/tcp -p 5080:5080/udp ubuntu:14.04 /bin/bash

This will start a new container and Docker by default will setup the IPtables for port forwarding. So now my IPtables looks like this.

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
   43 16850 ACCEPT     udp  --  !docker0 docker0  0.0.0.0/0            172.17.0.6           udp dpt:5080
    0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.6           tcp dpt:5080
  988  198K ACCEPT     udp  --  !docker0 docker0  0.0.0.0/0            172.17.0.6           udp dpt:16392
    0     0 ACCEPT     udp  --  !docker0 docker0  0.0.0.0/0            172.17.0.6           udp dpt:16389
    0     0 ACCEPT     udp  --  !docker0 docker0  0.0.0.0/0            172.17.0.6           udp dpt:16385
    0     0 ACCEPT     udp  --  !docker0 docker0  0.0.0.0/0            172.17.0.6           udp dpt:16393
 2026  405K ACCEPT     udp  --  !docker0 docker0  0.0.0.0/0            172.17.0.6           udp dpt:16388
 8817 1763K ACCEPT     udp  --  !docker0 docker0  0.0.0.0/0            172.17.0.6           udp dpt:16384
12144 8684K ACCEPT     udp  --  !docker0 docker0  0.0.0.0/0            172.17.0.6           udp dpt:5060
 4359  257K ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.6           tcp dpt:5060
 9917 1983K ACCEPT     udp  --  !docker0 docker0  0.0.0.0/0            172.17.0.6           udp dpt:16390
    0     0 ACCEPT     udp  --  !docker0 docker0  0.0.0.0/0            172.17.0.6           udp dpt:16387
    0     0 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.6           tcp dpt:22
   38  4848 ACCEPT     udp  --  !docker0 docker0  0.0.0.0/0            172.17.0.6           udp dpt:16391
    1   152 ACCEPT     udp  --  !docker0 docker0  0.0.0.0/0            172.17.0.6           udp dpt:16386
    0     0 ACCEPT     all  --  *      lxcbr0  0.0.0.0/0            0.0.0.0/0
    0     0 ACCEPT     all  --  lxcbr0 *       0.0.0.0/0            0.0.0.0/0
 431K  630M ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
 128K   19M ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0
   16  2460 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0

Now we can go ahead with Freeswitch compilation. In my previous blog, i’ve mentioned how to compile and set up freeswitch. Once freeswitch is ready, we need to make a few changes. By default, Freeswitch uses STUN to route through NAT, but this doesn’t work with Docker. So we have to set the external IP manually. In the Freeswitch installed folder, edit conf/autoload_configs/switch.conf.xml. In this file we can set the External IP manually. Add the below lines to switch_conf.xml.

<X-PRE-PROCESS cmd="set" data="external_sip_ip=<YOUR_EXTERNAL_IP>"/>
<X-PRE-PROCESS cmd="set" data="external_rtp_ip=<YOUR_EXTERNAL_IP>"/>

Also we need to modify the Default Sofia Profiles and need to set the ext-rtp-ip and ext-sip-ip to use our external IP added in the switch_conf.xml file while establishing connections. Add the below lines to the conf/sip_profiles/internal.xml and conf/sip_profiles/external.xml

<param name="ext-rtp-ip" value="$${external_rtp_ip}"/>
<param name="ext-sip-ip" value="$${external_sip_ip}"/>

Now we need to set teh RTP ip range to the range which we have forwarded while creting the container. So we need to edit conf/autoload_configs/switch.conf.xml

<param name="rtp-start-port" value="16384"/>
<param name="rtp-end-port" value="16394"/>

Once the changes are made, we can start the FreeSwitch service. Now to make sure that the External IP is working properly, we can check the sofia profile status using fs_cli. below is a sample output of the sofia profile status.

freeswitch@internal> sofia status profile internal
=================================================================================================
Name                internal
Domain Name         N/A
Auto-NAT            false
DBName              sofia_reg_internal
Pres Hosts          172.17.0.6,172.17.0.6
Dialplan            XML
Context             public
Challenge Realm     auto_from
RTP-IP              172.17.0.6
Ext-RTP-IP          <my_external_ip>
SIP-IP              172.17.0.6
Ext-SIP-IP          <my_external_ip>
URL                 sip:mod_sofia@<my_external_ip>:5060
BIND-URL            sip:mod_sofia@<my_external_ip>:5060;maddr=172.17.0.6;transport=udp,tcp
HOLD-MUSIC          local_stream://moh
OUTBOUND-PROXY      N/A
CODECS IN           OPUS,G722,PCMU,PCMA,GSM
CODECS OUT          OPUS,G722,PCMU,PCMA,GSM
TEL-EVENT           101
DTMF-MODE           rfc2833
CNG                 13
SESSION-TO          0
MAX-DIALOG          0
NOMEDIA             false
LATE-NEG            true
PROXY-MEDIA         false
ZRTP-PASSTHRU       true
AGGRESSIVENAT       false
CALLS-IN            0
FAILED-CALLS-IN     0
CALLS-OUT           0
FAILED-CALLS-OUT    0
REGISTRATIONS       1

Now freeswitch ahs started successfully. We can test some basic calls using softphones like Xlite, Telephone etc. By default, there are some default extensions and user’s available, so we can use the same for testing the calls. But i really wanted to try trunkning also and wanted to see the quality of the voice. So i created SIP trunking in Freeswitch using Plivo. And i tested a couple of calls to US and India DID’s and no issues were detected in the quality. But again i need to test the laod of the server’s when it startes handling concurrent calls and also the voice quality. But i decied to d oit as a Phase II. But as of now, Docker FreeSwitch is working perfectly like a physical machine with out issues.

So now we have a working FreeSwitch container, now here comes the main advantage of the Docker. We can create a new image with all these changes, so that nex time i dont need to work from scratch. I can use this saved image and a readymade Docker Freeswitch container can be launched in seconds. Since we are in interactive mode, we should not quit the session before it’s saved or else all the things will be lost,becoz dokcer will destroy the same. So open up a new shell on the docker host and use the commit option. But to use the commit command, we need to know the container id, so here docker ps command comes handy.

$ docker ps

CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS           NAMES

e7f3c02346d4        a4196763d248        /bin/bash           32 hours ago        Up 32 hours         0.0.0.0:2223->22/tcp, 0.0.0.0:5060->5060/tcp, 0.0.0.0:5060->5060/udp, 0.0.0.0:5080->5080/tcp, 0.0.0.0:5080->5080/udp, 0.0.0.0:16384->16384/udp, 0.0.0.0:16385->16385/udp, 0.0.0.0:16386->16386/udp, 0.0.0.0:16387->16387/udp, 0.0.0.0:16388->16388/udp, 0.0.0.0:16389->16389/udp, 0.0.0.0:16390->16390/udp, 0.0.0.0:16391->16391/udp, 0.0.0.0:16392->16392/udp, 0.0.0.0:16393->16393/udp   silly_turing

In my case “e7f3c02346d4” is the container ID. So i can use the same for commit. I won’t be commiting to the base Ubuntu image, as i can use the same for other purposes, so here i’ll commiting to a new image say “ubntu-fs-docker”

$ docker commit -m "<commit message>" e7f3c02346d4 ubntu-fs-docker

Now we can use this “ubntu-fs-docker” image to launch a ready made FreeSwitch server’s.

Docker is a very juvenile project about more than a year old. But the use cases are expanding heavily in the Modern IT world. Docker is fueling up a new generation of scalable servers. Wishing all the best for Docker and kudos to Solomon Hykes and the DotCloud team for opensourcing such a powerfull project

Standard
Elasticsearch, Kibana, logstash, Monitoring, Plivo, SIP, Ubuntu, Voip

Extending ELK Stack to VOIP Infrastructure

Being a DevOps guy, i always love metrics. Visualized metrics gives a good picture of what’s happening in our live battle stations. There are now a quite lot of Open Source tools for monitoring and visualizing. It’s more than a year since i’ve started using Logstash. It never turned me down. ElasticSearch-Logstash-Kibana (ELK) is a killer combination. Though i started Elasticsearch + Logstash as a log analyzer, later StatsD and Graphite took it to the next level. When we have a simple infrastructure it’s easy to monitor. But when the infra starts scaling, it becomes quite difficult to keep track of all the events happening inside each nodes. Though service checks can help, but there is still limitation for it. I faced a lot of scenarios where things breaks but service checks will be fine. Under such scenarios logs are the only hope. They have all these events captured.

At Plivo, we manage a variety of servers from SIP, Media, Proxy, WebServers, DB’s etc. Being a fully Cloud based system, i really wanted to have a system which can keep track of all the live events/status of what’s really happening inside our infra. So my plan was to collect two important stats, 1) Server’s events 2) Application events.

Collectd and Logstash

Collectd is a daemon which collects system performance statistics periodically. Since we have a lot Server’s which handle Realtime Media, it’s a very critical component for us. We need to ensure that the server’s are not getting overloaded and there is no latency in network. I’ve been using Logstash heavily for stashing all my logs. And there is a stable input plugin for collectd to send the all the system metrics to logstash.

First we need to enable the Network Plugin, and then we need to mention our Logstash server IP and port so that collectd can start injecting metrics. Below is a sample colectd configuration.

Hostname    "test.plivo.com"
Interval 10
Timeout 4
Include "/etc/collectd/filters.conf"
Include "/etc/collectd/thresholds.conf"
ReportStats true
    LogLevel info
LoadPlugin interface
LoadPlugin load
LoadPlugin memory
LoadPlugin network
<Plugin interface>
    Interface "eth0"
    IgnoreSelected false
</Plugin>
<Plugin network>
    Server "{logstash_server_ip}" "logstash_server_port"    # if no port number is mentioned, it will take the default port number (25826)
</Plugin>

Now on the Logstash server, we need to add the CollectD plugin on to the input filter in the logstash’s config file.

input {
      collectd {
      port => "5555"    # default port is 25826
      }
}

Now we are set. Based the plugins enabled in the collectd config file, collctd will start sending the metrics to Logstash on the Interval mentioned in the config, default is 10s. So in my case, i wanted the Load, CPU usage, Memory usage, Bandiwdth (TX and RX) etc. There are default plugins for all these metrics, which we can just enable it in the config file. We also had some custom plugins to collect some custom metrics. BTW writing custom plugin is pretty easy in Collectd.

Now using the Logstash’s Elasticsearch output plugin, we can keep these metrics in Elasticsearch. Now this where Kibana comes in. We can start visualizing these metrics via Kibana. We need to create a custom Lucene Query. Once we have the query, we can create a custom histogram’s for each of these queries. Below aresome sample Lucene queries that we can use with Kibana.

For Load -> collectd_type:"load" AND host:"test.plivo.com"
For Network usage -> collectd_type:"if_octets" AND host:"test.plivo.com"

Below is the screenshot of histogram for Load and Network (TX and RX)

Log Events

Now next is to collect the events from the application logs. We use SIP protocol for all our VOIP sessions. So all our SIP server’s are very critical for us. SIP is pretty similar to HTTP. The response codes are very similar to HTTP responses, ie 1xx, 2xx, 3xx, 4xx, 5xx, 6xx. So i wrote some custom grok patterns so keep track of all of these responses and stores the same on the Elasticsearch.

The second stats which i was interested was our SIP registrar server. We provide SIP endpoints to our customers so that they can use the same with SIP/Soft phones. So i was more interested on stats like Number of registrations/sec, Auth error rates. Plus using ElasticSearch’s MAP facet’s i can create BetterMap. In my previous blog post’s i’ve mentioned on how to create these bettermaps using Kibana and Elasticsearch. Below bettermap screenshot shows us the SIP endpoint registrations from various locations in the last 2 hours.

Now using the Kibana we can start visualizing all these data’s. Below is a sample of Dashboard that i’ve created using Kibana.

ELK stack proved to be an amazing combination. We are currently injecting 3 million events every day and ElasticSearch was blazingly fast in indexing all theses.

Standard
logstash, Monitoring, Ubuntu

Real Time Web-Monitoring Using Lumberjack-Logstash-Statsd-Graphite

For the last few days i was playing around with my two of my favourite tools Logstash and StatsD. Logstash, StatsD, Graphite together makes a killer combination. So i decided to test this combination along with Lumberjack for Real time Monitoring. I’m going to use, Lumberjack as the log shipper from the webserver, and then Logstash will stash the log’s porperly and and using the statsd output plugin i will ship the metrics to Graphite. In my previous blog, i’ve explained how to use Lumberjack with Logstash. Lumberjack will be watching my test web server’s access logs.

By default, i’m using the combined apache log format, but it doesnot have the original response time for each request as well as the total reponse time. So we need to modify the LogFormat, in order to add the two. Below is the LogFormat which i’m using for my test setup.

LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\" %D %>D" combined

Once the LogFormat is modified, restart the apache service in order to make the change to be effective.

Setting up Logstash Server

First Download the latest Logstash Jar file from the Logstash site. Now we need to create a logstash conf file. By default there is a grok pattern available for apache log called “COMBINEDAPACHELOG”, but since we have added the tow new fields for the response time, we need to add the same for grok pattern also. So below is a pattern which is going to be used with Logstash.

pattern => "%{COMBINEDAPACHELOG} %{NUMBER:resptime} %{NUMBER:resptimefull}"

So the Logstash conf file will look like this,

input {
      lumberjack {
        type => "apache-access"
        port => 4444
        ssl_certificate => "/etc/ssl/logstash.pub"
        ssl_key => "/etc/ssl/logstash.key"
  }
}

filter {
  grok {
        type => "apache-access"
    pattern => "%{COMBINEDAPACHELOG} %{NUMBER:resptime} %{NUMBER:resptimefull}"
  }
}

output {
  stdout {
    debug => true
      }
  statsd {
    type => "apache-access"
    host => "localhost"
    port => 8125
    debug => true
    timing => [ "apache.servetime", "%{resptimefull}" ]
    increment => "apache.response.%{response}"
  }
}

Setting up STATSD

Now we can start setting up the StatsD daemon. By default, Ubuntu’s latest OS ships with newer verision of NodeJS and NPM. So we can install it using APT/Aptitude.

$ apt-get install nodejs npm

Now clone the StatsD github repository to the local machine.

$ git clone git://github.com/etsy/statsd.git

Now create a local config file “localConfig.js” with the below contents.

{
graphitePort: 2003
, graphiteHost: "127.0.0.1"
, port: 8125
}

Now we can start the StatsD daemon.

$ node /opt/statsd/stats.js /opt/statsd/localConfig.js

The above command will start the StatsD in foreground. Now we can go ahead with setting up the Graphite.

Setting up Graphite

First, let’s install the basic python dependencies.

$ apt-get install python-software-properties memcached python-dev python-pip sqlite3 libcairo2 libcairo2-dev python-cairo pkg-config

Then, we can start installing Carbon and Graphite dependencies.

        cat >> /tmp/graphite_reqs.txt << EOF
        django==1.3
        python-memcached
        django-tagging
        twisted
        whisper==0.9.9
        carbon==0.9.9
        graphite-web==0.9.9
        EOF

$  pip install -r /tmp/graphite_reqs.txt

Now we can configure Carbon.

$ cd /opt/graphite/conf/

$ cp carbon.conf.example carbon.conf

Now we need to create a storage schema.

        cat >> /tmp/storage-schemas.conf << EOF
        # Schema definitions for Whisper files. Entries are scanned in order,
        # and first match wins. This file is scanned for changes every 60 seconds.
        # [name]
        # pattern = regex
        # retentions = timePerPoint:timeToStore, timePerPoint:timeToStore
        [stats]
        priority = 110
        pattern = ^stats\..*
        retentions = 10s:6h,1m:7d,10m:1y
        EOF


$ cp /tmp/storage-schemas.conf /opt/graphite/conf/storage-schemas.conf

Also we need to create a log directory for graphite.

$ mkdir -p /opt/graphite/storage/log/webapp

Now we need to copy over the local settings file and initialize database

$ cd /opt/graphite/webapp/graphite/

$ cp local_settings.py.example local_settings.py

$ python manage.py syncdb

Fill in the necessary details including the super user details while initializing the database. Once the database is initialized we can start the carbon cache and graphite webgui.

$ /opt/graphite/bin/carbon-cache.py start

$ /opt/graphite/bin/run-graphite-devel-server.py /opt/graphite

Now we can access the dashboard using the url, “http://ip-address:8080&#8221;. Once we have started the carbon cache, we can start the Logstash server.

$ java -jar logstash-1.1.13-flatjar.jar agent -f logstash.conf -v

Once the logstash has loaded all the plugins successfully, we can start shipping logs from the test webserver using Lumberjack. Since i’ve enabled the STDOUT plugin, i can see the output coming from the Logstash server. Now we can start accessing the real time graph’s from graphite gui. There are several other alternative for the Graphite GUI like Graphene, Graphiti, Graphitus, GDash. Anyways Logstash-StatsD-Graphite proves to be a wonderfull combination. Sorry that i could not upload any screenshot for now, but i will upload soon

Standard