logstash, Monitoring, Ubuntu

Real Time Web-Monitoring Using Lumberjack-Logstash-Statsd-Graphite

For the last few days i was playing around with my two of my favourite tools Logstash and StatsD. Logstash, StatsD, Graphite together makes a killer combination. So i decided to test this combination along with Lumberjack for Real time Monitoring. I’m going to use, Lumberjack as the log shipper from the webserver, and then Logstash will stash the log’s porperly and and using the statsd output plugin i will ship the metrics to Graphite. In my previous blog, i’ve explained how to use Lumberjack with Logstash. Lumberjack will be watching my test web server’s access logs.

By default, i’m using the combined apache log format, but it doesnot have the original response time for each request as well as the total reponse time. So we need to modify the LogFormat, in order to add the two. Below is the LogFormat which i’m using for my test setup.

LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\" %D %>D" combined

Once the LogFormat is modified, restart the apache service in order to make the change to be effective.

Setting up Logstash Server

First Download the latest Logstash Jar file from the Logstash site. Now we need to create a logstash conf file. By default there is a grok pattern available for apache log called “COMBINEDAPACHELOG”, but since we have added the tow new fields for the response time, we need to add the same for grok pattern also. So below is a pattern which is going to be used with Logstash.

pattern => "%{COMBINEDAPACHELOG} %{NUMBER:resptime} %{NUMBER:resptimefull}"

So the Logstash conf file will look like this,

input {
      lumberjack {
        type => "apache-access"
        port => 4444
        ssl_certificate => "/etc/ssl/logstash.pub"
        ssl_key => "/etc/ssl/logstash.key"

filter {
  grok {
        type => "apache-access"
    pattern => "%{COMBINEDAPACHELOG} %{NUMBER:resptime} %{NUMBER:resptimefull}"

output {
  stdout {
    debug => true
  statsd {
    type => "apache-access"
    host => "localhost"
    port => 8125
    debug => true
    timing => [ "apache.servetime", "%{resptimefull}" ]
    increment => "apache.response.%{response}"

Setting up STATSD

Now we can start setting up the StatsD daemon. By default, Ubuntu’s latest OS ships with newer verision of NodeJS and NPM. So we can install it using APT/Aptitude.

$ apt-get install nodejs npm

Now clone the StatsD github repository to the local machine.

$ git clone git://github.com/etsy/statsd.git

Now create a local config file “localConfig.js” with the below contents.

graphitePort: 2003
, graphiteHost: ""
, port: 8125

Now we can start the StatsD daemon.

$ node /opt/statsd/stats.js /opt/statsd/localConfig.js

The above command will start the StatsD in foreground. Now we can go ahead with setting up the Graphite.

Setting up Graphite

First, let’s install the basic python dependencies.

$ apt-get install python-software-properties memcached python-dev python-pip sqlite3 libcairo2 libcairo2-dev python-cairo pkg-config

Then, we can start installing Carbon and Graphite dependencies.

        cat >> /tmp/graphite_reqs.txt << EOF

$  pip install -r /tmp/graphite_reqs.txt

Now we can configure Carbon.

$ cd /opt/graphite/conf/

$ cp carbon.conf.example carbon.conf

Now we need to create a storage schema.

        cat >> /tmp/storage-schemas.conf << EOF
        # Schema definitions for Whisper files. Entries are scanned in order,
        # and first match wins. This file is scanned for changes every 60 seconds.
        # [name]
        # pattern = regex
        # retentions = timePerPoint:timeToStore, timePerPoint:timeToStore
        priority = 110
        pattern = ^stats\..*
        retentions = 10s:6h,1m:7d,10m:1y

$ cp /tmp/storage-schemas.conf /opt/graphite/conf/storage-schemas.conf

Also we need to create a log directory for graphite.

$ mkdir -p /opt/graphite/storage/log/webapp

Now we need to copy over the local settings file and initialize database

$ cd /opt/graphite/webapp/graphite/

$ cp local_settings.py.example local_settings.py

$ python manage.py syncdb

Fill in the necessary details including the super user details while initializing the database. Once the database is initialized we can start the carbon cache and graphite webgui.

$ /opt/graphite/bin/carbon-cache.py start

$ /opt/graphite/bin/run-graphite-devel-server.py /opt/graphite

Now we can access the dashboard using the url, “http://ip-address:8080&#8221;. Once we have started the carbon cache, we can start the Logstash server.

$ java -jar logstash-1.1.13-flatjar.jar agent -f logstash.conf -v

Once the logstash has loaded all the plugins successfully, we can start shipping logs from the test webserver using Lumberjack. Since i’ve enabled the STDOUT plugin, i can see the output coming from the Logstash server. Now we can start accessing the real time graph’s from graphite gui. There are several other alternative for the Graphite GUI like Graphene, Graphiti, Graphitus, GDash. Anyways Logstash-StatsD-Graphite proves to be a wonderfull combination. Sorry that i could not upload any screenshot for now, but i will upload soon


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s