
Using Riak with Logstash

For the last few days I was playing around with Riak, a distributed database. It’s very simple to configure and use, and of course it supports MapReduce. I wanted to try out MapReduce, and since Logstash has a plugin to write data into Riak, I decided to use it with Logstash on an Ubuntu 12.04 machine.

Configuring Riak

Installing Riak is very simple; it has only a few dependencies.

apt-get install libssl0.9.8 erlang

Once the dependencies are installed, we have to download and install the deb package of Riak from its website.


dpkg -i riak_1.2.1-1_i386.deb

Once Riak is installed, go to “/etc/riak”, where the config files are available. We can change the name of the Riak node by editing the “vm.args” file. By default Riak will listen on “”, but we can change this by editing the “app.config” file. To enable HTTPS, we need to uncomment the https section in the Riak core config and set the paths to the server key and certificate.

Riak comes with a built-in admin console, which currently has very minimal functionality. It shows the status of the Riak nodes as well as the members of the Riak ring. To enable it, open “app.config”, go to “riak_control_config” and change “enabled,false” to “enabled,true”. The user name and password can be set in the userlist option.

If we have multiple machines, we can create a Riak cluster using the riak-admin tool. Currently I have only one machine with Riak installed.

In Riak, data is stored in “buckets”. A bucket is a container and keyspace for data stored in Riak, with a set of common properties for its contents (the number of replicas, or n_val, for instance). Buckets are accessed at the top of the URL hierarchy under “riak”, e.g. /riak/bucket.
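
The URL layout can be sketched in a few lines of JavaScript (Node.js). This is only an illustration: the helper name riakObjectUrl and the host name riakserverip are placeholders, and 8098 is assumed to be the HTTP port. Only the /riak/bucket layout, with a key as the next path segment, reflects Riak itself.

```javascript
// Hypothetical helper: builds the HTTP URL of a bucket, or of a key inside it,
// following the /riak/<bucket>/<key> layout.
function riakObjectUrl(host, port, bucket, key) {
  var url = "http://" + host + ":" + port + "/riak/" + encodeURIComponent(bucket);
  if (key !== undefined) {
    // Keys may contain characters that are not URL-safe, so encode them.
    url += "/" + encodeURIComponent(key);
  }
  return url;
}

console.log(riakObjectUrl("riakserverip", 8098, "bucketname"));
console.log(riakObjectUrl("riakserverip", 8098, "bucketname", "somekey"));
```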

Configuring Logstash

Now we have a Riak machine listening on port “8098”. Next we need to configure Logstash to send data to Riak. This is very simple because Logstash has an output plugin which can write directly to Riak.

In the output section of the Logstash config file, add the riak output plugin. It should look like this:

riak {
  bucket => "bucketname"
  type => "typename"
  nodes => ["riakserverip", "8098", "riakserverip", "8098"]
}


In the nodes option we have to list the Riak node IPs and ports. Since I have only one Riak node, I’m listing the same IP twice.

That’s it. Once we start Logstash, it will begin writing data into the bucket we specified in the config file.

There is a good GUI for Riak called “rekon”. Just get the source code from GitHub, edit the “” and change the IP mentioned in it to the IP Riak listens on, then execute it. Now we can access the GUI using the URL below.


Using this we can see the buckets inside Riak and also the corresponding key values.

Testing the MapReduce Function

This is one of the main features of Riak. As an example, I’m going to write a MapReduce function that looks through the keys in my bucket for the keyword “mylinux”, which is the hostname of my machine. The function returns each key together with the number of occurrences. Below is a simple MapReduce function.

function(riakObject) {
  // The g flag makes match() return every occurrence, so m.length is the count.
  var m = riakObject.values[0].data.match(/mylinux/g);
  return [[riakObject.key, (m ? m.length : 0)]];
}
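
Before sending a map function to Riak, it can be tried locally in Node.js against a hand-made object. The sketch below assumes the stored value arrives as a plain string in values[0].data; the sample key and log line are made up, and the two function names are only for illustration. It also shows why the pattern matters for the count: match() with a plain string stops at the first hit, while a global regular expression collects every occurrence.

```javascript
// Plain-string pattern: match() returns only the first hit, so the count is 0 or 1.
function mapFirstMatch(riakObject) {
  var m = riakObject.values[0].data.match("mylinux");
  return [[riakObject.key, (m ? m.length : 0)]];
}

// Global regular expression: match() returns all hits, so the count is exact.
function mapAllMatches(riakObject) {
  var m = riakObject.values[0].data.match(/mylinux/g);
  return [[riakObject.key, (m ? m.length : 0)]];
}

// Made-up stand-in for the object Riak passes to a map phase.
var sample = {
  key: "sample-key",
  values: [{ data: "mylinux sshd: session opened on mylinux" }]
};

console.log(mapFirstMatch(sample)); // [ [ 'sample-key', 1 ] ]
console.log(mapAllMatches(sample)); // [ [ 'sample-key', 2 ] ]
```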

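To execute the MapReduce function, POST it as a JSON job to Riak’s “/mapred” endpoint. The bucket name and server address below are the placeholders used earlier, and the function is written on one line because it travels inside a JSON string:

curl -X POST -H 'Content-Type: application/json' -d '{
  "inputs": "bucketname",
  "query": [{"map": {"language": "javascript",
    "source": "function(riakObject) { var m = riakObject.values[0].data.match(/mylinux/g); return [[riakObject.key, (m ? m.length : 0)]]; }"}}]
}' http://riakserverip:8098/mapred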

The above command returns the keys in the bucket along with the number of occurrences of the keyword “mylinux” for each key (0 when the keyword is absent).
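
Hand-escaping a function inside a JSON string is error-prone, so the request body can also be generated. The sketch below is one possible way to do that in Node.js, not the only one. buildMapReduceJob is a made-up helper name and bucketname is a placeholder bucket; the inputs/query layout is the job format that Riak’s /mapred endpoint accepts for a single JavaScript map phase.

```javascript
// Hypothetical helper: wraps a JavaScript map function (given as source text)
// in the JSON job description that is POSTed to /mapred.
function buildMapReduceJob(bucket, mapSource) {
  return {
    inputs: bucket,
    query: [{ map: { language: "javascript", source: mapSource } }]
  };
}

// The map function as a single-line string: returns [key, occurrence count].
var mapSource =
  "function(riakObject) {" +
  " var m = riakObject.values[0].data.match(/mylinux/g);" +
  " return [[riakObject.key, (m ? m.length : 0)]];" +
  " }";

// JSON.stringify handles all quoting and escaping of the function source.
var body = JSON.stringify(buildMapReduceJob("bucketname", mapSource));
console.log(body);
```

Saving the printed line to a file, say job.json, lets curl send it with -d @job.json instead of an inline string.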