Redis, Sinatra, TwitterAPI

TweetGrabber – a Live Tweet Capturer using REDIS+SINATRA+TwitterAPI

Twitter has become one of the most powerful social networking sites, and people share a lot of good information there. People even report outages of various sites and services on Twitter. Many tech companies maintain official accounts to keep track of users’ comments on their services and products, and they even interact with users. Websites like “downrightnow.com” use Twitter as an information source to identify outages of many famous websites and services. Even on our personal sites we use JavaScript to display our tweets and other statuses. But since Twitter API version 1.1, which requires OAuth, many of the jQuery plugins have become obsolete. This time I got a requirement to build a system that keeps track of tweets based on custom keywords. I wanted to show all the tweets on a category basis, but on a single page, with some scrolling effects, and at the same time the scroller should keep updating with new tweets at a regular interval. Which means I’m mostly interested in the new trends going on on Twitter.

Since I’m a Rubyist, I decided to build a Ruby app with a Sinatra web frontend, one of my favourite web frameworks. I’m also a hard-core lover of Redis, which is fast since it runs in RAM, and I don’t want to keep my old tweets. So my app is pretty simple: there will be a Feeder which talks to Twitter’s API and gets the tweets; these tweets are then stored in Redis; and the Sinatra frontend fetches the tweets and displays them in a scrolling fashion. Since I’m a CLI junkie and not familiar with HTML and UI work, I decided to go with Twitter Bootstrap to build the HTML pages.

There is a Ruby gem called “TweetStream” which works very well with Twitter API v1.1, so I’m going to use this gem to talk to Twitter. Below is the simple architecture diagram for my app.

Let’s see each component in detail.

Feeder

The Feeder is a Ruby script which constantly talks to the Twitter API and grabs the latest streaming tweets based on the search keywords. All the grabbed tweets are then stored in the Redis database. Since I have to display the tweets corresponding to each keyword in a separate scroller, I’m using a separate Redis database for each. So the Feeder has multiple Ruby methods, one per keyword, each writing into the corresponding Redis DB. Below is one of the Ruby methods in the Feeder script.

                        #######  FEEDER  #######

require 'tweetstream'
require 'redis'
require 'json'

TweetStream.configure do |config|
  # The consumer key/secret and OAuth tokens have to be generated from
  # Twitter's developer site, dev.twitter.com
  config.consumer_key       = 'xxxxxxxxxxxxxx'
  config.consumer_secret    = 'xxxxxxxxxxxxxx'
  config.oauth_token        = 'xxxxxxxxxxxxxx'
  config.oauth_token_secret = 'xxxxxxxxxxxxxx'
end

def tweet_general
  # This TweetStream client will keep tracking the keyword "opensource"
  TweetStream::Client.new.track('opensource') do |status|
    if status.lang == 'en'
      push(
        'id'                => status[:id],
        'text'              => status.text,
        'username'          => status.user.screen_name,
        'userid'            => status.user[:id],
        'name'              => status.user.name,
        'profile_image_url' => status.user.profile_image_url,
        'received_at'       => Time.new.to_i,
        'user_link'         => "http://twitter.com/#{status.user.screen_name}"
      )
    end
  end
end

def push(data)
  @db = Redis.new
  # LPUSH the tweet into the Redis DB
  @db.lpush('tweets_general', data.to_json)
end

Redis DB

In Redis, I’m going to use the LIST data type. Lists are simply lists of strings, sorted by insertion order. It is possible to add elements to a Redis list by pushing new elements onto the head (the left) or onto the tail (the right) of the list.

So I will be pushing the new tweets in from the head, and to prevent over-population, I will be calling the LTRIM operation periodically to clear the old tweets out of the database. All these operations are done from the Feeder by the corresponding Ruby methods.
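The push-and-trim pattern can be sketched as below; the cap of 100 tweets and the helper name are my own choices, not fixed parts of the app:

```ruby
require 'json'

# Push a new tweet onto the head of a Redis list, then trim the tail
# so the list never grows past `max` entries. `db` is a connected
# Redis client (e.g. Redis.new from the redis gem).
def push_and_trim(db, key, tweet, max = 100)
  db.lpush(key, tweet.to_json)
  db.ltrim(key, 0, max - 1)   # keep indexes 0..max-1, drop older tweets
end
```

The Feeder would call this instead of a plain LPUSH, e.g. `push_and_trim(Redis.new, 'tweets_general', tweet_hash)`, so every write also enforces the cap.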

Sinatra Frontend

I’m using Sinatra for building the frontend. Since I’m not much familiar with HTML and UI work, I decided to use Twitter Bootstrap for building the layout. For each category I’ve created a collapsible table, so that we can expand and collapse the required tables. The next task was scrolling the tweets; for that I found a jQuery plugin called Totem Ticker. I also enabled refreshing for the div element which contains this scroller, so that after each refresh the variable which supplies tweets to the scroller gets updated with the newer tweets from the corresponding Redis DB.
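The fetch side can be sketched like this, assuming the Redis key used in the Feeder above; the route and template names in the comment are hypothetical, since the view details depend on the layout:

```ruby
require 'json'

# Pull the newest `count` tweets for one category out of Redis and
# parse them back into hashes for the view. `db` is a Redis client.
def latest_tweets(db, key, count = 20)
  db.lrange(key, 0, count - 1).map { |t| JSON.parse(t) }
end

# In the Sinatra app this feeds the scroller's div, roughly:
#
#   get '/tweets/general' do
#     @tweets = latest_tweets(Redis.new, 'tweets_general')
#     erb :ticker    # the Totem Ticker <ul> is built from @tweets
#   end
```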

As of now the app is working fine, but I’m planning to extend it with more features, like adding the keywords dynamically from the web frontend and displaying only the tables for those keywords. I will keep digging to make it more powerful :-). I’m going to push the working code to my GitHub account soon, so that others can also play around with it and extend it for their own requirements.

couchdb

CouchDB – a NoSQL DB With a Powerful REST API

CouchDB is an open source NoSQL database which comes with a powerful REST API. It stores data in JSON format, and uses JavaScript as its query language, via MapReduce. JSON data is very easy to understand and can be parsed very easily. This time I got a requirement to build a command-line utility to display results from an internal monitoring tool. The tool is a bit old and does not have any API, and finding the latest result through the frontend was consuming too much time. One advantage was that the frontend can display results in HTML and XML, so curl was able to query the server using the URL and display the XML output. The URL is unique for each host, as it is composed of a few details like Location, SerialNo, Domain, OS etc., which are unique per host. So I decided to have a local database which contains all these unique host details. For this purpose I don’t need any relational database; a NoSQL database is sufficient. I decided to use CouchDB, because it comes with a wonderful REST API.
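The URL composition can be sketched as below; the host fields, the base URL and the path layout are only illustrative, since the real tool’s scheme isn’t shown here:

```ruby
# Compose the monitoring tool's per-host result URL out of the unique
# host details. Both the base URL and the path layout are hypothetical.
def result_url(host, base = "http://monitor.example.com")
  "#{base}/#{host['location']}/#{host['domain']}/" \
    "#{host['os']}/#{host['serial']}.xml"
end

# Fetching it is then the same thing curl does:
#   require 'net/http'
#   xml = Net::HTTP.get(URI(result_url(host)))
```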

Since I’m a Ruby lover, I decided to use this CouchDB Ruby wrapper. Since my requirement with CouchDB was pretty simple, this wrapper itself was sufficient for me. The wrapper can be used to perform operations like reading, writing and deleting documents as well as databases. It basically uses Ruby’s HTTP library to talk to CouchDB’s REST API. By default CouchDB is available in most of the Linux repositories; for Windows and Mac, the installers are available on the CouchDB website itself.

For example, on Ubuntu/Debian, the below command will install the CouchDB server.

sudo apt-get install couchdb -y

By default, the CouchDB package comes with a web frontend called “Futon”, which can be accessed in a browser using the below URL.

http://localhost:5984/_utils/

Below is the screenshot of the Futon frontend.

futon

We can perform all the operations through this web interface as well. So I created a database, and created a document for each host with the required fields. Now these documents can be accessed via the REST API in JSON format, so that my Ruby script can fetch all the necessary data and compose the exact URL for fetching the check results for the corresponding hosts.
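Writing those per-host documents can be sketched with the wrapper like this; the database name, document id and field names are my own examples:

```ruby
require 'json'

# Store one CouchDB document per host. `server` is the wrapper's
# Couch::Server instance, which exposes put(path, json_body).
def store_host(server, db, id, fields)
  server.put("/#{db}/#{id}", fields.to_json)
end

# Usage, after require './couch.rb':
#   server = Couch::Server.new("localhost", "5984")
#   server.put("/hosts", "")   # create the database first
#   store_host(server, "hosts", "host001",
#              "location" => "DC1", "serial" => "ABC123",
#              "domain"   => "example.com", "os" => "ubuntu")
```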

Below is a sample read operation that we can perform from our Ruby scripts. But before using it, we need to include the wrapper in our script. For that we add ‘require ./couch.rb’; we can use either a relative or an absolute path. Once the wrapper is included, we can start performing operations against CouchDB.

require './couch.rb'
require 'json'

server = Couch::Server.new("localhost", "5984")
res = server.get("/foo/document_id")
json = JSON.parse(res.body)
puts json

The operation above reads the document and converts it into a Ruby hash using the json library. Now I can use that data to collect the required fields, so that my script can form the exact fetch URL for each host and perform the result fetch. There is also a couchdb gem for Ruby, which can perform more operations.

Apart from this, CouchDB has another important feature: MapReduce. Yes, it can perform MapReduce, and I’m trying to find out whether MapReduce can help me, so that I can extend my app to do more things. Once I get any weird idea with MapReduce, I will update this blog with that as well.
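As a first sketch of what that looks like, a view’s map function is plain JavaScript shipped inside a design document; the database name and field names below are only an illustration:

```ruby
require 'json'

# A design document holding one view: the JavaScript map function
# indexes host documents by their "os" field, so
# GET /hosts/_design/hosts/_view/by_os?key="ubuntu" would list them.
design = {
  "_id"   => "_design/hosts",
  "views" => {
    "by_os" => {
      "map" => 'function(doc) { if (doc.os) { emit(doc.os, doc.serial); } }'
    }
  }
}

design_json = design.to_json
# Pushed with the wrapper:
#   server = Couch::Server.new("localhost", "5984")
#   server.put("/hosts/_design/hosts", design_json)
```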
