Deep delving into RepKnight with MTV data


I’ve had the honour of being a beta tester for RepKnight for a long time, but I’ve never taken a real deep dive into its features and searching.

So today seems as good an excuse as any to do something with it. The nice thing is being able to pull just our specific data out of the firehose in general; in this case the hashtag #mtvema is what we’re after. I’m not using any major features of RepKnight per se, just the fact that I can dump the comment data out to a CSV file. To be fair, RepKnight is saving my butt by collating huge amounts of data for me. I love it.
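For anyone wanting to poke at a dump like this, here is a minimal sketch of picking the hashtag rows back out of a CSV file. I'm assuming the export has a header row and a text column called "comment"; that column name, the sample rows and the rows_with_hashtag helper are all made up for illustration and are not RepKnight's actual format.

```python
import csv
import io


def rows_with_hashtag(csv_file, hashtag="#mtvema", text_column="comment"):
    """Yield rows whose text column mentions the hashtag (case-insensitive).

    text_column is an assumed column name, not a documented export field.
    """
    needle = hashtag.lower()
    for row in csv.DictReader(csv_file):
        # Plain substring match, so "#mtvema2012" would also count as a hit.
        if needle in (row.get(text_column) or "").lower():
            yield row


# Tiny in-memory sample standing in for a real export file.
sample = io.StringIO(
    "author,comment\n"
    "alice,Loving the show tonight #MTVEMA\n"
    "bob,Nothing to do with it\n"
)
matches = list(rows_with_hashtag(sample))
print(len(matches))
```

With a real file you would pass open("export.csv", newline="", encoding="utf-8") in place of the in-memory sample.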

Next stage will be to run a Hadoop cluster and find the counts of the unique words and other boffinry. It’s going to be a fun week of deep diving into the data. Updates as they happen.
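As a rough idea of what that word count looks like before it goes anywhere near a cluster, here is a small local sketch in the map/reduce shape. It is plain Python standing in for the real Hadoop job, not the job itself; the map_words and reduce_counts names, the tokenising regex and the sample comments are my own assumptions.

```python
import re
from collections import Counter

# Keep a leading # or @ so hashtags and mentions stay intact as tokens.
WORD_RE = re.compile(r"[#@]?\w+")


def map_words(comment):
    """Map step: emit a (word, 1) pair for every token in one comment."""
    for word in WORD_RE.findall(comment.lower()):
        yield word, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per word."""
    totals = Counter()
    for word, n in pairs:
        totals[word] += n
    return totals


# Made-up comments standing in for the exported data.
comments = ["Best show ever #mtvema", "best night #MTVEMA"]
counts = reduce_counts(pair for c in comments for pair in map_words(c))
print(counts.most_common(3))
```

On a cluster the map and reduce steps would run as separate tasks over chunks of the file, but the logic per comment stays the same.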

I wanted to have a proper look at the Twitter Storm API, as there was an excellent demo built around the Emmy awards not so long back. It relies on quite a few prerequisites to get everything up and running, whereas I already have all my Hadoop things ready here.

The next few days are going to be fun… trust me.

