I’ve had the honour of being a beta tester for RepKnight for a long time but never took a real deep dive into its features and searching.
So today seems like as good an excuse as any to do something with it. The nice thing is being able to pick our specific data out of the firehose in general. In this case the hashtag #mtvema is what we’re after. I’m not using any major features of RepKnight per se, just the fact that I can dump the comment data out to a CSV file. To be fair, RepKnight is saving my butt in terms of collating huge amounts of data for me. I love it.
The next stage will be to run a Hadoop cluster to count the unique words and do other boffinry. It’s going to be a fun week of deep diving into the data. Updates as they happen.
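To give a rough idea of what the word counting looks like before it goes anywhere near a cluster, here is a minimal local sketch in Python. It follows the same map and reduce shape a Hadoop job would use, but runs in a single process. The column name `comment` and the sample rows are my own assumptions for illustration, not the actual layout of the RepKnight export.

```python
import csv
import io
import re
from collections import Counter

# Keep hashtags and @mentions intact as single tokens.
WORD_RE = re.compile(r"[#@]?\w+")


def map_words(text):
    """Map step: emit a (word, 1) pair for every token in one comment."""
    for word in WORD_RE.findall(text.lower()):
        yield word, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per word."""
    totals = Counter()
    for word, n in pairs:
        totals[word] += n
    return totals


def count_words(csv_file, column="comment"):
    """Count words in one column of a CSV file with a header row.

    The column name "comment" is a hypothetical placeholder.
    """
    reader = csv.DictReader(csv_file)
    pairs = (pair for row in reader for pair in map_words(row[column]))
    return reduce_counts(pairs)


if __name__ == "__main__":
    # Made-up sample rows standing in for the real export.
    sample = io.StringIO(
        "user,comment\n"
        "alice,Loving the show tonight #mtvema\n"
        "bob,Best show ever #MTVEMA\n"
    )
    for word, n in count_words(sample).most_common(3):
        print(word, n)
```

Lowercasing before counting means #mtvema and #MTVEMA land in the same bucket, which is probably what you want for a hashtag.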
I wanted to have a proper look at the Twitter Storm API, as there was an excellent demo built around the Emmy awards not so long back. It relies on quite a few prerequisites to get everything up and running, whereas I already have all my Hadoop things ready here.
The next few days are going to be fun… trust me.