YAHD – Yet another Hadoop Distro

Intel have introduced their own Hadoop distribution, so alongside Cloudera, MapR and all the others out there, there’s now another one. Is this a move away from hardware? Probably not, in my opinion, but it does mark the start of larger-scale integrated solutions.

Two issues seem to be addressed: firstly, a decent management console, and secondly, integration with statistical tools like R. Revolution Analytics have done a lot of work on the Hadoop/R scene, so much so that they have their own distro too. Cloudatics (he wrote this) wrote a workable web-based front end so anyone could do Hadoop-scale processing without the technicalities.

[Image: Intel distribution Hadoop stack]

It looks like Intel are using the Hadoop platform to upsell their server hardware, storage and the all-important maintenance contracts (as IBM do).

The self-contained, on-demand storage is important. You don’t want to waste bandwidth transferring data between your server and a Hadoop distributed file system. The more centralised the processing and the storage are, the quicker and cheaper it will all work.
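To put a rough number on the bandwidth point, here is a back-of-envelope sketch. The figures (1 TB of data, a 1 Gbit/s link) are purely illustrative assumptions, not measurements from any Intel or Hadoop setup, and the helper `transfer_seconds` is a made-up name for this example. It ignores protocol overhead and contention, so real transfers would likely be slower.

```python
def transfer_seconds(data_bytes: float, link_bits_per_second: float) -> float:
    """Idealised time to move data over a link (no overhead, no contention)."""
    return data_bytes * 8 / link_bits_per_second


TB = 10**12    # one terabyte, decimal (assumed dataset size)
GBIT = 10**9   # one gigabit per second (assumed link speed)

seconds = transfer_seconds(1 * TB, 1 * GBIT)
print(f"Moving 1 TB over 1 Gbit/s takes about {seconds / 3600:.1f} hours")
```

Even under these generous assumptions, that is a couple of hours spent before any processing starts, which is the cost that keeping storage and compute together avoids.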

Above all, over the next two or three years, while some are writing Hadoop off, it’s worth keeping a very close eye on Hadoop and R working together.
