The final phase: time to visualise.
Visualisation is not always the be-all and end-all of Big Data, but it is important for telling the story.
At the end of the last post we had mined the data, counted all the hashtags and whittled them down to the three brands we wanted to focus on.
Visualising in D3
D3 (Data-Driven Documents) is a JavaScript library that takes a lot of the work out of creating graphs, maps and all sorts of other things. From our perspective it’s handy because we can load CSV/TSV files with ease.
Loading D3 is as simple as:
<script src="http://d3js.org/d3.v3.min.js"></script>
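With the library loaded, reading the counts back in might look something like the sketch below. It assumes a file called brands.tsv with a header line of hashtag and count separated by a tab; both the file name and the column names are my own placeholders, so adjust them to match whatever your Hadoop job wrote out. The toRow helper is also hypothetical and only exists to turn the count into a number.

```javascript
// Hypothetical row accessor: d3.tsv hands every field over as a string,
// so the count is coerced to a number before it is used in a chart.
function toRow(d) {
  return { hashtag: d.hashtag, count: +d.count };
}

// Only runs in a browser where the d3.v3 script above has been loaded.
if (typeof d3 !== "undefined") {
  // "brands.tsv" is an assumed file name with a "hashtag<TAB>count" header line.
  d3.tsv("brands.tsv", toRow, function (error, rows) {
    if (error) {
      console.error(error);
      return;
    }
    console.log(rows); // array of {hashtag, count} objects
  });
}
```

Doing the conversion in the accessor means the rest of the page can treat count as a number without repeating the coercion.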
I’m going to create a pie chart based on the example on the D3 site. You can have a look at the HTML in my GitHub repo for this blog post series.
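For a rough idea of the shape of that code, here is a minimal sketch in the style of the D3 v3 pie chart example. It is not the exact HTML from the repo. The file name brands.tsv, its hashtag and count columns, and the drawPie and percentLabel helpers are all assumptions made for illustration.

```javascript
// Hypothetical helper: builds a slice label such as "#brand 25%".
function percentLabel(row, total) {
  return row.hashtag + " " + Math.round(100 * row.count / total) + "%";
}

// Sketch of a D3 v3 pie chart. Rows are assumed to look like
// {hashtag: "#brand", count: 42}; selector is where the SVG gets appended.
function drawPie(rows, selector) {
  var width = 400, height = 400, radius = Math.min(width, height) / 2;
  var total = d3.sum(rows, function (d) { return d.count; });

  var color = d3.scale.category10();
  var arc = d3.svg.arc().outerRadius(radius - 10).innerRadius(0);
  var pie = d3.layout.pie().sort(null).value(function (d) { return d.count; });

  var svg = d3.select(selector).append("svg")
      .attr("width", width)
      .attr("height", height)
    .append("g")
      .attr("transform", "translate(" + width / 2 + "," + height / 2 + ")");

  // One group per slice, holding the wedge and its label.
  var slices = svg.selectAll(".arc")
      .data(pie(rows))
    .enter().append("g")
      .attr("class", "arc");

  slices.append("path")
      .attr("d", arc)
      .style("fill", function (d) { return color(d.data.hashtag); });

  slices.append("text")
      .attr("transform", function (d) { return "translate(" + arc.centroid(d) + ")"; })
      .attr("dy", ".35em")
      .style("text-anchor", "middle")
      .text(function (d) { return percentLabel(d.data, total); });
}

// Only runs in a browser with D3 v3 loaded; the file name is an assumption.
if (typeof d3 !== "undefined" && typeof document !== "undefined") {
  d3.tsv("brands.tsv", function (d) {
    return { hashtag: d.hashtag, count: +d.count };
  }, function (error, rows) {
    if (error) { console.error(error); return; }
    drawPie(rows, "body");
  });
}
```

With only three brands in play, three slices with a percentage each is about as much as a pie chart can comfortably show.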
Spring XD/Hadoop/D3 Considerations
In the four posts in this series we’ve covered data consumption, storage, processing and visualisation. Spring XD will keep gathering data until we tell it to stop. The Hadoop job we ran was a one-off, but there’s nothing to stop us putting these steps in a cron job that updates every hour, though we do have to keep an eye on how long those jobs take to run.
Also remember that D3 takes time to load and parse data on the client side, so don’t overwhelm the user with too much information. From an aesthetic point of view, too much information is also confusing for the reader.
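One way to keep the chart readable is to trim the data before it ever reaches D3. The sketch below is my own suggestion rather than anything from the earlier posts: a hypothetical topN helper that keeps the n largest counts and folds everything else into a single "other" row. It assumes rows shaped like {hashtag, count}.

```javascript
// Hypothetical helper: keep the n biggest rows and lump the rest into "other".
function topN(rows, n) {
  // Copy before sorting so the caller's array is left untouched.
  var sorted = rows.slice().sort(function (a, b) { return b.count - a.count; });
  var head = sorted.slice(0, n);
  var rest = sorted.slice(n).reduce(function (sum, d) { return sum + d.count; }, 0);
  if (rest > 0) {
    head.push({ hashtag: "other", count: rest });
  }
  return head;
}

// Example with made-up numbers.
var sample = [
  { hashtag: "#a", count: 5 },
  { hashtag: "#b", count: 1 },
  { hashtag: "#c", count: 3 },
  { hashtag: "#d", count: 2 }
];
console.log(topN(sample, 2));
```

Doing the trimming on the server side, before the file is written, would save the browser the download as well as the parsing.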
As with all these things it’s a case of getting your hands dirty and trying things out.