While putting the recommendation engine demo together over the last few days, my thoughts turned to exporting data out of a data store into CSV.
The original demo uses MySQL to store transaction data, and there's nothing wrong with that. I put together a small script and a SQL command to export the data.
select * into outfile '/tmp/recom.csv' fields terminated by ',' lines terminated by '\n' from recom;
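All well and good, until it comes to file permissions. MySQL will write to the /tmp directory with no problems, but the file is created by the MySQL server process rather than by you, so it ends up owned by the server's user. It will also complain if the file already exists, because INTO OUTFILE refuses to overwrite.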
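One way around the "file already exists" complaint is to give every run its own file name. Here's a minimal sketch of that idea, not what the demo actually does: the helper names export_path and export_sql are made up for illustration, and it assumes the MySQL server is allowed to write to whatever directory you pass in.

```shell
#!/bin/sh
# Sketch only: export_path and export_sql are hypothetical helper names.

# Build a per-run output path so INTO OUTFILE never meets an existing file.
export_path() {
    # $1: directory the MySQL server can write to, $2: base file name
    printf '%s/%s_%s.csv\n' "$1" "$2" "$(date +%Y%m%d_%H%M%S)"
}

# Print the export statement for a given output path and table.
export_sql() {
    # $1: output path, $2: table name
    # The literal '\n' argument is passed through %s, so it reaches MySQL unchanged.
    printf "select * into outfile '%s' fields terminated by ',' lines terminated by '%s' from %s;\n" "$1" '\n' "$2"
}

# Example (needs a running MySQL server, so it is left commented out):
# mysql mydb -e "$(export_sql "$(export_path /tmp recom)" recom)"
```

The timestamp only has one-second resolution, so two exports started in the same second would still collide.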
You could use Sqoop as an alternative. It's a handy little tool that gets used alongside many of the big data tools kicking around.
sqoop import --connect jdbc:mysql://localhost/mydb --username auser --table recom --as-textfile
You can also extend it to run specific queries that pull out only certain parts of the data, or to do selective runs that pick up only the rows written since the last import. That makes it a better solution when you want to export in an incremental fashion.
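As a rough sketch of the incremental idea, Sqoop's import has an append mode that only fetches rows whose check column is greater than a value you supply. The function below just prints such a command so you can inspect it before running it. The function name is made up, and the column name id is an assumption: the post doesn't say what columns recom has, so swap in whichever column grows with each new row.

```shell
#!/bin/sh
# Sketch only: incremental_import_cmd is a hypothetical helper name.
# ASSUMPTION: recom has an ever-increasing column; "id" is a placeholder for it.

# Print a Sqoop import that fetches only rows with id greater than $1.
incremental_import_cmd() {
    # $1: the highest id value already imported
    printf 'sqoop import --connect jdbc:mysql://localhost/mydb --username auser --table recom --as-textfile --incremental append --check-column id --last-value %s\n' "$1"
}

# Example: print the command for rows after id 1000.
# incremental_import_cmd 1000
```

You would need to keep track of the last value between runs yourself, or look at Sqoop's saved jobs, which can remember it for you.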