Using data to craft email #marketing messages. #hadoop

Let’s start right out: hey, wha’ happen? Starbucks, I have three words in response: “I hate chocolate”. Right, now to the main feature.

All Encompassing Email Marketing

It’s clear that this is an all-in email to every Starbucks loyalty card member, but the simple fact remains that they have more than enough data to tell them that I don’t buy chocolate products. It’s all in the POS data, and the bigger crime against “digital marketing” is that, with all the big data tools at their disposal, it would have been easy enough to work out. (Look even closer and they might have noticed that I stopped drinking coffee in May 2014 as well… not by choice.)

You Have Data!

I’m not just directing this at Starbucks but at everyone: whether you are bricks and mortar or e-commerce, you have data. You may not have the skills to deal with it. I’ve generated a fake POS log and kept it simple: each transaction is one of the following types of beverage:

  • Americano
  • Flat White
  • Tea
  • Latte
  • Capuccino

And I’ve got 100,000 transactions.

Jason-Bells-MacBook-Pro:~ Jason$ wc coffee.csv 
 100000 124795 774647 coffee.csv
Jason-Bells-MacBook-Pro:~ Jason$
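
If you want a similar fake log to experiment with, here is a minimal Python sketch. The post does not show how coffee.csv was generated or how it is laid out, so the weights, the helper names and the one-beverage-per-line layout below are all assumptions made for illustration.

```python
import random
from collections import Counter

# Beverage tokens as they appear in the post's output.
BEVERAGES = ["Americano", "Capuccino", "Flat_White", "Latte", "Tea"]

# Hypothetical weights: tea is made far more likely than the rest.
WEIGHTS = [1, 1, 1, 1, 8]


def generate_log(n, seed=None):
    """Return n randomly chosen beverage names, one per transaction."""
    rng = random.Random(seed)
    return rng.choices(BEVERAGES, weights=WEIGHTS, k=n)


def write_log(path, transactions):
    """Write one beverage per line (an assumed layout, not the post's)."""
    with open(path, "w", encoding="utf-8") as handle:
        for beverage in transactions:
            handle.write(beverage + "\n")


if __name__ == "__main__":
    log = generate_log(1000, seed=42)
    print(Counter(log).most_common())
```

Calling `write_log("coffee_sample.csv", generate_log(100_000))` would produce a file of the same size as the one used here, under those assumptions.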

Now, with this all being from one customer, it’s quite easy: the only thing to do is mine that data and look at the frequency of the beverage types. Find the one that scores the most (this data was randomly generated for this post) and then make a decision about how to market to that customer. One thing I can do really quickly is use Hadoop: without doing any coding, the wordcount example that ships with it gives me the word frequencies.

Jason-Bells-MacBook-Pro:~ Jason$ /usr/local/hadoop-1.2.1/bin/hadoop jar /usr/local/hadoop-1.2.1/hadoop-examples-1.2.1.jar wordcount coffee.csv coffeeoutput

With small data it takes no time at all, and you don’t need Amazon Web Services or Azure for this sort of thing; you can do it from your own machine.

15/02/13 10:06:25 INFO mapred.JobClient: Map output records=100000
15/02/13 10:06:25 INFO mapred.JobClient: Combine input records=100000
15/02/13 10:06:25 INFO mapred.JobClient: Reduce input records=5

The results speak for themselves: we know this customer clearly buys tea…

Americano 8325
Capuccino 8391
Flat_White 8499
Latte 8324
Tea 66461
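
The same tally can be cross-checked in plain Python. This is only a sketch: it splits each line on whitespace and counts the tokens, which is roughly what a word count does, and the function name and sample data are made up for illustration.

```python
from collections import Counter


def beverage_counts(lines):
    """Tally whitespace-separated tokens, as a word count would."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


if __name__ == "__main__":
    # Tiny hypothetical sample; in practice pass an open file instead.
    sample = ["Tea", "Latte", "Tea", "Flat_White", "Tea"]
    for beverage, count in sorted(beverage_counts(sample).items()):
        print(beverage, count)
```

With a real log you would pass the file object itself, for example `beverage_counts(open("coffee.csv"))`, assuming each line holds the beverage token.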

So is there much point in running a promotion on Americanos to this customer when it’s clear they buy roughly 8x more tea than any other drink? How about a variation on tea, a new blend for example? What’s going to pull the customer in?
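
To put numbers on that decision, the sketch below takes the five counts from the output above and works out the customer's favourite, its share of all purchases and how far ahead it is of the runner-up. The function names are hypothetical and this is just one simple way to rank the drinks.

```python
# Counts copied from the wordcount output in this post.
COUNTS = {
    "Americano": 8325,
    "Capuccino": 8391,
    "Flat_White": 8499,
    "Latte": 8324,
    "Tea": 66461,
}


def favourite(counts):
    """Return the most purchased beverage and its share of all purchases."""
    top = max(counts, key=counts.get)
    return top, counts[top] / sum(counts.values())


def lead_over_runner_up(counts):
    """Return how many times larger the top count is than the second one."""
    ordered = sorted(counts.values(), reverse=True)
    return ordered[0] / ordered[1]


if __name__ == "__main__":
    name, share = favourite(COUNTS)
    print(f"{name}: {share:.1%} of purchases")
    print(f"{lead_over_runner_up(COUNTS):.2f}x the runner-up")
```

Tea accounts for about two thirds of this customer's transactions, so a tea promotion is the obvious candidate.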

Too Many Missed Opportunities

Now don’t get me wrong, me not liking chocolate classes me as an outlier (picture a normal distribution curve: I’m on the far left, -3SD from the mean). Perhaps there’s an argument to segment the outliers and market them a different message? 68% of the customers will happily accept the original message, but what’s the profit potential in sending a different message to the others?
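
The 68% figure is the share of a normal distribution that lies within one standard deviation of the mean. A small sketch using Python's standard library shows where it comes from and how thin the -3SD tail is. It assumes customer preference really is normally distributed, which is an illustration here rather than something measured, and the function names are made up.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def share_within(k):
    """Share of a normal population within k standard deviations of the mean."""
    return STANDARD_NORMAL.cdf(k) - STANDARD_NORMAL.cdf(-k)


def share_below(z):
    """Share of a normal population lying below z standard deviations."""
    return STANDARD_NORMAL.cdf(z)


if __name__ == "__main__":
    print(f"within 1 SD: {share_within(1):.1%}")
    print(f"outside 1 SD: {1 - share_within(1):.1%}")
    print(f"below -3 SD: {share_below(-3):.2%}")
```

Under that assumption roughly a third of customers sit outside the one-SD band, which is a sizeable audience for a second message even though the far -3SD tail itself is tiny.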

To Starbucks

Three little words: I like tea.


Pursue this idea further in your own business: Jason Bell is a Data/Hadoop consultant based in Northern Ireland who helps companies globally with various Big Data, Hadoop and Spark projects. He also offers training on Hadoop, the Hadoop ecosystem and Spark to developers and anyone interested in what these technologies can do. He’s also the author of “Machine Learning – Hands On For Developers and Technical Professionals”.
