I spent most of Sunday morning working on some updates and a couple of corrections to my book, Machine Learning: Hands-On for Developers and Technical Professionals. The comments and feedback all centred on a common theme.
Validating the Muse
Interestingly, a lot of the comments centred on decision trees, which at least proves they're still popular, but it also proves that you can sit down with pen and paper and do some of the grunt work yourself to validate the model.
Now, most people will load a dataset into something like Weka and let the system do all the work. And you know what? That's okay; there's nothing wrong with that. At the same time, though, I could, with some effort, work out the information gain to find the potential root node in a tree myself with a calculator and prove the model was good.
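That calculation is easy enough to check in a few lines of code too. Here's a minimal sketch in Python; the toy weather data and attribute names are invented purely for illustration, but the entropy and information gain formulas are the standard ID3 ones:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, attr_index, labels):
    """Entropy of the whole set minus the weighted entropy
    of the subsets produced by splitting on one attribute."""
    total = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr_index], []).append(label)
    remainder = sum((len(s) / total) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

# Toy "play outside?" data: (outlook, windy) -> play
rows = [("sunny", "no"), ("sunny", "yes"), ("rain", "no"),
        ("rain", "yes"), ("overcast", "no"), ("overcast", "yes")]
labels = ["yes", "no", "yes", "no", "yes", "yes"]

for i, name in enumerate(["outlook", "windy"]):
    print(name, round(information_gain(rows, i, labels), 3))
# The attribute with the highest gain becomes the root node.
```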
The same could be said for things like the Apriori algorithm, Naive Bayes and Bayesian Networks, Linear Regression, K-Means clustering and, at a push, Linear Support Vector Machines. If you have a pen, paper and a calculator, you can start working things out.
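Naive Bayes is probably the clearest example: the whole thing reduces to counting and multiplying, which a calculator, or a few lines of Python, can confirm. The spam/ham word counts below are invented for illustration:

```python
# Naive Bayes "by hand": P(class | feature) is proportional to
# P(class) * P(feature | class). Counts are made up: 8 spam and
# 12 ham emails, with per-class counts for the word "offer".
p_spam, p_ham = 8 / 20, 12 / 20
p_offer_given_spam = 5 / 8    # "offer" appeared in 5 of 8 spam emails
p_offer_given_ham = 1 / 12    # and in 1 of 12 ham emails

spam_score = p_spam * p_offer_given_spam
ham_score = p_ham * p_offer_given_ham
evidence = spam_score + ham_score

print("P(spam | offer) =", round(spam_score / evidence, 3))  # 0.833
print("P(ham  | offer) =", round(ham_score / evidence, 3))   # 0.167
```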
When we get to neural networks, that's where things start to get hazy. Really hazy.
The Media’s Love Affair with Neural Nets
Oh boy, does the tech press like a good ANN story. Whether it's the DeepMind team beating Lee Sedol in three games of Go, IBM's Watson doing the whole Jeopardy thing or Google's self-driving car, they are all, without doubt, sexy AI stories that are going to generate discussion and debate. And the joy of debate is that it stirs up polar opposites of public opinion: love it or loathe it, it's going to help us or it's going to destroy us.
The core concepts of neural networks, from perceptron weights to activation functions, are quite easy to grasp. The problems arise once the models have been created: the maths can become so black-box-like that the models are difficult to prove or write off. One thing's for sure: over time, and with enough iterations, it will get better.
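A single perceptron really is that simple. Here's a minimal sketch (the AND-gate data and learning rate are my own illustrative choices); the haziness only arrives once you stack layers of these and the learned weights stop meaning anything you can read off:

```python
# A single perceptron: a weighted sum of inputs through a step activation.
def step(x):
    return 1 if x >= 0 else 0

def predict(weights, bias, inputs):
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

# Train on an AND gate with the classic perceptron learning rule.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias, lr = [0.0, 0.0], 0.0, 0.1

for _ in range(20):                      # a few epochs is plenty for AND
    for inputs, target in data:
        error = target - predict(weights, bias, inputs)
        weights = [w + lr * error * x for w, x in zip(weights, inputs)]
        bias += lr * error

print([predict(weights, bias, i) for i, _ in data])  # [0, 0, 0, 1]
```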
Like I said in the book, “One of the keys to understanding the artificial neural network is knowing that the application of the model implies you’re not exactly sure of the relationship of the input and output nodes. You might have a hunch, but you don’t know for sure. The simple fact of the matter is, if you did know this, then you’d be using another machine learning algorithm.”
Though there are good books on the subject, the models themselves are always difficult to prove; with enough training you'll get results. Even publications I hold in high esteem, such as "Data Mining" by Witten, Frank and Hall, merely skirt around the mechanics of neural nets, which, to be fair, made me feel a whole lot better when I was writing the book.
Artificial Intelligence as Frameworks
What I believe we are seeing is the chasm point of artificial intelligence frameworks. Some have been around for a while, Weka and RapidMiner for instance, and others, such as TensorFlow, are new on the scene. The common thread, though, is that they provide a starting point for machine learning and AI for the mass-market developer.
It's very much like the web frameworks of the Web 2.0 era. The main tipping point was Ruby on Rails, which obscured a lot of the hard work going on under the covers. This led to a plethora of web frameworks in a variety of languages where you really didn't need to know what was going on technically; it was just a case of downloading, setting a few things up and then going through the motions of creating the objects you needed. There came a point where it was more important to know how to get the framework working than to know the underlying language doing all the work. I believe we're at the same point with artificial intelligence.
Data Velocity + AI + Lack of Algorithmic Knowledge = Concern
While some machine learning algorithms have been around for forty-plus years, it's only recently that they've come into vogue, thanks to computing processing power, the vast amounts of data being generated and corporations' need to push the bottom line down to keep stakeholders happy over the long term.
Is it a good idea to push vast AI capability into the hands of developers who may have no prior knowledge of how the stuff really works? I don't believe so, and any corporation saying "we need to do data science" to a team that doesn't know what it's doing is committing commercial suicide in my eyes.
If you look at Google and Tesla, they've been analysing data over a long period of time. They've got the right people involved, whether that be developers, quants or hardcore maths folk, to measure, refine and deliver. Even then it goes wrong. The first self-driving-car crash? Well, it was bound to happen at some point. You're working on a probability, and regardless of the odds it can, and will, at some point go the way you weren't expecting. The point with AI, though, is that you're not delving into the algorithm to tweak it at that stage; if you do, what's the knock-on effect of the change? You don't really know. That far down the line, all you can do is provide more data for the algorithm to learn from.
Basically, you're left with unanswered questions, but you just have to go with the system because the algorithm says so.
AI and Machine Learning Costs Money
To be done right, these technologies take time to develop and deliver. They also need repeat testing to ensure that the application is behaving as you'd expect. That's all fine with supervised learning, as you've already defined the outcomes in your training data. Unsupervised learning comes with its own set of issues that need to be looked at closely before it's deployed to the real world.
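For supervised learning, that repeat testing is straightforward to set up, because the expected outcomes are already in the training data. A minimal sketch, assuming scikit-learn and its bundled iris dataset:

```python
# Repeat testing for supervised learning: because the labels are known,
# cross-validation can confirm the model behaves as expected.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(random_state=0)

# Five train/test splits; each score is accuracy on a held-out fold.
scores = cross_val_score(model, X, y, cv=5)
print("fold accuracies:", scores.round(3))
print("mean accuracy:", round(scores.mean(), 3))
# With unsupervised learning there are no labels to score against,
# which is why it needs closer scrutiny before deployment.
```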
While any developer can download these tools and use them, I still firmly believe it's vitally important to have a knowledge of how these algorithms work. It's not the easiest thing in the world to do, either. I'd rather be able to explain what happened than shrug my shoulders with a blank look on my face.
"The plane crashed because the algorithm did it" is just not a reasonable excuse.