OK, so why is machine learning getting so much attention, and what is machine learning anyway? How can machine learning help or reshape business? Remember how the internet reshaped traditional businesses? Machine learning will do the same thing to businesses today.
For example, imagine your laptop sends you a message that one of your hard drives will fail in 10 days, and the laptop company has already sent you a brand new one so you can back up your files. How cool is that? This idea extends to any machinery industry; it is called predictive maintenance. Another example is autonomous driving: Google, Tesla, and other car companies are working on it. They are building computer vision systems that can “see”, and that is the hard part; the driving itself is fairly easy compared to the “seeing” part.
Machine learning, deep learning, artificial intelligence (AI), data mining, pattern recognition, supervised and unsupervised learning: people hear these terms from the media or at conferences. How do these concepts relate to each other?
AI is the most general term; it covers the algorithms and efforts to make machines talk, see, move, play, and think like a human being. As of today, no one claims to have built a machine that thinks like a human being; however, there are algorithms that enable machines to talk, see, play games, and move around.
Machine learning consists of algorithms that can learn from data to help machines perform those tasks, and it is a subcategory of AI. Deep learning is in turn a subcategory of machine learning; it deals specifically with neural network algorithms. Data mining and pattern recognition are used similarly, but data mining focuses on finding insights in data, while pattern recognition usually implies computer vision problems: finding interesting patterns in images or videos.
Supervised and unsupervised learning are two different learning styles adopted by machine learning algorithms. Supervised learning means you have labels for your training data; unsupervised learning means you don't. In the hard drive example above, if you don't know which hard drives in your dataset failed and which didn't, then you cannot use a supervised learning algorithm; you would have to look for unsupervised learning algorithms instead. Most deep learning algorithms fall into the supervised category, so if you don't have labeled data, you cannot use them: you either have to create the labels manually or start collecting labeled data.
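To make the distinction concrete, here is what the two dataset shapes look like for the hard drive example. The field names and values are invented for illustration:

```python
# Hypothetical hard drive dataset in two forms. In the supervised case
# each record carries a label (0 = healthy, 1 = failing); in the
# unsupervised case we only have the measurements themselves.
labeled = [
    ({"temp_c": 35.0, "spin_up_ms": 800.0}, 0),
    ({"temp_c": 61.0, "spin_up_ms": 1400.0}, 1),
]
unlabeled = [
    {"temp_c": 35.0, "spin_up_ms": 800.0},
    {"temp_c": 61.0, "spin_up_ms": 1400.0},
]

# A supervised learner consumes (features, label) pairs...
features, label = labeled[0]
assert label in (0, 1)

# ...while an unsupervised learner (e.g. clustering) must group the
# records using the measurements alone.
assert "temp_c" in unlabeled[0]
```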
How do machine learning algorithms solve those tasks? Let's take the hard drive example and make it extremely simple. Imagine you have captured one measurement: the temperature of the hard drive. Let's also assume this measurement is enough to predict hard drive failure, and that a hard drive has only two stages: failing and healthy. Finally, assume the function mapping between hard drive stages and the measurement is a simple linear equation: Y = a*X + b.
Now this becomes an easy task. We collect the historical data of every hard drive and put the stages in Y; since there are only two stages, we use 0 and 1 to represent healthy and failing respectively. We put the temperatures in X. With the many data points we collected, we estimate a and b by minimizing the squared error, i.e. line fitting. This is called training in machine learning. Once you have estimated a and b, for any new temperature X you can calculate a Y, and this is your prediction. What we described here is a supervised learning example, but there are other machine learning algorithms, such as k-means clustering, that you can train without labeled data. This example is simplified to make the discussion easy to understand, so you may wonder where the difficult part is.
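The toy training step above can be sketched in a few lines of Python. All measurement values below are invented for illustration, and the closed-form least-squares solution stands in for whatever fitting routine you would actually use:

```python
# Hypothetical training data: hard drive temperatures (X) and their
# labels (Y): 0 = healthy, 1 = failing. All values are made up.
temps = [30.0, 32.0, 35.0, 40.0, 48.0, 55.0, 60.0, 65.0]
stages = [0, 0, 0, 0, 1, 1, 1, 1]

# "Training": estimate a and b in Y = a*X + b with the closed-form
# least-squares solution (minimizing the squared error).
n = len(temps)
mean_x = sum(temps) / n
mean_y = sum(stages) / n
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(temps, stages)) \
    / sum((x - mean_x) ** 2 for x in temps)
b = mean_y - a * mean_x

def predict(temp):
    """Predict a drive's stage for a new temperature: threshold Y at 0.5."""
    return 1 if a * temp + b >= 0.5 else 0

print(predict(33.0))  # -> 0 (healthy)
print(predict(62.0))  # -> 1 (failing)
```

In practice you would use a proper classifier (the linear fit plus a 0.5 threshold is the "extremely simple" version the text describes), but the training/prediction split is the same.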
Let's make it harder. What happens if your measurements are not just one temperature, but 2,000 different measurements? Will you use all of them to train? Will you use the measurement values directly? Will you transform the values, or combine them? To answer these questions and make data-driven decisions, data scientists perform a very time-consuming task: feature engineering. Most of a data scientist's time is spent on it.
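As a toy illustration of what feature engineering can look like, here are a few hand-crafted transforms on a single (hypothetical) drive record; the field names, constants, and transforms are all invented for the sketch:

```python
import math

# One raw record of hypothetical drive measurements.
raw = {"temp_c": 48.0, "spin_up_ms": 950.0, "reallocated_sectors": 12}

def engineer_features(r):
    """Example transforms a data scientist might try by hand."""
    return {
        # rescale a raw value into a comparable range
        "temp_scaled": (r["temp_c"] - 25.0) / 50.0,
        # log-transform a heavy-tailed count
        "log_realloc": math.log1p(r["reallocated_sectors"]),
        # combine two measurements into a single ratio feature
        "stress_ratio": r["spin_up_ms"] / max(r["temp_c"], 1.0),
    }

features = engineer_features(raw)
```

Deciding which of the 2,000 raw measurements to keep, rescale, or combine like this is exactly the slow, judgment-heavy work the text refers to.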
Next, how do you know the mapping between hard drive stages and measurements is actually linear? It could be non-linear. This question is usually for machine learning researchers rather than data scientists: researchers have to consider it when they develop a machine learning algorithm, while data scientists just need to pick an appropriate algorithm based on their knowledge of the problem.
Last, how do you make sure your model (the estimated a and b) is a true representation of the relationship between the features (measurements) and the labels (hard drive stages)? Is the model close enough, and is it general enough? The “close enough” problem is what data scientists typically focus on: they try to minimize the model's error rate. Being general enough is a different problem. Since we never have all the data about a problem, we don't want to over-fit the model to the data we do have; an over-fitted model behaves perfectly on historical data but responds poorly to new data. These are general problems in the machine learning field.
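An extreme, contrived illustration of the generalization problem: a “model” that simply memorizes its training data gets a perfect score on history and is useless on anything new. The data below is invented:

```python
# Memorizing every training pair is the extreme case of over-fitting:
# zero error on historical data, no answer at all for new data.
train = {30.0: 0, 40.0: 0, 55.0: 1, 65.0: 1}

def memorizer(temp):
    """Perfect on the training set, useless on unseen temperatures."""
    return train.get(temp)  # returns None for any temperature not seen

assert all(memorizer(t) == y for t, y in train.items())  # perfect on history
print(memorizer(62.0))  # -> None: no prediction for new data
```

Real over-fitting is less blatant (the model interpolates, but badly), yet the failure mode is the same: performance on historical data overstates performance on new data.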
Deep learning (neural networks) changed the way data scientists work. It can perform automatic feature engineering, which means data scientists no longer have to do manual feature engineering, as long as they have a large training dataset. Yet deep learning was developed over 30 years ago; why has it become so hot only now?
Back in the 1980s there was neither the computing power nor the large datasets that neural networks need to shine. With the huge growth of data and computing power, deep learning has started to outperform traditional machine learning in many fields, such as image classification, natural language processing, machine translation, and voice recognition.
There is a very interesting coincidence involving deep learning. Deep learning was developed by mimicking how brain cells work, and the section of the brain it tries to mimic is the part that processes visual information (images). Deep learning started to gain wide attention when Alex Krizhevsky introduced a deep convolutional neural network (AlexNet) in the ImageNet competition, an image classification challenge. What a coincidence!
Deep learning is great, but it comes with drawbacks. First, it is hard to explain: since the algorithm discovers and selects the features by itself, it is really hard for data scientists to explain what is going on and how the model made a particular prediction or classification. Second, tuning the network is an art, not a scientific task. Finally, you need a large amount of data; if you don't have it, deep learning cannot perform as well as traditional machine learning algorithms.
Will computers eventually have self-awareness and real intelligence? Maybe, but not soon; not until we figure out how human beings have self-awareness and intelligence. Will computers eventually perform better than human beings at some tasks and replace human workers? Yes, it is already happening, and the pace will pick up quickly before we even notice. Just as machines replaced factory workers, machines and computers will replace knowledge workers such as lawyers, doctors, drivers, programmers, and teachers.