Machine Learning (ML) is the science of how computers can improve their perception, cognition, and action with experience. In other words, ML teaches computers to learn from experience, using adaptive algorithms and computational methods to learn information directly from data without relying on a predetermined equation as a model. Machine Learning is about how computers can act without being explicitly programmed for each task. As a field of Artificial Intelligence (AI), ML improves computers' performance through data, knowledge, experience, and interaction.
Machine Learning started with two breakthroughs:
1. Arthur Samuel's pioneering work on computer gaming and AI, which made it possible for computers to learn by themselves instead of being told everything they need to know and how to perform each task.
2. The growth of the Internet over the past decade, which made huge amounts of digital information available for analysis.
Engineers realized that it was far more efficient to code computers to think and understand the world like humans, give them access to all the information available on the Internet, and let them learn than to teach them everything explicitly. This approach keeps the innate advantages computers hold over humans, such as speed and accuracy, although a model can still inherit bias from the data it learns from.
Neural Networks, computer systems designed to recognize and classify information in a way loosely modeled on the human brain, are one of the most prominent ways to implement Machine Learning, though not the only one. A Neural Network essentially works on probabilities: it makes statements, decisions, or predictions with a degree of certainty based on the data it is fed. By adding a feedback loop, the computer can modify its future approach after being told, or sensing, whether its decision was right or wrong; this is what allows the “learning”.
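As a rough illustration of the feedback loop described above, here is a minimal sketch of a single artificial neuron (a perceptron) written in plain Python. The function names, the learning rate, and the toy logical-OR dataset are assumptions made for this example. The neuron also uses a hard yes/no threshold rather than a probability, so it is a simplification of how real networks express certainty.

```python
def train_perceptron(samples, epochs=20, learning_rate=0.1):
    """Train one artificial neuron using an error-driven feedback loop."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            prediction = 1 if activation > 0 else 0
            # Feedback: 0 when the decision was right, +1 or -1 when it was wrong.
            error = target - prediction
            # Adjust the weights and bias so the same mistake is less likely next time.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, inputs):
    """Return the neuron's yes/no decision (1 or 0) for one input."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


# Hypothetical toy dataset: the logical OR of two inputs.
or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_perceptron(or_samples)
predictions = [predict(weights, bias, inputs) for inputs, _ in or_samples]
print(predictions)
```

Nothing in the code states the OR rule explicitly; the neuron arrives at it only because each wrong decision nudges its weights.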
ML helps to generate insights, make better decisions, and develop predictions. Because computers outperform humans at counting, calculating, following logical yes/no algorithms, and finding patterns, Machine Learning is recommended for complex tasks or problems that involve a large amount of data and many variables but have no existing formula or equation.
Machine Learning nowadays has become relevant in these areas:
- Automotive: self-driving cars, motion and object detection, predictive maintenance...
- Data security: cloud access patterns, anomaly identification, security breach prediction...
- Personal Security: ID processing, security screenings, face recognition...
- Finance: stock market predictions, pricing and load forecasting, credit scoring, fraud detection...
- Healthcare: tumor detection, drug discovery, DNA sequencing, human genome mapping...
- Natural Language Processing (NLP): speech recognition, translation...
- Marketing: profile personalization, recommendations, online search...
Machine Learning uses two types of techniques:
1. Supervised Learning: It makes predictions based on evidence in the presence of uncertainty. Supervised Learning takes a known set of input data and the known responses (outputs) to that data, and trains a model to generate reasonable predictions for the response to new data. The inner relations of the processed data can be uncertain, but the expected output for each training input is known.
Supervised Learning uses these two techniques to develop predictive models:
- Classification: it predicts discrete responses. Its use is recommended when the data can be tagged, categorized, or separated into specific groups or classes.
Common Classification algorithms are support vector machine (SVM), boosted and bagged decision trees, k-nearest neighbor, Naïve Bayes, discriminant analysis, logistic regression, and neural networks.
- Regression: it predicts continuous responses. It can be used when working with a data range or when the response is a real number.
Common Regression algorithms are linear/nonlinear models, regularization, stepwise regression, boosted and bagged decision trees, neural networks, and adaptive neuro-fuzzy learning.
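To make the difference between the two supervised techniques concrete, the sketch below implements one simple algorithm of each kind in plain Python: k-nearest neighbor for classification and a least-squares straight line for regression. The function names and the small datasets are hypothetical and chosen only for illustration; real projects would normally use a tested library implementation.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Classification: predict a discrete label by majority vote of the k nearest points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled data: (features, known class label).
labeled_points = [
    ((1.0, 1.0), "small"), ((1.5, 2.0), "small"), ((2.0, 1.5), "small"),
    ((6.0, 6.5), "large"), ((7.0, 6.0), "large"), ((6.5, 7.5), "large"),
]
label_a = knn_classify(labeled_points, (1.8, 1.2))
label_b = knn_classify(labeled_points, (6.8, 6.9))
print(label_a, label_b)

# Hypothetical continuous data that follows y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
predicted = slope * 5.0 + intercept  # response predicted for the unseen input x = 5
print(predicted)
```

In both cases the model is trained on inputs with known responses; the classifier returns one of a fixed set of labels, while the regressor returns a real number.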
2. Unsupervised Learning: It finds intrinsic structures or hidden patterns in data. Unsupervised Learning is used to draw inferences from datasets consisting of input data without labeled responses. Even the result can be unknown: there may be relations within the data that are too complex to guess in advance (once the data is normalized, the algorithm itself may suggest ways to categorize it).
Clustering is the most common unsupervised learning technique. It is used for exploratory data analysis to find hidden patterns or data groupings.
Common Clustering algorithms are k-means and k-medoids, hierarchical clustering, Gaussian mixture models, hidden Markov models, self-organizing maps, fuzzy c-means clustering, and subtractive clustering.
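As an illustration of clustering, here is a minimal k-means sketch in plain Python. It assumes the number of clusters k is known in advance and, to keep the result reproducible, seeds the centroids with the first k points; practical implementations usually choose the starting centroids more carefully. The function name and the sample points are hypothetical.

```python
import math


def k_means(points, k, iterations=10):
    """Group unlabeled points into k clusters by alternating two steps."""
    # Assumption for this sketch: the first k points serve as initial centroids.
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return centroids, assignments


# Hypothetical unlabeled data: two loose groups, with no labels provided.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
centroids, assignments = k_means(points, k=2)
print(assignments)
print(centroids)
```

No responses are supplied here; the grouping comes only from the distances between the points themselves.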
The most common challenges related to Machine Learning are associated with:
- Data management: data can be incomplete, come in multiple formats, or have different shapes and sizes. Different types of data require different approaches and specialized knowledge and tools.
- Data model: it has to fit the data. Highly flexible models tend to overfit by modeling minor variations that may be just noise, while overly simple models might assume too much.
Dozens of supervised and unsupervised ML algorithms are available, each with a different approach to learning; there is no such thing as a "best method". Finding the right model to fit the data takes time, as there are always tradeoffs between model speed, accuracy, and complexity. Machine Learning implies constant iteration and trying out of ideas and approaches; selecting the appropriate algorithm is partly trial and error.
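The tradeoff between flexible and simple models can be seen on a small scale in the sketch below. It compares a very flexible model (1-nearest neighbor, which memorizes the training data) with a simple one (a least-squares straight line) on hypothetical data: the training responses follow y = 2x plus alternating noise of +1 and -1, and the held-out responses follow y = 2x without noise. All names and values are assumptions chosen for illustration.

```python
def mean_squared_error(predict, xs, ys):
    """Average squared difference between predictions and known responses."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_nearest_neighbor(train_xs, train_ys):
    """Flexible model: answer with the response of the closest training input."""
    def predict(x):
        index = min(range(len(train_xs)), key=lambda i: abs(train_xs[i] - x))
        return train_ys[index]
    return predict


def make_line(train_xs, train_ys):
    """Simple model: least-squares straight line through the training data."""
    n = len(train_xs)
    mean_x = sum(train_xs) / n
    mean_y = sum(train_ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(train_xs, train_ys))
    sxx = sum((x - mean_x) ** 2 for x in train_xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# Hypothetical training data: y = 2x with noise of +1, -1, +1, ...
train_xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_ys = [1.0, 1.0, 5.0, 5.0, 9.0, 9.0]
# Hypothetical held-out data: y = 2x without noise.
test_xs = [0.4, 1.4, 2.4, 3.4, 4.4]
test_ys = [0.8, 2.8, 4.8, 6.8, 8.8]

nearest = make_nearest_neighbor(train_xs, train_ys)
line = make_line(train_xs, train_ys)

nn_train = mean_squared_error(nearest, train_xs, train_ys)
nn_test = mean_squared_error(nearest, test_xs, test_ys)
line_train = mean_squared_error(line, train_xs, train_ys)
line_test = mean_squared_error(line, test_xs, test_ys)
print("1-nearest neighbor: train", nn_train, "test", nn_test)
print("straight line:      train", line_train, "test", line_test)
```

On this data the memorizing model makes no errors on the training set but does worse than the straight line on the held-out points, which is why candidate models are usually compared on data they were not trained on.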
Selecting a Machine Learning algorithm depends on:
- The type of data to work with.
- The expected insights at the output.
- The application of those insights.
Machine Learning requires specific tools to ease the handling and analysis of the extremely large amounts of data (usually known as Big Data) needed to reveal patterns, trends, and associations. One of the most common and robust tools for applying ML to data analytics is MATLAB: it can manage and use big data, and it offers useful prebuilt functions, extensive toolboxes, and specialized apps for supervised learning (classification and regression) and unsupervised learning (clustering). For an introduction, see their "Machine Learning Made Easy" material.
It is evident that Machine Learning is impacting several industries and helping progress across many fields. It is proving itself to be a transformative technology with wide potential for future applications, including using the Internet of Things (IoT) to perceive, understand, and smartly react to the world, and to learn from that experience.