AutoML Current Uses and Approaches

24 Jul 2022

01 What is AutoML and how does it work?

02 Why is automated machine learning important?

03 What are approaches to AutoML?

04 AutoML Challenges and pitfalls

05 Will AutoML replace data scientists?

06 The future of AutoML

Machine learning (ML) has made significant strides in recent years, and an ever-increasing number of disciplines rely on it. However, this success depends critically on human specialists who build and train the models for each task. One of the biggest challenges for organizations today is the shortage of ML expertise among employees (despite the emergence of so-called citizen data scientists).

For an ML solution to work smoothly, users of ML tools have to make several choices: how the data should be processed, which features should be used, which algorithms should be chosen, how models should be tuned and refined, and how models should be deployed. This can be overwhelming for a beginner.

Even a novice with strong analytical skills can find the whole process of machine learning intimidating, and businesses can end up with inefficient and, in the worst case, incorrect models. This brings us to the concept of automated machine learning (AutoML).

What is AutoML and how does it work?

The term AutoML has several definitions, but one of the simplest is this: AutoML is the process of automating the tasks involved in developing a machine learning model. It allows analysts and developers to create machine learning models with high efficiency and productivity while maintaining model quality.

Without AutoML, ML experts must manually perform every step in the data pipeline (for example, data preprocessing, feature engineering, and hyperparameter optimization). AutoML simplifies development: a few lines of code are enough to generate what is needed to start developing the ML model. Moreover, AutoML opens artificial intelligence (AI) development to people who do not currently have the background required for data science roles.

Automating ML processes opens the door for companies with limited resources to fully invest in AI. While there is still much to be done to fully automate ML processes, companies are already creating promising tools for further development in this area.

AutoML searches for the type of ML algorithm best suited to a given task. Two concepts make this possible:

Neural architecture search that automates the design of neural networks. This helps AutoML models discover new architectures for problems that need them.
Transfer learning, in which pre-trained models apply learned knowledge to new datasets. Transfer learning helps AutoML apply existing architectures to new tasks that require it.

Why is automated machine learning important?

The current process of building a machine learning model usually requires highly skilled technical experts, a long development process, a lot of money, and many iterations. In comparison, the AutoML approach provides a lot of benefits.

Bridging the skills gap. With AutoML, ML becomes accessible to non-specialists. As a result, companies do not need to hire employees for highly specialized positions, which accelerates the implementation of innovations and, ultimately, ML adoption.

Fast market entry. Automating aspects of the ML pipeline reduces the time it takes to build models, which provides a significant competitive advantage. Companies that have never used AI before will find it easier to enter the market and create a successful solution.

Savings on creating models. Building ML models from scratch takes a lot of time and a lot of money. AutoML tools are far more affordable than the investment in skill and effort required to build a model from scratch.

Creating better models. AutoML iterates over models and hyperparameters faster than a person can manually. This speeds up model exploration and makes the choice of a model more efficient.

AutoML also helps both non-specialists and experts work more efficiently:

For non-specialists who may have a good business understanding but limited knowledge of ML, using AutoML tools can help reduce technical barriers by guiding them through the ML process and opening up new possibilities.
For experts, AutoML tools automate parts of the ML process. This gives them more time to focus on other areas, such as interpreting the output of the ML process, exchanging ideas, and doing more complex analytical work.

What are approaches to AutoML?

Today, there are several ways to define automation in machine learning. One of them is the AutoML classification by levels, similar to the classification of autonomous vehicles.

Level 0: No automation. Data scientists create algorithms from scratch.
Level 1: Use of high-level APIs.
Level 2: Automatic hyperparameter tuning and model selection.
Level 3: Automatic feature engineering, feature selection, and data augmentation.
Level 4: Automatic domain and problem-specific feature engineering, data augmentation, and data integration.
Level 5: Full automation. No input or guidance is required to solve ML problems.

Most companies implementing AutoML into their workflow fall into levels 1-2. However, there are now level 3 solutions on the market. Within these levels of automation, there are several approaches to AutoML that are worth highlighting.

Model selection and ensembling

AutoML can choose the model that works best. To do this, it iterates through various algorithms trained on the same input data. For the best result, it can even combine multiple models into one, which is often done through ensembling techniques such as blending and stacking.
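The idea of trying several algorithms on the same data and then combining the strongest ones can be sketched in plain Python. This is a minimal illustration, not how any particular AutoML product works: the synthetic data, the three toy models, and every function name below are assumptions made for the example, and the "blend" is a simple average of the two best candidates.

```python
import random
import statistics

def make_data(n, seed):
    """Synthetic 1-D regression data (assumed for illustration): y = 3x + 2 + noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3 * x + 2 + rng.gauss(0, 1) for x in xs]
    return xs, ys

def fit_mean(xs, ys):
    """Baseline model: always predict the mean of the training targets."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y

def fit_linear(xs, ys):
    """Ordinary least squares with a single feature."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_knn(xs, ys, k=5):
    """k-nearest-neighbours regression: average the targets of the k closest points."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return statistics.fmean(y for _, y in nearest)
    return predict

def mse(model, xs, ys):
    """Mean squared error of a fitted model on a dataset."""
    return statistics.fmean((model(x) - y) ** 2 for x, y in zip(xs, ys))

train_x, train_y = make_data(200, seed=0)
valid_x, valid_y = make_data(100, seed=1)

# Model selection: train every candidate on the same data, score on held-out data.
candidates = {"mean": fit_mean, "linear": fit_linear, "knn": fit_knn}
fitted = {name: fit(train_x, train_y) for name, fit in candidates.items()}
scores = {name: mse(model, valid_x, valid_y) for name, model in fitted.items()}
best_name = min(scores, key=scores.get)

# Blending: average the predictions of the two best-scoring candidates.
top_two = sorted(scores, key=scores.get)[:2]

def blended(x):
    return statistics.fmean(fitted[name](x) for name in top_two)

blend_score = mse(blended, valid_x, valid_y)
print(best_name, scores, blend_score)
```

Real tools typically use cross-validation rather than a single validation split, and stacking trains a second-level model on the candidates' predictions instead of averaging them.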

Hyperparameter Optimization (HPO)

All ML algorithms have parameters that are learned during training. A hyperparameter, by contrast, is an adjustable value set before training to control the learning process. HPO refers to the tuning of hyperparameters to improve model performance. AutoML tools can automatically evaluate various hyperparameter combinations to determine the set that yields the best-performing model.
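The difference between learned parameters and hyperparameters, and the way a tool might search over the latter, can be shown with a small random search. This is a sketch under stated assumptions: the data is synthetic, the model is a one-feature linear regression trained by gradient descent, and the search ranges and function names are chosen only for illustration. Random search is just one of several HPO strategies.

```python
import math
import random

def make_data(n, seed):
    """Synthetic data (assumed for illustration): y = 4x - 1 + small noise, x in [0, 1]."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 1) for _ in range(n)]
    ys = [4 * x - 1 + rng.gauss(0, 0.1) for x in xs]
    return xs, ys

def train(xs, ys, learning_rate, epochs):
    """Learn slope and intercept (the parameters) by batch gradient descent.

    learning_rate and epochs are hyperparameters: they are fixed before
    training starts and control how the learning proceeds.
    """
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        grad_slope = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_intercept = 2 * sum(errors) / n
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept

def mse(params, xs, ys):
    """Mean squared error of the fitted line on a dataset."""
    slope, intercept = params
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def random_search(train_set, valid_set, n_trials=20, seed=0):
    """Sample hyperparameter configurations at random and keep the best one."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        config = {
            # Log-uniform sample between 0.001 and 0.5.
            "learning_rate": 10 ** rng.uniform(-3, math.log10(0.5)),
            "epochs": rng.randint(10, 200),
        }
        params = train(*train_set, **config)
        score = mse(params, *valid_set)
        if best is None or score < best["score"]:
            best = {"config": config, "params": params, "score": score}
    return best

train_set = make_data(100, seed=0)
valid_set = make_data(50, seed=1)

best = random_search(train_set, valid_set)
# Reference point: a deliberately weak, hand-picked configuration.
baseline = mse(train(*train_set, learning_rate=0.001, epochs=10), *valid_set)
print(best["config"], best["score"], baseline)
```

The learning rate is sampled on a log scale because useful values often span several orders of magnitude; the upper bound of 0.5 is chosen so that gradient descent stays stable on this particular toy problem.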

Feature Engineering

Feature engineering is the construction of new input features (or independent variables) from existing input data. It affects model performance by highlighting the important elements that the model needs when making predictions. With AutoML tools, this process can be done automatically: the tools extract relevant and meaningful features from a given set of inputs and test different combinations of features to create the best-performing model.
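A toy version of automated feature construction and selection might look like the following. It is only a sketch: the `width` and `height` columns, the target that depends on their product, and the choice of Pearson correlation as the ranking score are all assumptions made for the example, and production tools use far richer transformations and selection criteria.

```python
import itertools
import random

def pearson(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

# Hypothetical raw data: two measured columns and a target driven by their product.
rng = random.Random(0)
rows = [{"width": rng.uniform(1, 10), "height": rng.uniform(1, 10)} for _ in range(200)]
target = [row["width"] * row["height"] + rng.gauss(0, 1) for row in rows]

def generate_features(rows):
    """Build candidate features: raw columns, their squares, and pairwise products."""
    names = list(rows[0])
    features = {name: [row[name] for row in rows] for name in names}
    for name in names:
        features[f"{name}^2"] = [row[name] ** 2 for row in rows]
    for a, b in itertools.combinations(names, 2):
        features[f"{a}*{b}"] = [row[a] * row[b] for row in rows]
    return features

features = generate_features(rows)

# Rank candidates by the strength of their linear relationship with the target.
ranking = sorted(
    features,
    key=lambda name: abs(pearson(features[name], target)),
    reverse=True,
)
best_feature = ranking[0]
print(ranking)
```

Here the constructed product feature ranks first because the target was generated from it, which is exactly the kind of relationship that the raw columns alone only partly capture.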

AutoML Current Uses

Several off-the-shelf packages have been developed in recent years to provide automated machine learning. This section highlights some of the best known.

H2O AutoML allows non-specialists to apply ML to detect financial fraud. Detecting fraudulent behavior and suspicious activity requires addressing many issues at the transaction, account, and network levels.

The Cardea system, developed by researchers at the Massachusetts Institute of Technology, brings together several ML tools for healthcare. It serves as a powerful reference for problem solvers in hospitals. Cardea is designed to work with Fast Healthcare Interoperability Resources (FHIR), a standard for exchanging electronic health records.

Cloud AutoML is based on neural network models. This Google product has a simple user interface for exploring and deploying models. However, the platform is paid, so it mainly makes sense for long-term use in commercial projects. On the other hand, a limited version of Cloud AutoML is available for free for research purposes throughout the year.

TransmogrifAI is an AutoML library for structured data, written in Scala, that runs on top of Apache Spark. It was designed to improve the productivity of ML developers through automation and an API that provides reusability, modularity, compile-time type safety, and transparency.

Another AutoML tool is AutoGluon. It needs only a few lines of Python code to train accurate models on raw tabular datasets such as CSV files. While other AutoML frameworks focus on hyperparameter selection, AutoGluon achieves its goal by ensembling multiple models and stacking them in multiple layers. With AutoGluon, you can also perform classification, detection, NLP tasks, etc.

AutoML Challenges and pitfalls

The main problem with AutoML is the temptation to see it as a replacement for human knowledge. Like most automation tools, AutoML is designed to efficiently perform routine tasks, allowing employees to focus on more complex or newer tasks. The things AutoML automates, such as monitoring, analysis, and problem detection, are mechanical tasks that run faster when automated. A human still has to take part in evaluating and controlling the model, but they no longer need to be involved in every step of the machine learning process. AutoML should help improve the efficiency of data scientists and employees, not replace them.

Another problem is that AutoML is a relatively new area, and not all levels of this technology are fully developed.

Will AutoML replace data scientists?

Although AutoML is developing at a fast pace, the technology is not yet able to understand what the information in the data means for an organization, its business, and its business context. Domain knowledge also cannot be automated at present. Therefore, although the demands on data scientists have changed significantly, AutoML is unlikely to replace them.

Even if AutoML could create any ML model, statistical models are not without flaws. This is where experts come in: they work out how to design a model so that it fits the task well. AutoML today is designed to speed up their work, let them try things quickly, and help them improve their results.

The future of AutoML

AutoML technologies still have difficulties processing complex raw data and optimizing the process of constructing new features (feature engineering). For this reason, the selection of significant features remains one of the cornerstones of the model learning process. The industry still has a long way to go before ever reaching Level 5, a fully automated solution.

However, today’s AutoML is a way to automate the end-to-end process of applying AI to real-world problems. Many of the intermediate steps can readily be automated, yielding a well-optimized model that is ready to make predictions.

As automation demands grow and tooling improves, ML adoption will likewise increase, because building machine learning models becomes more approachable and less resource-intensive. Postindustria’s team of ML engineers can help you drive business value by embedding automation and ML algorithms into any of your projects. Book a call with us and let’s discuss the details!