ML-TOSCA: ML pipeline modelling and orchestration using TOSCA

Date

2023

Publisher

Tartu Ülikool

Abstract

Machine learning is being applied in a growing range of fields. Automating machine learning workflows with AutoML lets organizations develop and deploy machine learning solutions rapidly and at scale. Cloud computing adds further scalability and flexibility, making it possible to process large datasets efficiently and to train and deploy complex machine learning models cost-effectively. These technologies are set to play an essential role across various industries. Despite these advantages, implementations that combine AutoML with cloud-based solutions are not yet widespread. This thesis describes a new approach to integrating AutoML into the TOSCA standard. TOSCA (Topology and Orchestration Specification for Cloud Applications) is an open standard for describing the topology of cloud applications and services. Incorporating AutoML techniques into TOSCA enables users to generate optimized machine learning models automatically with the help of cloud applications, which can improve the speed and efficiency of model creation. The proposed approach is implemented in the RADON ecosystem, which supports the creation of node and relationship types. The final solution allows users to create and connect blocks to define the structure of a complete machine learning pipeline.

Keywords

AutoML, TOSCA, ML-TOSCA, ML pipeline, Pipeline
