Processing hardware running the ML model

Processing hardware running the ML model: a physical or virtual hardware platform, i.e., the infrastructure executing the inference operations of the running trained model. Such a platform can consist of server clusters of CPUs, GPUs, or specialized accelerators. GPUs have become a cornerstone of machine learning processing, as their parallel architecture accelerates the matrix multiplications and linear algebra calculations that underpin machine learning algorithms. Because LLMs require substantial computational resources, they are often trained on cloud-based servers fitted with GPU clusters. There may be many instances ("*") of the processing hardware asset dedicated to the processing of one machine learning model.
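As a minimal illustrative sketch (not part of the asset definition above), the following Python/PyTorch snippet shows how the same inference computation is dispatched to a GPU when one is available and otherwise falls back to a CPU; the toy single-layer "model" is an assumption standing in for a full trained network.

```python
# Sketch: selecting the processing hardware that runs the trained model's inference.
import torch

# A toy "trained model": one linear layer standing in for a full network (assumption).
model = torch.nn.Linear(1024, 1024)

# Choose the processing hardware: a GPU if present, otherwise the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device).eval()

# Inference: the underlying matrix multiplication executes on the chosen hardware.
with torch.no_grad():
    x = torch.randn(64, 1024, device=device)  # a batch of input vectors
    y = model(x)                              # forward pass (matmul + bias)

print(y.shape, y.device)
```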

The system asset has the following defined methods:

  1. executeMachineLearningModelOperations() – a method depicting the process of executing the machine learning software operations for the purpose of inference.
  2. operateWithModelOperationalData() – based on the inference operations and the conducted calculations, set up, update, and utilize the model’s temporary operational data.
  3. storeOperationalData() – store the set-up model’s operational data in RAM to enable further required inference operations (see the sketch after this list).
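The sketch below is a hypothetical Python rendering of the three methods as a class, intended only to illustrate how they relate; the internal details (NumPy arrays standing in for the trained model and its operational data, and a dictionary standing in for RAM-resident state) are assumptions for illustration, not part of the source model.

```python
# Hypothetical sketch of the processing-hardware asset and its three defined methods.
import numpy as np

class ProcessingHardware:
    def __init__(self, model_weights: np.ndarray):
        self.model_weights = model_weights                   # the deployed trained model
        self.operational_data: dict[str, np.ndarray] = {}    # temporary state kept in RAM

    def executeMachineLearningModelOperations(self, inputs: np.ndarray) -> np.ndarray:
        """Execute the inference operations of the running trained model."""
        outputs = inputs @ self.model_weights                 # stand-in for the full forward pass
        self.operateWithModelOperationalData(outputs)
        return outputs

    def operateWithModelOperationalData(self, activations: np.ndarray) -> None:
        """Set up / update the model's temporary operational data from the calculations."""
        self.storeOperationalData("last_activations", activations)

    def storeOperationalData(self, key: str, value: np.ndarray) -> None:
        """Keep operational data in RAM so further inference operations can reuse it."""
        self.operational_data[key] = value

# Usage: one hardware instance dedicated to one machine learning model.
hw = ProcessingHardware(model_weights=np.random.rand(8, 4))
result = hw.executeMachineLearningModelOperations(np.random.rand(2, 8))
```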

The association between the system asset and the machine learning model operates on the following business asset: