Relationship between attacks and the input data embeddings asset

General Image Description

The UML class diagram visualizes a threat model with one threat, identified in the systematic literature review, that targets the input data embeddings for the initial compromise. The input data embeddings are compromised through the “Supporting IT infrastructure”.

Input data embeddings: numerical vector representations of the input textual data, created during either training or inference. This representation allows machine learning models, including LLMs, to perform defined calculations on the data, determine semantic relationships, and compute an inference. These embeddings can be stored in the supporting IT infrastructure for later re-use. The stored embeddings may be used at later stages for model fine-tuning or to enrich the context of new input queries.
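As a rough illustration of the definition above, the following Python sketch embeds short texts, stores the resulting vectors, and retrieves the stored text that is semantically closest to a new query. The word vectors, the averaging scheme, and the names WORD_VECTORS, embed, stored, and most_similar are hypothetical and chosen only for illustration; real embedding models learn high-dimensional vectors and are far more complex.

```python
import math

# Hypothetical 3-dimensional word vectors; a real model would learn these.
WORD_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.0],
    "pet": [0.7, 0.3, 0.1],
    "stock": [0.0, 0.1, 0.9],
    "market": [0.1, 0.0, 0.8],
    "price": [0.1, 0.2, 0.7],
}


def embed(text):
    """Average the vectors of known words (toy stand-in for an embedding model)."""
    vectors = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is zero."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


# Embeddings kept in storage for later re-use (here: an in-memory dict).
stored = {text: embed(text) for text in ["cat dog pet", "stock market price"]}


def most_similar(query):
    """Return the stored text whose embedding is closest to the query embedding."""
    query_vector = embed(query)
    return max(stored, key=lambda text: cosine_similarity(query_vector, stored[text]))
```

In this sketch, a query such as "dog" retrieves the stored text about pets, which is the kind of lookup used to enrich the context of a new input query.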

List of threats

  1. [IDE.T.1] An Embedding Inversion Attack exploits vulnerabilities to invert embeddings and recover significant amounts of source information, compromising data confidentiality.
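
The idea behind [IDE.T.1] can be sketched with a deliberately simplified example. It assumes a toy bag-of-words embedding (the sum of hypothetical word vectors) and an attacker who has obtained a stored embedding and can call the same embedding function. The attacker tries every word subset and keeps the one whose embedding is closest to the leaked vector. Published attacks generally rely on trained inversion models rather than brute force, so this is an illustration of the principle only; the names WORD_VECTORS, embed, and invert are hypothetical.

```python
import itertools

# Hypothetical toy vocabulary; real embedding models are far larger and non-linear.
WORD_VECTORS = {
    "alice": [1.0, 0.2, 0.0, 0.1],
    "password": [0.1, 1.0, 0.3, 0.0],
    "invoice": [0.0, 0.1, 1.0, 0.2],
    "hello": [0.2, 0.0, 0.1, 1.0],
}


def embed(words):
    """Toy embedding: component-wise sum of the word vectors (bag of words)."""
    total = [0.0, 0.0, 0.0, 0.0]
    for word in words:
        total = [t + v for t, v in zip(total, WORD_VECTORS[word])]
    return total


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def invert(target):
    """Attacker: try every word subset and keep the one whose embedding is closest."""
    vocabulary = sorted(WORD_VECTORS)
    candidates = (
        combo
        for size in range(len(vocabulary) + 1)
        for combo in itertools.combinations(vocabulary, size)
    )
    best = min(candidates, key=lambda combo: squared_distance(embed(combo), target))
    return set(best)


# The attacker sees only this vector, e.g. read from the supporting IT infrastructure.
leaked = embed(["password", "alice"])
recovered = invert(leaked)
```

Here the recovered set contains the words of the original input, although their order is lost because the toy embedding ignores word order. The point is that a stored embedding can reveal its source text to anyone who can read it.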