Definition
[MP.T.2] A model extraction attack that uses side-channel methods exploits hardware implementation vulnerabilities to determine the target machine learning model’s architecture and parameter values.
Targeted assets
System Asset: processing hardware running the ML model.
Business Asset: model's parameters.
Security Criteria: confidentiality.
Attack details
Exploited vulnerabilities
Vulnerabilities:
- The state of the machine learning model's operational data can be inferred from measurable attributes of the hardware running the model, such as power consumption, electromagnetic (EM) emissions, and memory access patterns.
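The vulnerability above can be illustrated with a toy simulation. This is a minimal sketch, not a real attack tool. It assumes a Hamming-weight leakage model (the measured power is proportional to the number of set bits in an intermediate value plus Gaussian noise), an 8-bit quantized weight, and attacker-known inputs. The names `SECRET_WEIGHT`, `simulate_traces`, and `recover_weight` are hypothetical and chosen only for this example.

```python
import random


def hamming_weight(value: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(value).count("1")


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def simulate_traces(secret_weight, inputs, noise_sigma, rng):
    """Assumed leakage model: power ~ HW(low byte of weight * input) + noise."""
    return [
        hamming_weight((secret_weight * x) & 0xFF) + rng.gauss(0.0, noise_sigma)
        for x in inputs
    ]


def recover_weight(inputs, traces):
    """Try every candidate weight and keep the one whose predicted leakage
    correlates best with the measured traces."""
    best_guess, best_corr = None, -2.0
    for guess in range(1, 256):
        predicted = [hamming_weight((guess * x) & 0xFF) for x in inputs]
        corr = pearson(predicted, traces)
        if corr > best_corr:
            best_guess, best_corr = guess, corr
    return best_guess


rng = random.Random(0)          # fixed seed so the sketch is reproducible
SECRET_WEIGHT = 173             # hypothetical 8-bit model parameter
inputs = [rng.randrange(256) for _ in range(2000)]
traces = simulate_traces(SECRET_WEIGHT, inputs, noise_sigma=0.5, rng=rng)
recovered = recover_weight(inputs, traces)
print("recovered weight matches secret:", recovered == SECRET_WEIGHT)
```

In this simplified setting the attacker never reads the weight directly; only the simulated power measurements and the known inputs are used. Real hardware leaks in more complex ways, so the sketch only shows why measurable physical attributes can reveal parameter values.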
Threat agent
Threat agent: gray-box scenario. The threat agent is assumed to have partial knowledge of the target model. This knowledge could include the type of learning algorithm used or the feature set, or the agent may have access to a surrogate dataset with characteristics similar to the original training data.
Attack methods
Attack methods:
- A malicious fault is introduced into the operation of the hardware that processes the target machine learning model's computations.
Impact and harm
Impact and harm: Negates the confidentiality of the targeted machine learning model, which may lead to intellectual property theft.
Security countermeasures
Security requirements
Security requirement: The machine learning system must be resistant to model extraction attacks.
Security controls
Security controls:
- Differential privacy: addition of noise so that the outputs deviate from their original values.
- Secure multi-party computation: joint computations are conducted within a confidential environment.
- Homomorphic encryption: computations are performed directly on encrypted data, without revealing the original data.
- Adversarial machine learning: incorporation of data about adversarial techniques into the model's training process.
- Watermarking techniques: embedding of watermarks into the model's parameters, algorithmic analysis of the model that exploits its over-parametrization, or entanglement between the watermark and training data features.
- Vulnerability detection: risk assessment methods for machine learning models and evaluation of the models before release.
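As an illustration of the noise-addition control listed above, the following is a minimal sketch of the Laplace mechanism applied to a model's output scores. The `sensitivity` and `epsilon` values are assumptions chosen for the example, and the function names `laplace_noise` and `perturb_scores` are hypothetical. Whether such perturbation gives a formal differential privacy guarantee depends on a sensitivity analysis of the actual system, which this sketch does not perform.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise as the difference of two
    independent exponential variables with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_scores(scores, sensitivity: float, epsilon: float, rng: random.Random):
    """Add Laplace noise with scale sensitivity / epsilon to each score.
    A smaller epsilon means more noise and less accurate outputs."""
    scale = sensitivity / epsilon
    return [score + laplace_noise(scale, rng) for score in scores]


rng = random.Random(42)                 # fixed seed for reproducibility
scores = [0.70, 0.20, 0.10]             # hypothetical model confidence scores
noisy = perturb_scores(scores, sensitivity=1.0, epsilon=5.0, rng=rng)
print("original:", scores)
print("perturbed:", [round(value, 3) for value in noisy])
```

The design trade-off is between confidentiality and utility: stronger noise makes the released outputs less informative to an attacker but also less accurate for legitimate users.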