Property inference attack

Definition

[IT.T.13] A property inference attack is a type of privacy attack that aims to infer confidential information or attributes about the dataset used to train a machine learning model. The attacker attempts to determine properties or characteristics of the training data that the model provider does not want to reveal. The attack does not manipulate the model directly; it extracts private information without disrupting the model's normal training process.

Targeted assets

System Asset: ML system input/API.

Business Asset: input data.

Security Criteria: confidentiality.

Attack details

Exploited vulnerabilities

Vulnerabilities:

  1. The model tends to memorize its training data when it is over-parameterized, making it possible to learn additional information about the training data from the target model's output.

Threat agent

Threat agent: operates in white-box or black-box scenarios. In the white-box scenario, the attacker is assumed to have complete knowledge of the target machine learning model, including its architecture, parameters, training data, and learning algorithm. In the black-box scenario, the attacker has no knowledge of the target model's architecture, parameters, or training data, and is assumed to be able to interact with the model only by sending it inputs and observing the outputs.
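The black-box scenario above can be sketched as a simple query loop. In this sketch, `query_model` is a hypothetical stand-in for the victim's prediction API, and its hidden weights are illustrative assumptions, not a real system:

```python
# Hypothetical black-box interaction: the attacker can only send inputs
# and observe outputs; query_model simulates the remote victim model.
import numpy as np

rng = np.random.default_rng(1)

def query_model(x):
    """Stand-in for the remote model API: returns only a confidence score."""
    w = np.array([1.0, -2.0, 0.5])          # hidden parameters, unknown to the attacker
    return 1.0 / (1.0 + np.exp(-x @ w))

# The attacker probes the model with chosen inputs and records the outputs;
# the collected (input, output) pairs later feed the inference step.
probes = rng.normal(size=(64, 3))
observations = [(x, query_model(x)) for x in probes]
print(len(observations), "query/response pairs collected")
```

The attacker never touches the model's parameters here; everything the attack learns must come from the recorded query/response pairs.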

Attack methods

Attack methods:

  1. The target model's outputs and parameters are analyzed to determine properties of the training dataset.
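One common instantiation of this method is a meta-classifier attack: the attacker trains "shadow models" on synthetic datasets with and without a confidential property, then trains a meta-classifier to read the property off a model's parameters. The sketch below is illustrative; the synthetic data, the chosen property (an over-represented positive class), and the tiny logistic-regression models are all assumptions:

```python
# Sketch of a white-box property inference attack via a meta-classifier.
import numpy as np

rng = np.random.default_rng(0)

def train_linear_model(X, y, steps=200, lr=0.1):
    """Train a tiny logistic-regression model with gradient descent; return its weights."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def make_dataset(has_property, n=200):
    """Synthetic training set; the confidential property is an
    over-represented positive class (e.g. a sensitive group)."""
    pos_rate = 0.8 if has_property else 0.5
    y = (rng.random(n) < pos_rate).astype(float)
    X = rng.normal(loc=y[:, None] * 2 - 1, size=(n, 2))
    return np.hstack([X, np.ones((n, 1))]), y   # last column acts as a bias term

# 1. The attacker trains shadow models on data with and without the property.
feats, labels = [], []
for _ in range(40):
    has_prop = rng.random() < 0.5
    feats.append(train_linear_model(*make_dataset(has_prop)))  # weights = meta-features
    labels.append(float(has_prop))
feats, labels = np.array(feats), np.array(labels)

# 2. A meta-classifier learns to read the property off model parameters.
meta_w = train_linear_model(feats, labels)

# 3. Applied to the victim model's parameters, it infers the secret property.
victim_w = train_linear_model(*make_dataset(has_property=True))
score = 1.0 / (1.0 + np.exp(-victim_w @ meta_w))
print(f"inferred probability that the property is present: {score:.2f}")
```

The key observation is that the property leaves a trace in the victim's parameters (here, the bias weight), even though the model was never trained to reveal it.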

Impact and harm

Impact and harm: Compromises the confidentiality of the targeted machine learning model's training data. This may lead to private data leakage and possible legal repercussions.

Security countermeasures

Security requirements

Security requirement: The machine learning system must be resistant to malicious property inference attacks.

Security controls

Security controls:

  1. Differential privacy: noise is added so that outputs deviate from the original values.
  2. Secure multi-party computation: joint computations are conducted within a confidential environment.
  3. Homomorphic encryption: calculations are conducted through confidential means, allowing operations on encrypted data without revealing the original data.
  4. Adversarial machine learning: data about adversarial techniques is incorporated into the model's training process.
  5. Vulnerability detection: a risk assessment method for machine learning models; models are evaluated before release.
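As a minimal illustration of control 1, the Laplace mechanism adds noise calibrated to a query's sensitivity before a statistic is released. The epsilon value and the bounded-mean query below are illustrative assumptions, not recommended settings:

```python
# Minimal sketch of output perturbation with the Laplace mechanism.
import numpy as np

rng = np.random.default_rng(42)

def laplace_mechanism(true_value, sensitivity, epsilon):
    """Return a differentially private answer: the true value plus
    Laplace noise with scale sensitivity / epsilon."""
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Example: privately release the mean of a feature bounded in [0, 1].
data = rng.random(1000)
true_mean = data.mean()
# Changing one record shifts the mean by at most 1/n, so sensitivity = 1/n.
private_mean = laplace_mechanism(true_mean, sensitivity=1.0 / len(data), epsilon=0.5)
print(true_mean, private_mean)
```

Smaller epsilon means more noise and stronger privacy; the released value no longer exactly reflects any individual record, which limits what a property inference attack can recover.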