Hardware Trojan attack

Definition

[MOD.T.1] A Hardware Trojan (HT) attack is an attack in which malicious circuitry is covertly inserted into the hardware that implements a machine learning model. Once triggered, this hidden circuitry can manipulate the model's behavior, leading to misclassification, data leakage, or other security breaches. Unlike software-based attacks, HT attacks target the hardware directly to alter the model's parameters and computation results.

Targeted assets

System Asset: processing hardware running the ML model.

Business Asset: model's operational data.

Security Criteria: confidentiality, integrity, availability.

Attack details

Exploited vulnerabilities

Vulnerabilities:

  1. The state of a machine learning model's operational data can be influenced through a hardware trojan inserted into the integrated circuits that execute the model.

Threat agent

Threat agent: black-box scenario. In a black-box scenario, the attacker has no knowledge of the target model's architecture, parameters, or training data. The attacker is assumed to be able to interact with the model only by sending it inputs and observing the outputs.

Attack methods

Attack methods:

  1. A hardware trojan is embedded into an integrated circuit. Based on a trigger condition, the embedded trojan alters the expected chain of operations during machine learning execution, as illustrated in the conceptual sketch below.
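
A hardware trojan itself lives in silicon, but its trigger-and-payload logic can be illustrated in software. Below is a purely conceptual simulation, a minimal sketch assuming a hypothetical MAC unit whose payload fires only when a specific input bit pattern (the trigger) appears on the input bus; every name and value is illustrative, not taken from a real design.

    import numpy as np

    TRIGGER = 0b10110010  # hypothetical rare input pattern that arms the trojan

    def trojaned_mac(weights, inputs):
        """Simulate a MAC unit carrying a trojan: dormant until triggered."""
        acc = 0
        for w, x in zip(weights, inputs):
            if int(x) & 0xFF == TRIGGER:
                # Payload: silently corrupt the accumulator with a bit flip.
                acc ^= 1 << 16
            acc += int(w) * int(x)
        return acc

    # Dormant behavior: ordinary inputs yield the correct dot product,
    # which is why the trojan evades functional testing.
    w = np.array([1, 2, 3])
    x = np.array([4, 5, 6])
    assert trojaned_mac(w, x) == int(w @ x)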

Impact and harm

Impact and harm: Compromises the confidentiality, integrity and availability of the targeted machine learning model. This may lead to misclassification of benign and malicious inputs, degraded performance, or private data leakage.

Security countermeasures

Security requirements

Security requirement: The machine learning system must be resistant to malicious hardware trojan attacks.

Security controls

Security controls:

  1. MAC (Multiply-and-Accumulate) operation hardening, low-precision data representations, and bound-constrained dynamic range compression: limit error propagation and aggregation. Quantizing data to lower bit precision reduces the proportion of vulnerable parameters, and clipping activation outputs bounds the effect of any corrupted value (see the first sketch after this list).
  2. Hardware root-of-trust: safeguards DNN IP cores and private data. An obfuscation framework employs a key-dependent backpropagation algorithm to lock some neurons of the model; the on-chip key recovers the correct functionality of the model (a conceptual sketch follows the list).
  3. Binarization method: mimics bit-flip noise on the weights, thus increasing robustness against bit-flip attacks (sketched after this list).
  4. Piece-wise clustering method: adds a fixed single bit-width constraint during training, thus increasing robustness against bit-flip attacks.
  5. Weight reconstruction: averages errors over a group of weights together with their quantization and clipping, thus increasing robustness against bit-flip attacks.
  6. Defensive quantization: constrains the Lipschitz constant during training to limit mapping sensitivity.
  7. Hardware with Triple Modular Redundancy (TMR): three copies of the functional circuits are present; a majority vote corrects and masks faults in any single copy (see the majority-vote sketch after this list). Imposes higher energy and resource overhead.
  8. DNN (Deep Neural Network) accelerator: a design that tolerates SRAM read faults caused by voltage variations.
  9. Word masking and bit masking: round faulty bits to zero; the whole word is reset to zero, or only the flipped bits are reset to zero, respectively (sketched after this list).
  10. TE-Drop: an error-tolerant design for the MAC units, for example utilizing Razor flip-flop modules for active fault detection; a detected erroneous update is dropped (see the sketch after this list). Hardening of selected memory cells. Application of modular redundancy to sensitive weights.
  11. Trusted Inference Engine (TIE): a Pseudo-Random Number Generator (PRNG) and a PUF (Physically Unclonable Function) are utilized to decrypt the encrypted machine learning model stored in off-chip memory. A DNN accelerator with a memory encryption engine encrypts data in DRAM and also utilizes an Integrity Verification (IV) engine to detect unauthorized operations on data fetched from external memory (sketched after this list); it comes with low overhead, though this defense does not account for hardware side-channels. Possible direction: explainable AI; if inductive and deductive reasoning are incorporated together, this could reduce the frequency of logical fallacies.
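
Control 1 in software form: a minimal NumPy sketch of symmetric int8 weight quantization plus activation output clipping. The scale computation and the clip bound of 6.0 are common illustrative choices, not parameters prescribed by the cited defenses.

    import numpy as np

    def quantize_int8(w):
        """Symmetric int8 quantization: fewer, lower-significance bits to attack."""
        scale = np.max(np.abs(w)) / 127.0
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    def clipped_relu(x, bound=6.0):
        """Clipping activation outputs bounds how far a corrupted value propagates."""
        return np.minimum(np.maximum(x, 0.0), bound)

    w = np.random.randn(8).astype(np.float32)
    q, scale = quantize_int8(w)
    y = clipped_relu(q.astype(np.float32) * scale * 100.0)  # even a huge fault is capped at the bound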
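Control 2 can be pictured as key-dependent scrambling of selected weights: without the on-chip key the network misbehaves, and with it the original function is restored. The sketch below is a heavily simplified sign-flip analogy of that idea, not the key-dependent backpropagation algorithm itself; all names are assumptions.

    import numpy as np

    def lock_weights(w, key_bits):
        """Scramble the weights of locked neurons; the same call with the
        correct key_bits (an involution) restores the original weights."""
        return np.where(key_bits.astype(bool), -w, w)

    rng = np.random.default_rng(0)
    w = rng.standard_normal(4)
    key = rng.integers(0, 2, size=4)                    # on-chip secret
    locked = lock_weights(w, key)                       # shipped / stored form
    assert np.allclose(lock_weights(locked, key), w)    # key recovers functionality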
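Control 3 as code: binarizing weights leaves a single sign bit per weight, so any one bit flip perturbs a weight by at most 2·alpha. A minimal sketch, assuming per-tensor mean-absolute scaling (one common choice in the binarization literature):

    import numpy as np

    def binarize(w):
        """Map each weight to {-alpha, +alpha}, where alpha = mean(|w|)."""
        alpha = np.mean(np.abs(w))
        return alpha * np.where(w >= 0, 1.0, -1.0)

    w = np.array([0.3, -1.2, 0.05, -0.4])
    print(binarize(w))   # [ 0.4875 -0.4875  0.4875 -0.4875]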
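Control 7 reduces to a bitwise majority vote over three redundant results; any single faulty copy is out-voted. A minimal sketch:

    def tmr_vote(a, b, c):
        """Bitwise majority: a bit is set iff at least two copies agree on it."""
        return (a & b) | (a & c) | (b & c)

    golden = 0b1011
    faulty = golden ^ 0b0100                      # one copy suffers a single bit flip
    assert tmr_vote(golden, faulty, golden) == golden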
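Control 9, sketched on plain integers: word masking zeroes an entire word that fails its check, while bit masking zeroes only the bits flagged as flipped. The fault-detection signals themselves (parity, ECC, and so on) are assumed to come from elsewhere.

    def word_mask(word, word_faulty):
        """Word masking: reset the whole faulty word to zero."""
        return 0 if word_faulty else word

    def bit_mask(word, fault_bits):
        """Bit masking: reset only the flagged bits to zero."""
        return word & ~fault_bits

    assert word_mask(0b1011, word_faulty=True) == 0
    assert bit_mask(0b1011, fault_bits=0b0010) == 0b1001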
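Control 10 in spirit: when the active fault detector (for example, a Razor flip-flop) flags a MAC result as erroneous, TE-Drop discards that single update rather than stalling to correct it. A hedged software analogy, with the error flags assumed to come from the detector:

    def te_drop_accumulate(products, error_flags):
        """Accumulate MAC products, dropping any update flagged as erroneous."""
        acc = 0
        for p, err in zip(products, error_flags):
            if not err:
                acc += p
        return acc

    # One corrupted product out of four is simply skipped.
    print(te_drop_accumulate([4, 10, 999999, 18], [False, False, True, False]))  # 32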
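The Integrity Verification part of control 11 amounts to authenticating every block fetched from external memory. Below is a minimal sketch using Python's standard-library HMAC as a stand-in for the on-chip IV engine; the key handling and block layout are illustrative assumptions.

    import hmac, hashlib

    def tag_block(key: bytes, block: bytes) -> bytes:
        """Computed when the block is written out to (untrusted) DRAM."""
        return hmac.new(key, block, hashlib.sha256).digest()

    def verify_block(key: bytes, block: bytes, tag: bytes) -> bool:
        """Recomputed on fetch; a mismatch reveals unauthorized modification."""
        return hmac.compare_digest(hmac.new(key, block, hashlib.sha256).digest(), tag)

    key = b"on-chip-secret"                  # e.g. derived from a PUF response
    block = b"weights chunk 0"
    tag = tag_block(key, block)
    assert verify_block(key, block, tag)
    assert not verify_block(key, block + b"!", tag)   # tampering is detected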