Adversarial threat modeling for AI/ML systems

Running AI/ML systems in production exposes companies to a new class of attacks. The following attack types should be considered:

  • Model stealing (extraction): Adversaries can query a model to reconstruct a functional copy, which can then be used offline to craft evasion attacks or simply exploited as-is.
  • Model tricking (evasion): Adversaries can craft or perturb model queries to obtain a desired response.
  • Training data recovery (inversion): Private training data can be recovered from the model.
  • Model contamination causing targeted or indiscriminate failures (poisoning): Adversaries can corrupt training data so that the model misclassifies specific examples (targeted) or degrades across the board, making the system effectively unavailable (indiscriminate).
  • Attacking the ML supply chain: Adversaries can poison third-party data and models before they enter the pipeline.
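To make the extraction threat above concrete, the following is a minimal sketch of black-box model stealing. All names here (`victim_predict`, `extract_surrogate`, the secret weights) are hypothetical: the "victim" is a toy linear classifier the attacker can only query for labels, and the attacker fits a surrogate with the perceptron rule using nothing but those query responses.

```python
import random

random.seed(0)

# Hypothetical victim model: a secret linear classifier that the
# attacker can only query as a black box (label-only access).
SECRET_W = [2.0, -1.0]
SECRET_B = 0.5

def victim_predict(x):
    return 1 if SECRET_W[0] * x[0] + SECRET_W[1] * x[1] + SECRET_B > 0 else 0

def extract_surrogate(n_queries=500, epochs=20, lr=0.1):
    """Fit a surrogate model using only the victim's query responses."""
    # Attacker chooses inputs and harvests the victim's labels.
    X = [[random.uniform(-5, 5), random.uniform(-5, 5)] for _ in range(n_queries)]
    y = [victim_predict(x) for x in X]
    # Standard perceptron updates -- no access to the victim's weights
    # or training data is needed.
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if w[0] * xi[0] + w[1] * xi[1] + b > 0 else 0
            err = yi - pred
            w[0] += lr * err * xi[0]
            w[1] += lr * err * xi[1]
            b += lr * err
    return w, b

w, b = extract_surrogate()

# Measure how often the stolen surrogate agrees with the victim
# on fresh inputs it has never queried.
fresh = [[random.uniform(-5, 5), random.uniform(-5, 5)] for _ in range(1000)]
agreement = sum(
    victim_predict(x) == (1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0)
    for x in fresh
) / len(fresh)
print(f"surrogate/victim agreement: {agreement:.2%}")
```

The same query-and-fit pattern scales to real deployments: a prediction API that returns labels or confidence scores leaks enough signal for an adversary to train a usable copy, which is why rate limiting and query monitoring are common mitigations.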

For more information on these adversarial attacks and how to incorporate them into the threat modeling process, see the links below:

Failure Modes in ML | Microsoft Docs

Threat Modeling AI/ML Documentation | Microsoft Docs