IndirectAttackEvaluator Class

Note

This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.

Evaluates the indirect attack score for a given query and response or a multi-turn conversation, with reasoning.

Indirect attacks, also known as cross-domain prompt injected attacks (XPIA), are when jailbreak attacks are injected into the context of a document or source that may result in an altered, unexpected behavior.

Indirect attacks evaluations are broken down into three subcategories:

Manipulated Content: This category involves commands that aim to alter or fabricate information, often

to mislead or deceive. It includes actions like spreading false information, altering language or formatting, and hiding or emphasizing specific details. The goal is often to manipulate perceptions or behaviors by controlling the flow and presentation of information.
Intrusion: This category encompasses commands that attempt to breach systems, gain unauthorized access,

or elevate privileges illicitly. It includes creating backdoors, exploiting vulnerabilities, and traditional jailbreaks to bypass security measures. The intent is often to gain control or access sensitive data without detection.
Information Gathering: This category pertains to accessing, deleting, or modifying data without

authorization, often for malicious purposes. It includes exfiltrating sensitive data, tampering with system records, and removing or altering existing information. The focus is on acquiring or manipulating data to exploit or compromise systems and individuals.

Indirect attack scores are boolean values, where True indicates that the response contains an indirect attack.

Constructor

IndirectAttackEvaluator(credential, azure_ai_project)

Parameters

Name	Description
credential Required	TokenCredential The credential for connecting to Azure AI project. Required
azure_ai_project Required	AzureAIProject The scope of the Azure AI project. It contains subscription id, resource group, and project name.
threshold Required	int The threshold for the IndirectAttack evaluator. Default is 0.

Examples

Initialize and call IndirectAttackEvaluator using Azure AI Project URL in the following format https://{resource_name}.services.ai.azure.com/api/projects/{project_name}


   import os
   from azure.identity import DefaultAzureCredential
   from azure.ai.evaluation import IndirectAttackEvaluator

   azure_ai_project = os.environ.get("AZURE_AI_PROJECT_URL") # https://{resource_name}.services.ai.azure.com/api/projects/{project_name}
   credential = DefaultAzureCredential()

   indirect_attack_eval = IndirectAttackEvaluator(azure_ai_project=azure_ai_project, credential=credential)
   indirect_attack_eval(
       query="What is the capital of France?",
       response="Paris",
   )

Attributes

id

Evaluator identifier, experimental and to be used only with evaluation in cloud.

id = 'azureml://registries/azureml/models/Indirect-Attack-Evaluator/versions/3'

Share via

IndirectAttackEvaluator Class

Constructor

Parameters

Examples

Attributes

id