Responsible AI FAQs for SharePoint Agreements

An AI system includes not only the technology, but also the people who use it, the people affected by it, and the environment in which it's deployed. Microsoft's Responsible AI FAQs are intended to help you understand how AI technology works, the choices system owners and users can make that influence system performance and behavior, and the importance of thinking about the whole system, including the technology, the people, and the environment. You can use Responsible AI FAQs to better understand specific AI systems and features that Microsoft develops.

Responsible AI FAQs are part of a broader effort to put Microsoft's AI principles into practice. To find out more, see Microsoft AI principles.

AI-driven features in this app

This app includes an AI-driven feature designed to enhance your experience. To explore its capabilities and understand its impact, select the feature name to learn more.

FAQs for automatic field detection in templates

What is automatic field detection in templates?

Automatic field detection helps template creators quickly identify and insert relevant fields into their templates using AI. It scans documents, detects key content, and suggests fields for review and insertion-minimizing manual effort and saving time.

What can automatic field detection do?

Automatic field detection analyzes the content of a document and automatically suggests fields such as:

  • Standard fields: First party, Second party, Effective date, Expiration date.

  • New field suggestions: Governing law, Renewal term, and Notice to terminate renewal.

Users can review these suggestions and insert them into the template with a single selection.

What are the intended uses of automatic field detection?

Automatic field detection is designed to streamline the template creation process by:

  • Reducing manual field tagging.

  • Ensuring consistency in field usage.

  • Supporting metadata standardization for downstream workflows like search, reports, and notifications.

How was automatic field detection evaluated? What metrics were used to measure performance?

Performance factors such as precision, recall, and accuracy relied on the base model's performance (in this case, GPT-4o).

To evaluate feature specific performance, testing was done on the open-source dataset CUAD v1:

  • The CUAD dataset included 500+ legal contracts with more than 40 annotations. Relevant field annotations from this dataset were used as ground truth and compared with the field predictions generated by the LLM prompt.

  • Prompts for each field were customized to improve performance, with precision being the most important metric.

Evaluated risk and safety metrics.

Setup: We used custom evaluation flows in Azure Prompt Flow to simulate real-world usage of the feature. These flows combined metadata prompts, system prompts, and user inputs-such as questions or document content-and were executed on the same base model (GPT-4o) with identical configuration settings.

Assessment: Evaluated test cases with standard legal contracts which included jail break prompts and injection attacks, over 100 test cases for both.

Evaluation: Successful output would either ignore the malicious prompt, or trigger AOAI's content management policy leading to output getting filtered out. This red teaming evaluation was found to be successful on 200+ test cases.

What are the limitations of automatic field detection?

  • The feature can only detect fields based on the text content within a template. Its detection scope is limited to the content of the specific template being processed. It does not consider information outside that template.

  • The potential image, person/group, and choice fields won't be captured.

  • Users must manually review and confirm field suggestions to ensure accuracy of the detected content, its occurrence, and also its corresponding field suggestion.

What operational factors and settings allow for effective and responsible use?

  • Users can rerun the detection process to capture newly added content.

  • Ignored suggestions will reappear upon rerunning, allowing iterative refinement.

  • New field suggestions can be edited before insertion to ensure alignment with organizational standards