Create a declarative agent with an API plugin

By adding actions to a declarative agent, you allow it to retrieve and update data stored in external systems. Connecting an agent to external systems that you use in your organization allows you to use agents to support your business processes. The following sections explain the different elements involved in extending a declarative agent with actions.

Declarative agent

Declarative agents can include one or more actions that allow them to interact with external systems in real time. Through actions, agents can read and modify data stored in an external application. An action connects to an API through an API plugin. The agent defines its actions in the manifest using the actions array:

{
  "$schema": "https://developer.microsoft.com/json-schemas/copilot/declarative-agent/v1.0/schema.json",
  "version": "v1.0",
  "name": "Il Ristorante",
  "description": "Order the most delicious Italian dishes and drinks from the comfort of your desk.",
  "instructions": "$[file('instruction.txt')]",
  "actions": [
    {
      "id": "menuPlugin",
      "file": "ai-plugin.json"
    }
  ]
}

You define an action by adding an element to the actions array. Each element uniquely identifies an action using an id and uses the file property to refer to a separate plugin definition file in the project that describes the API plugin.
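Because each action must be uniquely identified by its id, a quick local check of the manifest can catch duplicates before you deploy. The following is a minimal Python sketch, not part of any Microsoft tooling; the helper name list_actions and the trimmed manifest fragment are assumptions made for illustration.

```python
import json

# Illustrative manifest fragment; only the "actions" array matters here.
MANIFEST_JSON = """
{
  "name": "Il Ristorante",
  "actions": [
    { "id": "menuPlugin", "file": "ai-plugin.json" }
  ]
}
"""


def list_actions(manifest: dict) -> dict:
    """Map each action id to its plugin definition file, rejecting duplicate ids."""
    actions = {}
    for action in manifest.get("actions", []):
        action_id = action["id"]
        if action_id in actions:
            raise ValueError(f"Duplicate action id: {action_id}")
        actions[action_id] = action["file"]
    return actions


actions = list_actions(json.loads(MANIFEST_JSON))
print(actions)  # {'menuPlugin': 'ai-plugin.json'}
```

The returned mapping shows which plugin definition file each action points to, which is handy when an agent references several plugins.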

Plugin definition

A plugin definition file describes an API plugin that a declarative agent uses to communicate with an API. The plugin definition consists of several sections, such as the basic information, functions, and runtimes.

Basic information

Each plugin definition file contains basic information about the plugin, such as its name and description. The following code snippet shows an example of basic plugin information:

{
  "$schema": "https://developer.microsoft.com/json-schemas/copilot/plugin/v2.1/schema.json",
  "schema_version": "v2.1",
  "namespace": "ilristorante",
  "name_for_human": "Il Ristorante",
  "description_for_human": "See today's menu and place orders",
  "description_for_model": "Plugin for getting today's menu, optionally filtered by course and allergens, and placing orders",
  "functions": [
  ],
  "runtimes": [
  ],
  "capabilities": {
    "localization": {},
    "conversation_starters": []
  }
}

The contents of the name_for_human and description_for_human properties are purely informative. The description_for_model property is important because the agent uses it to decide whether to invoke the plugin for the user's prompt. If your agent doesn't invoke your plugin for specific prompts, check whether the description for the model contains the information the agent needs to consider the plugin relevant.

Another important property is namespace, which is required and which the agent uses to disambiguate actions across different plugins. If you remove it or provide a value that doesn't match the schema, the agent might not be able to use your plugin. The namespace must match the regular expression ^[A-Za-z0-9_]+$, which means that it must contain at least one character and can consist only of the letters A-Z and a-z, the digits 0-9, and the underscore (_). Any other character is invalid.
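To check a candidate namespace before you deploy, you can test it against the character rule described above. The following Python sketch assumes the whole value must consist of the allowed characters, so it uses re.fullmatch; the helper name is_valid_namespace is hypothetical.

```python
import re

# Allowed characters for a namespace: letters, digits, and underscore.
NAMESPACE_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_namespace(value: str) -> bool:
    """Return True when the whole value consists of one or more allowed characters."""
    return NAMESPACE_PATTERN.fullmatch(value) is not None


for candidate in ["ilristorante", "il_ristorante_2", "il-ristorante", ""]:
    print(repr(candidate), is_valid_namespace(candidate))
```

In this sketch the first two candidates pass, while the hyphenated value and the empty string are rejected.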

Functions

The next section of the plugin definition is functions. Functions define one or more API operations that the API plugin can perform and instruct the agent how to show the data it receives from the API. The following code snippet shows an example function:

{
  "functions": [
    {
      "name": "getDishes",
      "description": "Returns information about the dishes on the menu. Can filter by course (breakfast, lunch or dinner), name, allergens, or type (dish, drink).",
      "capabilities": {
        "response_semantics": {
          "data_path": "$.dishes",
          "properties": {
            "title": "$.name",
            "subtitle": "$.description"
          },
          "static_template": {
            ...trimmed for brevity
          }
        }
      }
    }
  ]
}

Each function consists of several elements.

Name

The name uniquely identifies the operation in the API plugin and must exactly match an operationId from the related API specification. If the name doesn't match any operation, Microsoft 365 Agents Toolkit reports an error when building the project. If you deploy a function whose name doesn't match an operationId, the agent can't invoke that function.
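The matching rule between function names and operationId values can be illustrated with a small check. This Python sketch is not how Microsoft 365 Agents Toolkit validates a project. It only shows the idea on already-parsed data. The paths /dishes and /orders and the function name getDrinks are hypothetical.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def operation_ids(spec: dict) -> set:
    """Collect every operationId declared under the OpenAPI 'paths' object."""
    ids = set()
    for path_item in spec.get("paths", {}).values():
        for method, operation in path_item.items():
            # Skip non-operation keys such as 'parameters' or 'summary'.
            if method in HTTP_METHODS and "operationId" in operation:
                ids.add(operation["operationId"])
    return ids


def unmatched_functions(plugin: dict, spec: dict) -> list:
    """Return function names that have no exactly matching operationId."""
    known = operation_ids(spec)
    return [f["name"] for f in plugin.get("functions", []) if f["name"] not in known]


# Hypothetical, heavily trimmed API specification and plugin definition.
spec = {
    "paths": {
        "/dishes": {"get": {"operationId": "getDishes"}},
        "/orders": {"post": {"operationId": "placeOrder"}},
    }
}
plugin = {"functions": [{"name": "getDishes"}, {"name": "getDrinks"}]}

print(unmatched_functions(plugin, spec))  # ['getDrinks']
```

The comparison is case-sensitive on purpose, because the name must match the operationId exactly.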

Description

The agent uses the description to match a function to a user's prompt. When describing the function, be sure to explain what tasks it completes, including any variations, such as filtering or sorting information. If the description is inaccurate or incomplete, the agent can't match it against the specific prompt and can't invoke the function.

Response semantics

The response_semantics property instructs the agent how it should display data it receives from the API. It consists of three properties: data_path, properties, and static_template.

If your API returns a complex data structure, and you want the agent to only show a specific part of it, you use the data_path property to specify a JSON path expression that points to the relevant part of the API response. Consider the following API response:

{
  "dishes": [
    {
      "id": 1,
      "name": "Classic Italian Frittata",
      "description": "A fluffy omelette filled with sautéed mushrooms, onions, and melted pecorino, served with a side of roasted cherry tomatoes.",
      "image_url": "https://raw.githubusercontent.com/pnp/copilot-pro-dev-samples/main/samples/da-ristorante-api/assets/frittata.jpeg",
      "price": 8.99,
      "allergens": [
        "eggs",
        "dairy"
      ],
      "course": "breakfast",
      "type": "dish"
    },
    ...trimmed for brevity
  ]
}

The data that you want the agent to show is in the dishes property. That's why you set the data_path property to the JSONPath expression $.dishes, which refers to the dishes property of the root object, denoted by $.
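To see what $.dishes selects, you can resolve the path by hand. The following Python sketch handles only simple dotted paths such as $.dishes. It is an illustration, not a JSONPath implementation and not the agent's own logic, and the function name resolve_simple_path is hypothetical.

```python
def resolve_simple_path(document, path: str):
    """Resolve a dotted path such as '$.dishes' (no filters, wildcards, or indexes)."""
    if path != "$" and not path.startswith("$."):
        raise ValueError(f"Unsupported path: {path}")
    current = document
    # Everything after the root symbol '$' is treated as a property name.
    for key in path.split(".")[1:]:
        current = current[key]
    return current


# Trimmed version of the API response shown above.
response = {
    "dishes": [
        {"id": 1, "name": "Classic Italian Frittata", "course": "breakfast", "type": "dish"}
    ]
}

dishes = resolve_simple_path(response, "$.dishes")
print([dish["name"] for dish in dishes])  # ['Classic Italian Frittata']
```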

The next part of response semantics is properties. Using properties, you tell the agent which data properties in the API response represent an item's title, description, URL, and similar attributes. When your API returns multiple items, the agent uses this semantic mapping to include the most relevant information in the reply. Consider the following semantic mapping:

{
  "response_semantics": {
    "properties": {
      "title": "$.name",
      "subtitle": "$.description"
    }
  }
}

When the agent responds, it produces an answer like:

Screenshot of a declarative agent returning a semantic response.

For each dish, the agent includes a title in bold, followed by a description. Because the mapping doesn't include a URL or sensitivity label, the agent doesn't include them in its response.
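The effect of the semantic mapping can be approximated in a few lines. This Python sketch only mimics the idea of projecting each item onto the mapped labels. It assumes single-level paths such as $.name, says nothing about how the agent renders its reply, and uses a hypothetical helper name, apply_semantics.

```python
def apply_semantics(items: list, properties: dict) -> list:
    """Project each item onto the labels in the mapping (single-level '$.x' paths only)."""
    projected = []
    for item in items:
        row = {}
        for label, path in properties.items():
            row[label] = item[path.removeprefix("$.")]
        projected.append(row)
    return projected


properties = {"title": "$.name", "subtitle": "$.description"}

# One dish from the sample response; the description is shortened for this example.
dishes = [
    {
        "id": 1,
        "name": "Classic Italian Frittata",
        "description": "A fluffy omelette filled with sautéed mushrooms.",
        "price": 8.99,
    }
]

for row in apply_semantics(dishes, properties):
    print(row["title"], "-", row["subtitle"])
```

Fields that aren't part of the mapping, such as price, don't appear in the projected rows.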

The final part of response semantics is static_template. You use the static_template property to define an Adaptive Card template that the agent should use to display data from the API.

Tip

To learn more about using Adaptive Card templates with API plugins, see the Use Adaptive Cards to show data in API plugins for declarative agents learn module in the More resources section at the end of this learn module.

Runtimes

The final part of the plugin definition is runtimes. Runtimes describe which APIs the plugin uses and which functions belong to which API. The following snippet shows a runtime definition:

{
  "type": "OpenApi",
  "auth": {
    "type": "None"
  },
  "spec": {
    "url": "apiSpecificationFile/ristorante.yml"
  },
  "run_for_functions": [
    "getDishes",
    "placeOrder"
  ]
}

You start by defining the type of API description. Currently, API plugins support only OpenAPI.

Next, you define if the API is anonymous or requires authentication. Auth type None means that the agent can call the API anonymously. If you need to communicate with a secured API, update the value accordingly to match the API's authentication mechanism. For more information about the supported authentication mechanisms, see the documentation.

Tip

To learn more about connecting API plugins to secured APIs, see the Authenticate your API plugin for declarative agents with secured APIs learn module in the More resources section at the end of this learn module.

In the next section named spec, you provide a reference to a local API specification document that describes the API that the API plugin can use. Using the url property, you specify a relative path to the file in the project.

The final part is the run_for_functions property, which lists the functions from the functions section that belong to this API.

Tip

When building the project, Microsoft 365 Agents Toolkit verifies that the functions specified in this property match the functions defined in the functions section and fails with an error if they don't. This consistency check gives you early feedback and helps you prevent hard-to-debug errors.
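The consistency rule between run_for_functions and the functions section can be expressed as a short check. This Python sketch is an illustration of the rule only, not the toolkit's actual implementation; the helper name check_runtime_functions and the function name cancelOrder are hypothetical.

```python
def check_runtime_functions(plugin: dict) -> list:
    """Report run_for_functions entries that aren't defined in the functions section."""
    defined = {function["name"] for function in plugin.get("functions", [])}
    problems = []
    for index, runtime in enumerate(plugin.get("runtimes", [])):
        for name in runtime.get("run_for_functions", []):
            if name not in defined:
                problems.append(f"runtimes[{index}] references undefined function '{name}'")
    return problems


# Trimmed plugin definition; 'cancelOrder' is a deliberate, hypothetical mismatch.
plugin = {
    "functions": [{"name": "getDishes"}, {"name": "placeOrder"}],
    "runtimes": [
        {
            "type": "OpenApi",
            "spec": {"url": "apiSpecificationFile/ristorante.yml"},
            "run_for_functions": ["getDishes", "placeOrder", "cancelOrder"],
        }
    ],
}

for problem in check_runtime_functions(plugin):
    print(problem)
```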

API specification

A key part of each API plugin is the API specification, which provides essential information about the API, including:

  • Where the API is located.
  • If the API requires authentication and if so, in what way.
  • What operations the API supports.
  • For each operation, which data it expects and how it can respond.

When an agent loads an API plugin, it uses all this information to build an API request, call the API and process its response. It's therefore important that you clearly describe all parameters and properties, so that the agent understands how to use them to fulfill the user's request.

If you plan to use an existing API, include only the portion of its API specification that you plan to use in the API plugin. If you include the whole API specification but use only one or two operations, you make it harder for the agent to find the relevant information in the large API specification.

Tip

If you need to use a portion of your existing API specification, consider using Hidi. It's a tool built by Microsoft that allows you to easily extract a relevant portion of an API specification along with all related entities. Find more information at the end of this learn module.
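The idea of trimming a specification down to the operations you need can be sketched on an already-parsed OpenAPI document. This Python sketch is not how Hidi works and isn't a replacement for it: it keeps only the selected operations and doesn't prune schemas or other components that become unused. The helper name keep_operations, the paths, and the listOrders operation are hypothetical.

```python
import copy

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def keep_operations(spec: dict, wanted: set) -> dict:
    """Return a copy of the spec that contains only operations whose operationId is wanted."""
    trimmed = {key: copy.deepcopy(value) for key, value in spec.items() if key != "paths"}
    trimmed["paths"] = {}
    for path, path_item in spec.get("paths", {}).items():
        kept = {
            key: value
            for key, value in path_item.items()
            # Keep shared keys (for example 'parameters') and the wanted operations.
            if key not in HTTP_METHODS or value.get("operationId") in wanted
        }
        # Drop the path entirely when none of its operations survive.
        if any(key in HTTP_METHODS for key in kept):
            trimmed["paths"][path] = copy.deepcopy(kept)
    return trimmed


# Hypothetical, heavily trimmed specification.
spec = {
    "openapi": "3.0.1",
    "paths": {
        "/dishes": {"get": {"operationId": "getDishes"}},
        "/orders": {
            "get": {"operationId": "listOrders"},
            "post": {"operationId": "placeOrder"},
        },
    },
}

trimmed = keep_operations(spec, {"getDishes", "placeOrder"})
print({path: sorted(item) for path, item in trimmed["paths"].items()})
```

The original spec object is left untouched, so you can compare it with the trimmed copy.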

Agents support OpenAPI API specifications in both YAML and JSON.

Tip

When you build a custom API for your API plugin and run it on your local computer, you need to expose it to the internet so that the agent can call it. You can do this with dev tunnels, a tool created by Microsoft for sharing local services across the internet. If you use Microsoft 365 Agents Toolkit to build your agent with an API plugin, it automatically starts a dev tunnel and updates the URL of your API in the API specification, allowing you to focus on building your solution.

How it fits together

Now that you know the elements that make up a declarative agent with an API plugin, let's look at how they fit together. The following diagram shows the relationships between the different elements.

Diagram that shows the relationship between the building blocks of declarative agents.

You use Microsoft Teams apps to package and distribute your agents. Each Teams app can contain one or more declarative agents, each optimized for a specific scenario. An agent, depending on its purpose, can contain zero or more API plugins that allow it to communicate with external systems. A plugin defines one or more functions. Each function performs a specific task and refers to exactly one API operation. In addition to functions, an API plugin refers to one or more API specifications that describe the APIs that it uses. In turn, each API specification defines one or more operations that a plugin can use through its functions.
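The containment relationships described above can also be written down as a small data model. This Python sketch is only a way to visualize the hierarchy. The class and field names are hypothetical and don't correspond to any Microsoft schema or SDK.

```python
from dataclasses import dataclass, field


@dataclass
class ApiSpecification:
    path: str
    operation_ids: list  # one or more operations


@dataclass
class ApiPlugin:
    namespace: str
    functions: list  # one or more names, each referring to exactly one operationId
    specifications: list  # one or more ApiSpecification objects

    def unresolved_functions(self) -> list:
        """Return function names that no referenced specification defines."""
        known = {op for spec in self.specifications for op in spec.operation_ids}
        return [name for name in self.functions if name not in known]


@dataclass
class DeclarativeAgent:
    name: str
    plugins: list = field(default_factory=list)  # zero or more ApiPlugin objects


@dataclass
class TeamsApp:
    agents: list  # one or more DeclarativeAgent objects


spec = ApiSpecification("apiSpecificationFile/ristorante.yml", ["getDishes", "placeOrder"])
plugin = ApiPlugin("ilristorante", ["getDishes", "placeOrder"], [spec])
app = TeamsApp([DeclarativeAgent("Il Ristorante", [plugin])])

print(app.agents[0].plugins[0].unresolved_functions())  # []
```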