Using plugins for Retrieval Augmented Generation (RAG)

2024-07-02

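Often, your AI agents must retrieve data from external sources to generate grounded responses. Without this additional context, your AI agents may hallucinate or provide incorrect information. To address this, you can use plugins to retrieve data from external sources.

When considering plugins for Retrieval Augmented Generation (RAG), you should ask yourself three questions:

How will you (or your AI agent) "search" for the required data? Do you need semantic search or classic search?
Do you already know the data the AI agent needs ahead of time (pre-fetched data), or does the AI agent need to retrieve the data dynamically?
How will you keep your data secure and prevent oversharing of sensitive information?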

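Semantic vs classic search

When developing plugins for Retrieval Augmented Generation (RAG), you can use two types of search: semantic search and classic search.

Semantic search

Semantic search uses vector databases to understand and retrieve information based on the meaning and context of the query rather than just matching keywords. This method allows the search engine to grasp the nuances of language, such as synonyms, related concepts, and the overall intent behind a query.

Semantic search excels in environments where user queries are complex, open-ended, or require a deeper understanding of the content. For example, searching for "best smartphones for photography" would yield results that consider the context of photography features in smartphones, rather than just matching the words "best," "smartphones," and "photography."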
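
To make "retrieving by meaning" concrete, the following is a minimal sketch of the ranking step a vector database performs, assuming documents and queries have already been converted to embedding vectors. The `VectorMath` class and its method names are hypothetical and are not part of Semantic Kernel or Azure AI Search; a real vector store would typically use an approximate nearest-neighbor index rather than a full scan.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class VectorMath
{
    // Cosine similarity: dot(a, b) / (|a| * |b|). Values closer to 1 mean "more similar".
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        // Treat zero vectors as having no similarity to anything.
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Returns the key of the document whose vector is most similar to the query vector,
    // or an empty string when there are no documents.
    public static string FindClosest(float[] query, IReadOnlyDictionary<string, float[]> documents)
    {
        return documents
            .OrderByDescending(d => CosineSimilarity(query, d.Value))
            .Select(d => d.Key)
            .FirstOrDefault() ?? string.Empty;
    }
}
```

Because the comparison is made on vectors rather than words, a query can match a document that shares none of its keywords, as long as the embedding model places them close together.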

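When providing an LLM with a semantic search function, you typically only need to define a function with a single search query. The LLM will then use this function to retrieve the necessary information. Below is an example of a semantic search function that uses Azure AI Search to find documents similar to a given query.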

using System.ComponentModel;
using System.Text.Json.Serialization;
using Azure;
using Azure.Search.Documents;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Models;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Embeddings;

public class InternalDocumentsPlugin
{
    private readonly ITextEmbeddingGenerationService _textEmbeddingGenerationService;
    private readonly SearchIndexClient _indexClient;

    // The constructor name must match the class name (InternalDocumentsPlugin)
    public InternalDocumentsPlugin(ITextEmbeddingGenerationService textEmbeddingGenerationService, SearchIndexClient indexClient)
    {
        _textEmbeddingGenerationService = textEmbeddingGenerationService;
        _indexClient = indexClient;
    }

    [KernelFunction("Search")]
    [Description("Search for a document similar to the given query.")]
    public async Task<string> SearchAsync(string query)
    {
        // Convert string query to vector
        ReadOnlyMemory<float> embedding = await _textEmbeddingGenerationService.GenerateEmbeddingAsync(query);

        // Get client for search operations
        SearchClient searchClient = _indexClient.GetSearchClient("default-collection");

        // Configure request parameters
        VectorizedQuery vectorQuery = new(embedding);
        vectorQuery.Fields.Add("vector");

        SearchOptions searchOptions = new() { VectorSearch = new() { Queries = { vectorQuery } } };

        // Perform search request
        Response<SearchResults<IndexSchema>> response = await searchClient.SearchAsync<IndexSchema>(searchOptions);

        // Collect search results
        await foreach (SearchResult<IndexSchema> result in response.Value.GetResultsAsync())
        {
            return result.Document.Chunk; // Return text from first result
        }

        return string.Empty;
    }

    private sealed class IndexSchema
    {
        [JsonPropertyName("chunk")]
        public string Chunk { get; set; }

        [JsonPropertyName("vector")]
        public ReadOnlyMemory<float> Vector { get; set; }
    }
}

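Classic search

Classic search, also known as attribute-based or criteria-based search, relies on filtering and matching exact terms or values within a dataset. It is particularly effective for database queries, inventory searches, and any situation where filtering by specific attributes is necessary.

For example, if a user wants to find all orders placed by a particular customer ID or retrieve products within a specific price range and category, classic search provides precise and reliable results. Classic search, however, is limited by its inability to understand context or variations in language.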
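
As a rough illustration of the attribute-based filtering described above, here is a small in-memory sketch using LINQ. The `Order` record and `OrderQueries` class are hypothetical; a real service would usually push these filters down to a database or an existing API.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical order shape, for illustration only.
public record Order(int OrderId, string CustomerId, string Category, decimal Total);

public static class OrderQueries
{
    // Exact match on the customer ID: either an order belongs to the customer or it does not.
    public static List<Order> ByCustomer(IEnumerable<Order> orders, string customerId) =>
        orders.Where(o => o.CustomerId == customerId).ToList();

    // Exact category match combined with an inclusive range on the order total.
    public static List<Order> ByCategoryAndTotal(
        IEnumerable<Order> orders, string category, decimal min, decimal max) =>
        orders.Where(o => o.Category == category && o.Total >= min && o.Total <= max).ToList();
}
```

The same input always produces the same output here, which is the precision that classic search offers and semantic search does not.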

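Tip

In most cases, your existing services already support classic search. Before implementing a semantic search, consider whether your existing services can provide the necessary context for your AI agents.

Take, for example, a plugin that retrieves customer information from a CRM system using classic search. Here, the AI simply needs to call the GetCustomerInfoAsync function with a customer ID to retrieve the necessary information.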

using System.ComponentModel;
using Microsoft.SemanticKernel;

public class CRMPlugin
{
    private readonly CRMService _crmService;

    public CRMPlugin(CRMService crmService)
    {
        _crmService = crmService;
    }

    [KernelFunction("GetCustomerInfo")]
    [Description("Retrieve customer information based on the given customer ID.")]
    public async Task<Customer> GetCustomerInfoAsync(string customerId)
    {
        return await _crmService.GetCustomerInfoAsync(customerId);
    }
}

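Achieving this kind of exact lookup with semantic search may be impossible or impractical due to the non-deterministic nature of semantic queries.

When to use each

Choosing between semantic and classic search depends on the nature of the query. Semantic search is ideal for content-heavy environments like knowledge bases and customer support, where users might ask questions or look for products using natural language. Classic search, on the other hand, should be used when precision and exact matches are important.

In some scenarios, you may need to combine both approaches to provide comprehensive search capabilities. For example, a chatbot assisting customers in an e-commerce store might use semantic search to understand user queries and classic search to filter products based on specific attributes like price, brand, or availability.

Below is an example of a plugin that combines semantic and classic search to retrieve product information from an e-commerce database.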

using System.ComponentModel;
using Microsoft.SemanticKernel;

public class ECommercePlugin
{
    [KernelFunction("search_products")]
    [Description("Search for products based on the given query.")]
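    public async Task<IEnumerable<Product>> SearchProductsAsync(string query, ProductCategories? category = null, decimal? minPrice = null, decimal? maxPrice = null)
    {
        // Perform semantic and classic search with the given parameters.
        // Placeholder body so the sample compiles; Product and ProductCategories
        // are assumed to be defined elsewhere in your application.
        await Task.CompletedTask;
        throw new System.NotImplementedException();
    }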
}
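
One possible way to fill in such a method is to apply the classic filters first and then rank the remaining items semantically. The sketch below does this in memory. It assumes each product already carries a unit-length embedding (so a dot product equals cosine similarity) and that the query has been embedded by the same model; the `IndexedProduct` record and `HybridSearch` class are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical product shape with a precomputed, unit-length embedding.
public record IndexedProduct(string Name, string Category, decimal Price, float[] Embedding);

public static class HybridSearch
{
    public static List<IndexedProduct> Search(
        IEnumerable<IndexedProduct> products,
        float[] queryEmbedding,
        string category,
        decimal maxPrice,
        int top = 3)
    {
        return products
            // Classic step: exact attribute filters.
            .Where(p => p.Category == category && p.Price <= maxPrice)
            // Semantic step: rank the survivors by similarity to the query.
            .OrderByDescending(p => Dot(p.Embedding, queryEmbedding))
            .Take(top)
            .ToList();
    }

    // Dot product over the shared length of the two vectors.
    private static double Dot(float[] a, float[] b)
    {
        double sum = 0;
        for (int i = 0; i < Math.Min(a.Length, b.Length); i++)
            sum += a[i] * b[i];
        return sum;
    }
}
```

Filtering before ranking keeps hard constraints such as price strict, while the ranking step only decides the order of items that already satisfy them.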

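Dynamic vs pre-fetched data retrieval

When developing plugins for Retrieval Augmented Generation (RAG), you must also consider whether the data retrieval process is static or dynamic. This allows you to optimize the performance of your AI agents by retrieving data only when necessary.

Dynamic data retrieval

In most cases, the user query will determine the data that the AI agent needs to retrieve. For example, a user might ask for the difference between two products. The AI agent would then need to dynamically retrieve the product information from a database or API to generate a response using function calling. It would be impractical to pre-fetch all possible product information ahead of time and give it to the AI agent.

Below is an example of a back-and-forth chat between a user and an AI agent where dynamic data retrieval is necessary.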

Role	Message
🔵User	Can you tell me about the best mattresses?
🔴Assistant (function call)	`Products.Search("mattresses")`
🟢Tool	`[{"id": 25323, "name": "Cloud Nine"},{"id": 63633, "name": "Best Sleep"}]`
🔴Assistant	Sure! We have both Cloud Nine and Best Sleep.
🔵User	What's the difference between them?
🔴Assistant (function call)	`Products.GetDetails(25323)` `Products.GetDetails(63633)`
🟢Tool	`{ "id": 25323, "name": "Cloud Nine", "price": 1000, "material": "Memory foam" }`
🟢Tool	`{ "id": 63633, "name": "Best Sleep", "price": 1200, "material": "Latex" }`
🔴Assistant	Cloud Nine is made of memory foam and costs $1000. Best Sleep is made of latex and costs $1200.
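
The two functions called in the conversation above could be backed by something like the following sketch. It uses a hypothetical in-memory catalog seeded with the two products from the table; the `Category` field and all type names are assumptions added for illustration. In a Semantic Kernel plugin, each public method would additionally carry the `[KernelFunction]` and `[Description]` attributes shown in the earlier samples.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes mirroring the tool responses in the table above.
public record ProductSummary(int Id, string Name);
public record ProductDetails(int Id, string Name, decimal Price, string Material, string Category);

public class ProductsCatalog
{
    // Assumed sample data; a real plugin would query a database or API instead.
    private readonly List<ProductDetails> _items = new()
    {
        new(25323, "Cloud Nine", 1000m, "Memory foam", "mattresses"),
        new(63633, "Best Sleep", 1200m, "Latex", "mattresses"),
    };

    // First call in the conversation: a lightweight search returning only IDs and names.
    public List<ProductSummary> Search(string category) =>
        _items.Where(p => p.Category.Equals(category, StringComparison.OrdinalIgnoreCase))
              .Select(p => new ProductSummary(p.Id, p.Name))
              .ToList();

    // Follow-up call: full details for one product, fetched only when the user asks.
    public ProductDetails GetDetails(int id) =>
        _items.FirstOrDefault(p => p.Id == id)
            ?? throw new KeyNotFoundException($"No product with ID {id}.");
}
```

Splitting the API into a cheap search and a detailed lookup keeps each tool response small, so the agent only pays for the details it actually needs.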

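Pre-fetched data retrieval

Static data retrieval involves fetching data from external sources and always providing it to the AI agent. This is useful when the data is required for every request or when the data is relatively stable and doesn't change frequently.

Take, for example, an agent that always answers questions about the local weather. Assuming you have a WeatherPlugin, you can pre-fetch weather data from a weather API and provide it in the chat history. This allows the agent to generate responses about the weather without wasting time requesting the data from the API.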

using System.Text.Json;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

IKernelBuilder builder = Kernel.CreateBuilder();
builder.AddAzureOpenAIChatCompletion(deploymentName, endpoint, apiKey);
builder.Plugins.AddFromType<WeatherPlugin>();
Kernel kernel = builder.Build();

// Get the weather
FunctionResult weather = await kernel.Plugins.GetFunction("WeatherPlugin", "get_weather").InvokeAsync(kernel);

// Initialize the chat history with the weather.
// Serialize the function's return value rather than the FunctionResult wrapper.
ChatHistory chatHistory = new ChatHistory("The weather is:\n" + JsonSerializer.Serialize(weather.GetValue<object>()));

// Simulate a user message
chatHistory.AddUserMessage("What is the weather like today?");

// Get the answer from the AI agent
IChatCompletionService chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();
var result = await chatCompletionService.GetChatMessageContentAsync(chatHistory);

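Keeping data secure

When retrieving data from external sources, it is important to ensure that the data is secure and that sensitive information is not exposed. To prevent oversharing of sensitive information, you can use the following strategies:

Strategy	Description
Use the user's auth token	Avoid creating service principals used by the AI agent to retrieve information for users. Doing so makes it difficult to verify that a user has access to the retrieved information.
Avoid recreating search services	Before creating a new search service with a vector DB, check whether a service that has the required data already exists. By reusing existing services, you can avoid duplicating sensitive content, leverage existing access controls, and use existing filtering mechanisms that only return data the user has access to.
Store references in vector DBs instead of content	Instead of duplicating sensitive content to vector DBs, you can store references to the actual data. For a user to access this information, their auth token must first be used to retrieve the real data.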
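
The last strategy in the table can be sketched as follows. The vector index holds only document IDs, and the content is fetched from the system of record using the caller's token. All types here are hypothetical, and the token check is a simple stand-in: a real system would validate the token with its identity provider and let the backing service enforce permissions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical vector DB entry: a pointer to the document, not its text.
public record VectorEntry(string DocumentId, float[] Embedding);

// Hypothetical system of record that enforces access per user token.
public class DocumentStore
{
    private readonly Dictionary<string, (string Content, HashSet<string> AllowedTokens)> _documents = new();

    public void Add(string id, string content, params string[] allowedTokens) =>
        _documents[id] = (content, new HashSet<string>(allowedTokens));

    // Returns true and the content only when the token is allowed to read the document.
    public bool TryGetContent(string id, string userToken, out string content)
    {
        if (_documents.TryGetValue(id, out var doc) && doc.AllowedTokens.Contains(userToken))
        {
            content = doc.Content;
            return true;
        }

        content = string.Empty;
        return false;
    }
}

public static class ReferenceResolver
{
    // Turns vector search hits into content, silently dropping anything the user cannot read.
    public static List<string> Resolve(IEnumerable<VectorEntry> hits, DocumentStore store, string userToken)
    {
        var results = new List<string>();
        foreach (VectorEntry hit in hits)
        {
            if (store.TryGetContent(hit.DocumentId, userToken, out string content))
                results.Add(content);
        }
        return results;
    }
}
```

With this layout, a leaked or over-broad vector index exposes only IDs and embeddings, and the access decision stays with the service that owns the data.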

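Next steps

Now that you know how to ground your AI agents with data from external sources, you can learn how to use AI agents to automate business processes. To learn more, see using task automation functions.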

Learn about task automation functions
