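Use App Content Search to create a semantic index of your in-app content. This lets users find information based on meaning rather than keywords alone. The index can also be used to enhance AI assistants with domain-specific knowledge, producing more personalized and contextually relevant results.
Specifically, you will learn how to use the AppContentIndexer API to:
- Create or open an index of the content in your app
- Add text strings to the index and then run a query
- Manage long text string complexity
- Index image data and then search for relevant images
- Enable Retrieval-Augmented Generation (RAG) scenarios
- Use AppContentIndexer on a background thread
- Close AppContentIndexer when it is no longer in use to release resources
Prerequisites
To learn about the Windows AI API hardware requirements and how to configure your device to successfully build apps with the Windows AI APIs, see Get started building an app with Windows AI APIs.
Package identity requirement
Apps that use AppContentIndexer must have package identity, which is available only to packaged apps (including apps packaged with external location). To enable semantic indexing and text recognition (OCR), the app must also declare the systemAIModels capability.
Create or open an index of the content in your app
To create a semantic index of the content in your app, you first establish a searchable structure that the app can use to store and retrieve content efficiently. This index acts as a local semantic and lexical search engine for the app's content.
To use the AppContentIndexer API, first call GetOrCreateIndex with a specified index name. If an index with that name already exists for the current app identity and user, it is opened; otherwise, a new one is created.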
public void SimpleGetOrCreateIndexSample()
{
    GetOrCreateIndexResult result = AppContentIndexer.GetOrCreateIndex("myindex");
    if (!result.Succeeded)
    {
        throw new InvalidOperationException($"Failed to open index. Status = '{result.Status}', Error = '{result.ExtendedError}'");
    }

    // If result.Succeeded is true, result.Status will either be CreatedNew or OpenedExisting
    if (result.Status == GetOrCreateIndexStatus.CreatedNew)
    {
        Console.WriteLine("Created a new index");
    }
    else if (result.Status == GetOrCreateIndexStatus.OpenedExisting)
    {
        Console.WriteLine("Opened an existing index");
    }

    using AppContentIndexer indexer = result.Indexer;
    // Use indexer...
}
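This sample shows error handling for the case where opening the index fails. For simplicity, the other samples in this document may omit error handling.
Add text strings to the index and then run a query
This sample shows how to add some text strings to the index created for your app, and then run a query against that index to retrieve relevant information.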
// This is some text data that we want to add to the index:
Dictionary<string, string> simpleTextData = new Dictionary<string, string>
{
    {"item1", "Here is some information about Cats: Cats are cute and fluffy. Young cats are very playful." },
    {"item2", "Dogs are loyal and affectionate animals known for their companionship, intelligence, and diverse breeds." },
    {"item3", "Fish are aquatic creatures that breathe through gills and come in a vast variety of shapes, sizes, and colors." },
    {"item4", "Broccoli is a nutritious green vegetable rich in vitamins, fiber, and antioxidants." },
    {"item5", "Computers are powerful electronic devices that process information, perform calculations, and enable communication worldwide." },
    {"item6", "Music is a universal language that expresses emotions, tells stories, and connects people through rhythm and melody." },
};

public void SimpleTextIndexingSample()
{
    AppContentIndexer indexer = GetIndexerForApp();

    // Add some text data to the index:
    foreach (var item in simpleTextData)
    {
        IndexableAppContent textContent = AppManagedIndexableAppContent.CreateFromString(item.Key, item.Value);
        indexer.AddOrUpdate(textContent);
    }
}

public void SimpleTextQueryingSample()
{
    AppContentIndexer indexer = GetIndexerForApp();

    // We search the index using a semantic query:
    AppIndexTextQuery queryCursor = indexer.CreateTextQuery("Facts about kittens.");
    IReadOnlyList<TextQueryMatch> textMatches = queryCursor.GetNextMatches(5);

    // Nothing in the index exactly matches what we queried but item1 is similar to the query so we expect
    // that to be the first match.
    foreach (var match in textMatches)
    {
        Console.WriteLine(match.ContentId);
        if (match.ContentKind == QueryMatchContentKind.AppManagedText)
        {
            AppManagedTextQueryMatch textResult = (AppManagedTextQueryMatch)match;

            // Only part of the original string may match the query. So we can use TextOffset and TextLength to extract the match.
            // In this example, we might imagine that the substring "Cats are cute and fluffy" from "item1" is the top match for the query.
            string matchingData = simpleTextData[match.ContentId];
            string matchingString = matchingData.Substring(textResult.TextOffset, textResult.TextLength);
            Console.WriteLine(matchingString);
        }
    }
}
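A QueryMatch contains only the ContentId and TextOffset/TextLength, not the matching text itself. As the app developer, you are responsible for looking up the original text. Query results are sorted by relevance, with the most relevant result first. Indexing happens asynchronously, so a query may run against partially indexed data. You can check the indexing status, as described below.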
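Because a match carries only an offset and a length, a direct Substring call throws if the text your app holds no longer lines up with what was indexed. The following is a minimal sketch of a guard for that case. MatchText.TryExtract is a hypothetical helper, not part of the AppContentIndexer API; it assumes the offset and length are available as int values, as the Substring calls in the sample above suggest.

```csharp
using System;

// Hypothetical helper (not part of the AppContentIndexer API).
public static class MatchText
{
    // Returns true and the matched substring when [offset, offset + length)
    // lies inside source. Returns false when the range is out of bounds,
    // for example because the source text changed after it was indexed.
    public static bool TryExtract(string source, int offset, int length, out string matched)
    {
        matched = string.Empty;
        if (source is null || offset < 0 || length < 0)
        {
            return false;
        }

        // Subtract instead of adding so that offset + length cannot overflow.
        if (offset > source.Length || length > source.Length - offset)
        {
            return false;
        }

        matched = source.Substring(offset, length);
        return true;
    }
}
```

In the query loop above, this could replace the direct Substring call: MatchText.TryExtract(matchingData, textResult.TextOffset, textResult.TextLength, out string matchingString). A false result is a hint that the stored text and the index are out of sync.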
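Manage long text string complexity
This sample shows that the app developer does not need to split text content into smaller sections for model processing. AppContentIndexer handles that complexity.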
Dictionary<string, string> textFiles = new Dictionary<string, string>
{
    {"file1", "File1.txt" },
    {"file2", "File2.txt" },
    {"file3", "File3.txt" },
};

public void TextIndexingSample2()
{
    AppContentIndexer indexer = GetIndexerForApp();
    var folderPath = Windows.ApplicationModel.Package.Current.InstalledLocation.Path;

    // Add some text data to the index:
    foreach (var item in textFiles)
    {
        string contentId = item.Key;
        string filename = item.Value;

        // Note that the text here can be arbitrarily large. The AppContentIndexer will take care of chunking the text
        // in a way that works effectively with the underlying model. We do not require the app author to break the text
        // down into small pieces.
        string text = File.ReadAllText(Path.Combine(folderPath, filename));
        IndexableAppContent textContent = AppManagedIndexableAppContent.CreateFromString(contentId, text);
        indexer.AddOrUpdate(textContent);
    }
}

public void TextIndexingSample2_RunQuery()
{
    AppContentIndexer indexer = GetIndexerForApp();
    var folderPath = Windows.ApplicationModel.Package.Current.InstalledLocation.Path;

    // Search the index
    AppIndexTextQuery query = indexer.CreateTextQuery("Facts about kittens.");
    IReadOnlyList<TextQueryMatch> textMatches = query.GetNextMatches(5);
    if (textMatches != null)
    {
        foreach (var match in textMatches)
        {
            Console.WriteLine(match.ContentId);
            if (match is AppManagedTextQueryMatch textResult)
            {
                // We load the content of the file that contains the match:
                string matchingFilename = textFiles[match.ContentId];
                string fileContent = File.ReadAllText(Path.Combine(folderPath, matchingFilename));

                // Find the substring within the loaded text that contains the match:
                string matchingString = fileContent.Substring(textResult.TextOffset, textResult.TextLength);
                Console.WriteLine(matchingString);
            }
        }
    }
}
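The text data comes from files, but only the content is indexed, not the files themselves. AppContentIndexer has no knowledge of the original files and does not monitor them for updates. If a file's content changes, the app must update the index manually.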
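One way an app might decide when a manual update is needed is to remember a hash of the text it last sent to the index. The following is a minimal sketch under that assumption. IndexedContentTracker is a hypothetical class, not part of the AppContentIndexer API, and it keeps its state in memory only; a real app would probably persist the hashes between runs.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper (not part of the AppContentIndexer API): remembers a
// hash of the text last submitted for each content ID.
public sealed class IndexedContentTracker
{
    private readonly Dictionary<string, string> _hashes = new Dictionary<string, string>();

    // Returns true when the text is new or differs from what was last recorded
    // for contentId, and records the new hash. Returns false when unchanged.
    public bool NeedsReindex(string contentId, string text)
    {
        string hash = ComputeHash(text);
        if (_hashes.TryGetValue(contentId, out var previous) && previous == hash)
        {
            return false;
        }

        _hashes[contentId] = hash;
        return true;
    }

    // SHA-256 of the UTF-8 bytes, encoded as Base64 for easy comparison.
    private static string ComputeHash(string text)
    {
        using SHA256 sha = SHA256.Create();
        byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(text));
        return Convert.ToBase64String(digest);
    }
}
```

With a tracker like this, the indexing loop above could call indexer.AddOrUpdate only when tracker.NeedsReindex(contentId, text) returns true, which avoids resubmitting unchanged files.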
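Index image data and then search for relevant images
This sample shows how to index image data as SoftwareBitmaps and then search for relevant images using a text query.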
// We load the image data from a set of known files and send that image data to the indexer.
// The image data does not need to come from files on disk, it can come from anywhere.
Dictionary<string, string> imageFilesToIndex = new Dictionary<string, string>
{
    {"item1", "Cat.jpg" },
    {"item2", "Dog.jpg" },
    {"item3", "Fish.jpg" },
    {"item4", "Broccoli.jpg" },
    {"item5", "Computer.jpg" },
    {"item6", "Music.jpg" },
};

public void SimpleImageIndexingSample()
{
    AppContentIndexer indexer = GetIndexerForApp();

    // Add some image data to the index.
    foreach (var item in imageFilesToIndex)
    {
        var file = item.Value;
        var softwareBitmap = Helpers.GetSoftwareBitmapFromFile(file);
        IndexableAppContent imageContent = AppManagedIndexableAppContent.CreateFromBitmap(item.Key, softwareBitmap);
        indexer.AddOrUpdate(imageContent);
    }
}

public void SimpleImageIndexingSample_RunQuery()
{
    AppContentIndexer indexer = GetIndexerForApp();

    // We query the index for some data to match our text query.
    AppIndexImageQuery query = indexer.CreateImageQuery("cute pictures of kittens");
    IReadOnlyList<ImageQueryMatch> imageMatches = query.GetNextMatches(5);

    // One of the images that we indexed was a photo of a cat. We expect this to be the first match to match the query.
    foreach (var match in imageMatches)
    {
        Console.WriteLine(match.ContentId);
        if (match.ContentKind == QueryMatchContentKind.AppManagedImage)
        {
            AppManagedImageQueryMatch imageResult = (AppManagedImageQueryMatch)match;
            var matchingFileName = imageFilesToIndex[match.ContentId];

            // It might be that the match is at a particular region in the image. The result includes
            // the subregion of the image that includes the match.
            Console.WriteLine($"Matching file: '{matchingFileName}' at location {imageResult.Subregion}");
        }
    }
}
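Enable Retrieval-Augmented Generation (RAG) scenarios
Retrieval-Augmented Generation (RAG) augments a user's query to a language model with additional relevant data that the model can use when generating its response. The user's query serves as the input to a semantic search, which identifies relevant information in the index. The data returned by the semantic search is then incorporated into the prompt given to the language model, so that it can produce a more accurate and context-aware response.
This sample shows how to use the AppContentIndexer API together with a large language model (LLM) to add contextual data to your app user's query. The sample is generic: it does not assume a specific LLM, and it queries only local data stored in the index that was created (no external calls to the internet). In this sample, Helpers.GetUserPrompt() and Helpers.GetResponseFromChatAgent() are not real functions; they are placeholders used for illustration.
To enable RAG scenarios with the AppContentIndexer API, you can follow this example: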
public void SimpleRAGScenario()
{
    AppContentIndexer indexer = GetIndexerForApp();
    // The files are read from the app's install folder, as in the text indexing sample.
    var folderPath = Windows.ApplicationModel.Package.Current.InstalledLocation.Path;

    // These are some text files that had previously been added to the index.
    // The key is the contentId of the item.
    Dictionary<string, string> data = new Dictionary<string, string>
    {
        {"file1", "File1.txt" },
        {"file2", "File2.txt" },
        {"file3", "File3.txt" },
    };

    string userPrompt = Helpers.GetUserPrompt();

    // We execute a query against the index using the user's prompt string as the query text.
    AppIndexTextQuery query = indexer.CreateTextQuery(userPrompt);
    IReadOnlyList<TextQueryMatch> textMatches = query.GetNextMatches(5);

    StringBuilder promptStringBuilder = new StringBuilder();
    promptStringBuilder.AppendLine("Please refer to the following pieces of information when responding to the user's prompt:");

    // For each of the matches found, we include the relevant snippets of the text files in the augmented query that we send to the language model
    foreach (var match in textMatches)
    {
        if (match is AppManagedTextQueryMatch textResult)
        {
            // We load the content of the file that contains the match:
            string matchingFilename = data[match.ContentId];
            string fileContent = File.ReadAllText(Path.Combine(folderPath, matchingFilename));

            // Find the substring within the loaded text that contains the match:
            string matchingString = fileContent.Substring(textResult.TextOffset, textResult.TextLength);
            promptStringBuilder.AppendLine(matchingString);
            promptStringBuilder.AppendLine();
        }
    }

    promptStringBuilder.AppendLine("Please provide a response to the following user prompt:");
    promptStringBuilder.AppendLine(userPrompt);

    var response = Helpers.GetResponseFromChatAgent(promptStringBuilder.ToString());
    Console.WriteLine(response);
}
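Use AppContentIndexer on a background thread
An AppContentIndexer instance is not tied to a particular thread; it is an agile object that can be used across threads. Some methods of AppContentIndexer and its related types can take a considerable amount of time to complete. For that reason, avoid calling AppContentIndexer APIs directly from the app's UI thread and use a background thread instead.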
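A common way to move synchronous work off the UI thread in C# is Task.Run. The following is a minimal sketch of that general pattern rather than anything specific to AppContentIndexer. BackgroundWork.RunAsync is a hypothetical wrapper introduced here for illustration.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper (not part of the AppContentIndexer API).
public static class BackgroundWork
{
    // Runs a potentially slow, synchronous operation on a thread-pool thread
    // and returns a Task that the caller can await without blocking.
    public static Task<T> RunAsync<T>(Func<T> work)
    {
        if (work is null)
        {
            throw new ArgumentNullException(nameof(work));
        }

        return Task.Run(work);
    }
}
```

From an async UI event handler, a query could then be issued as: var matches = await BackgroundWork.RunAsync(() => indexer.CreateTextQuery("Facts about kittens.").GetNextMatches(5)); so that the UI thread stays responsive while the query runs.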
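Close AppContentIndexer when no longer in use to release resources
AppContentIndexer implements the IClosable interface to control its lifetime. The app should close the indexer when it is no longer in use, which allows AppContentIndexer to release its underlying resources.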
public void IndexerDisposeSample()
{
    var indexer = AppContentIndexer.GetOrCreateIndex("myindex").Indexer;
    // use indexer
    indexer.Dispose();
    // after this point, it would be an error to try to use indexer since it is now Closed.
}
In C# code, the IClosable interface is projected as IDisposable. C# code can therefore use the using pattern with AppContentIndexer instances.
public void IndexerUsingSample()
{
    using var indexer = AppContentIndexer.GetOrCreateIndex("myindex").Indexer;
    // use indexer
    // indexer.Dispose() is automatically called at the end of the scope
}
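If you open the same index multiple times in your app, you must call Close on each instance.
Opening and closing an index is an expensive operation, so you should minimize how often your app does it. For example, an app can store a single AppContentIndexer instance and use it for the lifetime of the app, rather than opening and closing the index for every action it needs to perform.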
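The single-instance approach can be sketched with Lazy<T>, which opens a resource on first use and hands the same instance to every later caller. This is a minimal sketch, not the documented way to manage an indexer. SharedResource<T> is a hypothetical class, and Dispose is assumed to be called once from the app's shutdown code.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of the AppContentIndexer API): opens a
// disposable resource once, on first use, and disposes it once at shutdown.
public sealed class SharedResource<T> : IDisposable where T : class, IDisposable
{
    private readonly Lazy<T> _lazy;
    private bool _disposed;

    public SharedResource(Func<T> open)
    {
        if (open is null)
        {
            throw new ArgumentNullException(nameof(open));
        }

        // ExecutionAndPublication makes the factory run at most once, even
        // when several threads request the value at the same time.
        _lazy = new Lazy<T>(open, LazyThreadSafetyMode.ExecutionAndPublication);
    }

    // The shared instance; created on first access.
    public T Value
    {
        get
        {
            if (_disposed)
            {
                throw new ObjectDisposedException(nameof(SharedResource<T>));
            }

            return _lazy.Value;
        }
    }

    // Disposes the instance if it was ever created. Intended to be called
    // once, from shutdown code, after all other use has finished.
    public void Dispose()
    {
        if (_disposed)
        {
            return;
        }

        _disposed = true;
        if (_lazy.IsValueCreated)
        {
            _lazy.Value.Dispose();
        }
    }
}
```

An app might hold one new SharedResource<AppContentIndexer>(() => AppContentIndexer.GetOrCreateIndex("myindex").Indexer) for its whole lifetime, read Value wherever an indexer is needed, and dispose the holder when the app exits.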