# Use custom and local AI models with the Semantic Kernel SDK
This article demonstrates how to integrate custom and local models into the Semantic Kernel SDK and use them for text generation and chat completion.

You can adapt these steps to any model you have access to, regardless of where or how you access it. For example, you can integrate the codellama model with the Semantic Kernel SDK to enable code generation and discussion.

Custom and local models often provide access through REST APIs; for an example, see Ollama OpenAI compatibility. Before you integrate your model, it must be hosted and accessible to your .NET application over HTTPS.
## Prerequisites

- An Azure account with an active subscription. Create an account for free.
- The .NET SDK
- The `Microsoft.SemanticKernel` NuGet package
- A custom or local model, deployed and accessible to your .NET application
## Implement text generation using a local model

The following section shows how you can integrate your model with the Semantic Kernel SDK and then use it to generate text completions.
Create a service class that implements the `ITextGenerationService` interface. For example:

```csharp
using System.Collections.Immutable;
using System.Net.Http.Json;
using System.Runtime.CompilerServices;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.TextGeneration;

class MyTextGenerationService : ITextGenerationService
{
    private IReadOnlyDictionary<string, object?>? _attributes;
    public IReadOnlyDictionary<string, object?> Attributes =>
        _attributes ??= new Dictionary<string, object?>();

    public string ModelUrl { get; init; } = "<default url to your model's Chat API>";
    public required string ModelApiKey { get; init; }

    public async IAsyncEnumerable<StreamingTextContent> GetStreamingTextContentsAsync(
        string prompt,
        PromptExecutionSettings? executionSettings = null,
        Kernel? kernel = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default
    )
    {
        // Build your model's request object, specify that streaming is requested
        MyModelRequest request = MyModelRequest.FromPrompt(prompt, executionSettings);
        request.Stream = true;

        // Send the completion request via HTTP
        using var httpClient = new HttpClient();

        // Send a POST to your model with the serialized request in the body
        using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
            ModelUrl,
            request,
            cancellationToken
        );

        // Verify the request was completed successfully
        httpResponse.EnsureSuccessStatusCode();

        // Read your model's response as a stream
        using StreamReader reader =
            new(await httpResponse.Content.ReadAsStreamAsync(cancellationToken));

        // Iteratively read a chunk of the response until the end of the stream
        // It is more efficient to use a buffer that is the same size as the internal buffer of the stream
        // If the size of the internal buffer was unspecified when the stream was constructed,
        // its default size is 4 kilobytes (2048 UTF-16 characters)
        char[] buffer = new char[2048];
        while (!reader.EndOfStream)
        {
            // Check the cancellation token with each iteration
            cancellationToken.ThrowIfCancellationRequested();

            // Fill the buffer with the next set of characters, track how many characters were read
            int readCount = reader.Read(buffer, 0, buffer.Length);

            // Convert the character buffer to a string, only include as many characters as were just read
            string chunk = new(buffer, 0, readCount);

            yield return new StreamingTextContent(chunk);
        }
    }

    public async Task<IReadOnlyList<TextContent>> GetTextContentsAsync(
        string prompt,
        PromptExecutionSettings? executionSettings = null,
        Kernel? kernel = null,
        CancellationToken cancellationToken = default
    )
    {
        // Build your model's request object
        MyModelRequest request = MyModelRequest.FromPrompt(prompt, executionSettings);

        // Send the completion request via HTTP
        using var httpClient = new HttpClient();

        // Send a POST to your model with the serialized request in the body
        using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
            ModelUrl,
            request,
            cancellationToken
        );

        // Verify the request was completed successfully
        httpResponse.EnsureSuccessStatusCode();

        // Deserialize the response body to your model's response object
        // Handle when the deserialization fails and returns null
        MyModelResponse response =
            await httpResponse.Content.ReadFromJsonAsync<MyModelResponse>(cancellationToken)
            ?? throw new Exception("Failed to deserialize response from model");

        // Convert your model's response into a list of TextContent
        return response
            .Completions.Select<string, TextContent>(completion => new(completion))
            .ToImmutableList();
    }
}
```
Include the new service class when building the `Kernel`. For example:

```csharp
IKernelBuilder builder = Kernel.CreateBuilder();

// Add your text generation service as a singleton instance
builder.Services.AddKeyedSingleton<ITextGenerationService>(
    "myTextService1",
    new MyTextGenerationService
    {
        // Specify any properties specific to your service, such as the url or API key
        ModelUrl = "https://localhost:38748",
        ModelApiKey = "myApiKey"
    }
);

// Alternatively, add your text generation service as a factory method
builder.Services.AddKeyedSingleton<ITextGenerationService>(
    "myTextService2",
    (_, _) =>
        new MyTextGenerationService
        {
            // Specify any properties specific to your service, such as the url or API key
            ModelUrl = "https://localhost:38748",
            ModelApiKey = "myApiKey"
        }
);

// Add any other Kernel services or configurations
// ...

Kernel kernel = builder.Build();
```
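Because both registrations above are keyed, you can also resolve a specific instance by passing its key to `Kernel.GetRequiredService`. A minimal sketch, assuming the `"myTextService2"` factory registration from the example above:

```csharp
// Resolve a specific keyed registration rather than the default service
ITextGenerationService keyedService =
    kernel.GetRequiredService<ITextGenerationService>("myTextService2");
```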
Send a text generation prompt to your model directly through the `Kernel` class or by using the service class. For example:

```csharp
var executionSettings = new PromptExecutionSettings
{
    // Add execution settings, such as the ModelID and ExtensionData
    ModelId = "MyModelId",
    ExtensionData = new Dictionary<string, object> { { "MaxTokens", 500 } }
};

// Send a prompt to your model directly through the Kernel
// The Kernel response will be null if the model can't be reached
string prompt = "Please list three services offered by Azure";
string? response = await kernel.InvokePromptAsync<string>(prompt);
Console.WriteLine($"Output: {response}");

// Alternatively, send a prompt to your model through the text generation service
ITextGenerationService textService = kernel.GetRequiredService<ITextGenerationService>();
TextContent responseContents = await textService.GetTextContentAsync(
    prompt,
    executionSettings
);
Console.WriteLine($"Output: {responseContents.Text}");
```
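The streaming method implemented earlier can be consumed with `await foreach`. A minimal sketch, reusing `textService`, `prompt`, and `executionSettings` from the previous example:

```csharp
// Stream the completion and print each chunk as it arrives
await foreach (
    StreamingTextContent chunk in textService.GetStreamingTextContentsAsync(
        prompt,
        executionSettings
    )
)
{
    Console.Write(chunk.Text);
}
Console.WriteLine();
```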
## Implement chat completion using a local model

The following section shows how you can integrate your model with the Semantic Kernel SDK and then use it for chat completion.
Create a service class that implements the `IChatCompletionService` interface. For example:

```csharp
using System.Collections.Immutable;
using System.Net.Http.Json;
using System.Runtime.CompilerServices;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

class MyChatCompletionService : IChatCompletionService
{
    private IReadOnlyDictionary<string, object?>? _attributes;
    public IReadOnlyDictionary<string, object?> Attributes =>
        _attributes ??= new Dictionary<string, object?>();

    public string ModelUrl { get; init; } = "<default url to your model's Chat API>";
    public required string ModelApiKey { get; init; }

    public async Task<IReadOnlyList<ChatMessageContent>> GetChatMessageContentsAsync(
        ChatHistory chatHistory,
        PromptExecutionSettings? executionSettings = null,
        Kernel? kernel = null,
        CancellationToken cancellationToken = default
    )
    {
        // Build your model's request object
        MyModelRequest request = MyModelRequest.FromChatHistory(chatHistory, executionSettings);

        // Send the completion request via HTTP
        using var httpClient = new HttpClient();

        // Send a POST to your model with the serialized request in the body
        using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
            ModelUrl,
            request,
            cancellationToken
        );

        // Verify the request was completed successfully
        httpResponse.EnsureSuccessStatusCode();

        // Deserialize the response body to your model's response object
        // Handle when the deserialization fails and returns null
        MyModelResponse response =
            await httpResponse.Content.ReadFromJsonAsync<MyModelResponse>(cancellationToken)
            ?? throw new Exception("Failed to deserialize response from model");

        // Convert your model's response into a list of ChatMessageContent
        return response
            .Completions.Select<string, ChatMessageContent>(completion =>
                new(AuthorRole.Assistant, completion)
            )
            .ToImmutableList();
    }

    public async IAsyncEnumerable<StreamingChatMessageContent> GetStreamingChatMessageContentsAsync(
        ChatHistory chatHistory,
        PromptExecutionSettings? executionSettings = null,
        Kernel? kernel = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default
    )
    {
        // Build your model's request object, specify that streaming is requested
        MyModelRequest request = MyModelRequest.FromChatHistory(chatHistory, executionSettings);
        request.Stream = true;

        // Send the completion request via HTTP
        using var httpClient = new HttpClient();

        // Send a POST to your model with the serialized request in the body
        using HttpResponseMessage httpResponse = await httpClient.PostAsJsonAsync(
            ModelUrl,
            request,
            cancellationToken
        );

        // Verify the request was completed successfully
        httpResponse.EnsureSuccessStatusCode();

        // Read your model's response as a stream
        using StreamReader reader =
            new(await httpResponse.Content.ReadAsStreamAsync(cancellationToken));

        // Iteratively read a chunk of the response until the end of the stream
        // It is more efficient to use a buffer that is the same size as the internal buffer of the stream
        // If the size of the internal buffer was unspecified when the stream was constructed,
        // its default size is 4 kilobytes (2048 UTF-16 characters)
        char[] buffer = new char[2048];
        while (!reader.EndOfStream)
        {
            // Check the cancellation token with each iteration
            cancellationToken.ThrowIfCancellationRequested();

            // Fill the buffer with the next set of characters, track how many characters were read
            int readCount = reader.Read(buffer, 0, buffer.Length);

            // Convert the character buffer to a string, only include as many characters as were just read
            string chunk = new(buffer, 0, readCount);

            yield return new StreamingChatMessageContent(AuthorRole.Assistant, chunk);
        }
    }
}
```
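Note that both service classes declare `ModelApiKey` but never attach it to the outgoing requests. If your model requires authentication, one common approach (an assumption, not something these examples prescribe; adjust to your model's auth scheme) is to add a bearer token header inside each service method before calling `PostAsJsonAsync`:

```csharp
// Hypothetical: attach the API key as a bearer token on the request
httpClient.DefaultRequestHeaders.Authorization =
    new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", ModelApiKey);
```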
Include the new service class when building the `Kernel`. For example:

```csharp
IKernelBuilder builder = Kernel.CreateBuilder();

// Add your chat completion service as a singleton instance
builder.Services.AddKeyedSingleton<IChatCompletionService>(
    "myChatService1",
    new MyChatCompletionService
    {
        // Specify any properties specific to your service, such as the url or API key
        ModelUrl = "https://localhost:38748",
        ModelApiKey = "myApiKey"
    }
);

// Alternatively, add your chat completion service as a factory method
builder.Services.AddKeyedSingleton<IChatCompletionService>(
    "myChatService2",
    (_, _) =>
        new MyChatCompletionService
        {
            // Specify any properties specific to your service, such as the url or API key
            ModelUrl = "https://localhost:38748",
            ModelApiKey = "myApiKey"
        }
);

// Add any other Kernel services or configurations
// ...

Kernel kernel = builder.Build();
```
Send a chat completion prompt to your model directly through the `Kernel` class or by using the service class. For example:

```csharp
var executionSettings = new PromptExecutionSettings
{
    // Add execution settings, such as the ModelID and ExtensionData
    ModelId = "MyModelId",
    ExtensionData = new Dictionary<string, object> { { "MaxTokens", 500 } }
};

// Send a string representation of the chat history to your model directly through the Kernel
// This uses a special syntax to denote the role for each message
// For more information on this syntax see:
// https://learn.microsoft.com/en-us/semantic-kernel/prompts/your-first-prompt?tabs=Csharp
string prompt = """
    <message role="system">the initial system message for your chat history</message>
    <message role="user">the user's initial message</message>
    """;

string? response = await kernel.InvokePromptAsync<string>(prompt);
Console.WriteLine($"Output: {response}");

// Alternatively, send a prompt to your model through the chat completion service
// First, initialize a chat history with your initial system message
string systemMessage = "<the initial system message for your chat history>";
Console.WriteLine($"System Prompt: {systemMessage}");
var chatHistory = new ChatHistory(systemMessage);

// Add the user's input to your chat history
string userRequest = "<the user's initial message>";
Console.WriteLine($"User: {userRequest}");
chatHistory.AddUserMessage(userRequest);

// Get the model's response and add it to the chat history
IChatCompletionService service = kernel.GetRequiredService<IChatCompletionService>();
ChatMessageContent responseMessage = await service.GetChatMessageContentAsync(
    chatHistory,
    executionSettings
);
Console.WriteLine($"Assistant: {responseMessage.Content}");
chatHistory.Add(responseMessage);

// Continue sending and receiving messages between the user and model
// ...
```
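The chat service also implements `GetStreamingChatMessageContentsAsync`, so you can render the assistant's reply incrementally instead of waiting for the full message. A minimal sketch, reusing `service`, `chatHistory`, and `executionSettings` from the previous example:

```csharp
// Stream the assistant's next reply and print each chunk as it arrives
await foreach (
    StreamingChatMessageContent chunk in service.GetStreamingChatMessageContentsAsync(
        chatHistory,
        executionSettings
    )
)
{
    Console.Write(chunk.Content);
}
Console.WriteLine();
```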