Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Built-in Support for Embedding Generation in the Vector Store
The April 2025 update introduces built-in support for embedding generation directly within the vector store. By configuring an embedding generator, you can now automatically generate embeddings for vector properties without needing to precompute them externally. This feature simplifies workflows and reduces the need for additional preprocessing steps.
Configuring an Embedding Generator
Embedding generators implementing the Microsoft.Extensions.AI
abstractions are supported and can be configured at various levels:
On the Vector Store: You can set a default embedding generator for the entire vector store. This generator will be used for all collections and properties unless overridden.
using Microsoft.Extensions.AI; using Microsoft.SemanticKernel.Connectors.Qdrant; using OpenAI; using Qdrant.Client; var embeddingGenerator = new OpenAIClient("your key") .GetEmbeddingClient("your chosen model") .AsIEmbeddingGenerator(); var vectorStore = new QdrantVectorStore(new QdrantClient("localhost"), new QdrantVectorStoreOptions { EmbeddingGenerator = embeddingGenerator });
On a Collection: You can configure an embedding generator for a specific collection, overriding the store-level generator.
using Microsoft.Extensions.AI; using Microsoft.SemanticKernel.Connectors.Qdrant; using OpenAI; using Qdrant.Client; var embeddingGenerator = new OpenAIClient("your key") .GetEmbeddingClient("your chosen model") .AsIEmbeddingGenerator(); var collectionOptions = new QdrantVectorStoreRecordCollectionOptions<MyRecord> { EmbeddingGenerator = embeddingGenerator }; var collection = new QdrantVectorStoreRecordCollection<ulong, MyRecord>(new QdrantClient("localhost"), "myCollection", collectionOptions);
On a Record Definition: When defining properties programmatically using
VectorStoreRecordDefinition
, you can specify an embedding generator for all properties.using Microsoft.Extensions.AI; using Microsoft.Extensions.VectorData; using Microsoft.SemanticKernel.Connectors.Qdrant; using OpenAI; using Qdrant.Client; var embeddingGenerator = new OpenAIClient("your key") .GetEmbeddingClient("your chosen model") .AsIEmbeddingGenerator(); var recordDefinition = new VectorStoreRecordDefinition { EmbeddingGenerator = embeddingGenerator, Properties = new List<VectorStoreRecordProperty> { new VectorStoreRecordKeyProperty("Key", typeof(ulong)), new VectorStoreRecordVectorProperty("DescriptionEmbedding", typeof(string), dimensions: 1536) } }; var collectionOptions = new QdrantVectorStoreRecordCollectionOptions<MyRecord> { VectorStoreRecordDefinition = recordDefinition }; var collection = new QdrantVectorStoreRecordCollection<ulong, MyRecord>(new QdrantClient("localhost"), "myCollection", collectionOptions);
On a Vector Property Definition: When defining properties programmatically, you can set an embedding generator directly on the property.
using Microsoft.Extensions.AI; using Microsoft.Extensions.VectorData; using OpenAI; var embeddingGenerator = new OpenAIClient("your key") .GetEmbeddingClient("your chosen model") .AsIEmbeddingGenerator(); var vectorProperty = new VectorStoreRecordVectorProperty("DescriptionEmbedding", typeof(string), dimensions: 1536) { EmbeddingGenerator = embeddingGenerator };
Example Usage
The following example demonstrates how to use the embedding generator to automatically generate vectors during both upsert and search operations. This approach simplifies workflows by eliminating the need to precompute embeddings manually.
// The data model
internal class FinanceInfo
{
[VectorStoreRecordKey]
public string Key { get; set; } = string.Empty;
[VectorStoreRecordData]
public string Text { get; set; } = string.Empty;
// Note that the vector property is typed as a string, and
// its value is derived from the Text property. The string
// value will however be converted to a vector on upsert and
// stored in the database as a vector.
[VectorStoreRecordVector(1536)]
public string Embedding => this.Text;
}
// Create an OpenAI embedding generator.
var embeddingGenerator = new OpenAIClient("your key")
.GetEmbeddingClient("your chosen model")
.AsIEmbeddingGenerator();
// Use the embedding generator with the vector store.
var vectorStore = new InMemoryVectorStore(new() { EmbeddingGenerator = embeddingGenerator });
var collection = vectorStore.GetCollection<string, FinanceInfo>("finances");
await collection.CreateCollectionAsync();
// Create some test data.
string[] budgetInfo =
{
"The budget for 2020 is EUR 100 000",
"The budget for 2021 is EUR 120 000",
"The budget for 2022 is EUR 150 000",
"The budget for 2023 is EUR 200 000",
"The budget for 2024 is EUR 364 000"
};
// Embeddings are generated automatically on upsert.
var records = budgetInfo.Select((input, index) => new FinanceInfo { Key = index.ToString(), Text = input });
await collection.UpsertAsync(records);
// Embeddings for the search is automatically generated on search.
var searchResult = collection.SearchAsync(
"What is my budget for 2024?",
top: 1);
// Output the matching result.
await foreach (var result in searchResult)
{
Console.WriteLine($"Key: {result.Record.Key}, Text: {result.Record.Text}");
}
Transition from IVectorizableTextSearch
and IVectorizedSearch
to IVectorSearch
The IVectorizableTextSearch
and IVectorizedSearch
interfaces have been marked as obsolete and replaced by the more unified and flexible IVectorSearch
interface.
This change simplifies the API surface and provides a more consistent approach to vector search operations.
Key Changes
Unified Interface:
- The
IVectorSearch
interface consolidates the functionality of bothIVectorizableTextSearch
andIVectorizedSearch
into a single interface.
- The
Method Renaming:
VectorizableTextSearchAsync
fromIVectorizableTextSearch
has been replaced bySearchAsync
inIVectorSearch
.VectorizedSearchAsync
fromIVectorizedSearch
has been replaced bySearchEmbeddingAsync
inIVectorSearch
.
Improved Flexibility:
- The
SearchAsync
method inIVectorSearch
handles embedding generation, supporting either local embedding, if an embedding generator is configured, or server side embedding. - The
SearchEmbeddingAsync
method inIVectorSearch
allows for direct embedding-based searches, providing a low-level API for advanced use cases.
- The
Return type change for search methods
In addition to the change in naming for search methods, the return type for all search methods have been changed to simplify usage.
The result type of search methods is now IAsyncEnumerable<VectorSearchResult<TRecord>>
, which allows for looping through the results
directly. Previously the returned object contained an IAsyncEnumerable property.
Support for search without vectors / filtered get
The April 2025 update introduces support for finding records using a filter and returning the results with a configurable sort order. This allows enumerating records in a predictable order, which is particularly useful when needing to sync the vector store with an external data source.
Example: Using filtered GetAsync
The following example demonstrates how to use the GetAsync
method with a filter and options to retrieve records from a vector store collection. This approach allows you to apply filtering criteria and sort the results in a predictable order.
// Define a filter to retrieve products priced above $600
Expression<Func<ProductInfo, bool>> filter = product => product.Price > 600;
// Define the options with a sort order
var options = new GetFilteredRecordOptions<ProductInfo>();
options.OrderBy.Descending(product => product.Price);
// Use GetAsync with the filter and sort order
var filteredProducts = await collection.GetAsync(filter, top: 10, options)
.ToListAsync();
This example demonstrates how to use the GetAsync
method to retrieve filtered records and sort them based on specific criteria, such as price.
New methods on IVectorStore
Some new methods are available on the IVectorStore
interface that allow you to perform more operations directly without needing a VectorStoreRecordCollection object.
Check if a Collection Exists
You can now verify whether a collection exists in the vector store without having to create a VectorStoreRecordCollection object.
// Example: Check if a collection exists
bool exists = await vectorStore.CollectionExistsAsync("myCollection", cancellationToken);
if (exists)
{
Console.WriteLine("The collection exists.");
}
else
{
Console.WriteLine("The collection does not exist.");
}
Delete a Collection
A new method allows you to delete a collection from the vector store without having to create a VectorStoreRecordCollection object.
// Example: Delete a collection
await vectorStore.DeleteCollectionAsync("myCollection", cancellationToken);
Console.WriteLine("The collection has been deleted.");
Replacement of VectorStoreGenericDataModel<TKey>
with Dictionary<string, object?>
The vector data abstractions support working with databases where the schema of a collection is not known at build time.
Previously this was supported via the VectorStoreGenericDataModel<TKey>
type, where this model can be used in place of
a custom data model.
In this release, the VectorStoreGenericDataModel<TKey>
has been obsoleted, and the recommended approach is to use
a Dictionary<string, object?>
instead.
As before, a record definition needs to be supplied to determine the schema of the collection. Also note that the
key type required when getting the collection instance is object
, while in the schema it is fixed to string
.
var recordDefinition = new VectorStoreRecordDefinition
{
Properties = new List<VectorStoreRecordProperty>
{
new VectorStoreRecordKeyProperty("Key", typeof(string)),
new VectorStoreRecordDataProperty("Text", typeof(string)),
new VectorStoreRecordVectorProperty("Embedding", typeof(ReadOnlyMemory<float>), 1536)
}
};
var collection = vectorStore.GetCollection<object, Dictionary<string, object?>>("finances", recordDefinition);
var record = new Dictionary<string, object?>
{
{ "Key", "1" },
{ "Text", "The budget for 2024 is EUR 364 000" },
{ "Embedding", vector }
};
await collection.UpsertAsync(record);
var retrievedRecord = await collection.GetAsync("1");
Console.WriteLine(retrievedRecord["Text"]);
Change in Batch method naming convention
The IVectorStoreRecordCollection
interface has been updated to improve consistency in method naming conventions.
Specifically, the batch methods have been renamed to remove the "Batch" part of their names. This change aligns with a more concise naming convention.
Renaming Examples
Old Method:
GetBatchAsync(IEnumerable<TKey> keys, ...)
New Method:GetAsync(IEnumerable<TKey> keys, ...)
Old Method:
DeleteBatchAsync(IEnumerable<TKey> keys, ...)
New Method:DeleteAsync(IEnumerable<TKey> keys, ...)
Old Method:
UpsertBatchAsync(IEnumerable<TRecord> records, ...)
New Method:UpsertAsync(IEnumerable<TRecord> records, ...)
Return type change for batch Upsert method
The return type for the batch upsert method has been changed from IAsyncEnumerable<TKey>
to Task<IReadOnlyList<TKey>>
.
This change impacts how the method is consumed. You can now simply await the result and get back a list of keys.
Previously, to ensure that all upserts were completed, the IAsyncEnumerable had to be completely enumerated.
This simplifies the developer experience when using the batch upsert method.
The CollectionName property has been changed to Name
The CollectionName
property on the IVectorStoreRecordCollection
interface has renamed to Name
.
IsFilterable and IsFullTextSearchable renamed to IsIndexed and IsFullTextIndexed
The IsFilterable
and IsFullTextSearchable
properties on the VectorStoreRecordDataAttribute
and VectorStoreRecordDataProperty
classes have been renamed to
IsIndexed
and IsFullTextIndexed
respectively.
Dimensions are now required for vector attributes and definitions
In the April 2025 update, specifying the number of dimensions has become mandatory when using vector attributes or vector property definitions. This ensures that the vector store always has the necessary information to handle embeddings correctly.
Changes to VectorStoreRecordVectorAttribute
Previously, the VectorStoreRecordVectorAttribute
allowed you to omit the Dimensions
parameter. This is no longer allowed, and the Dimensions
parameter must now be explicitly provided.
Before:
[VectorStoreRecordVector]
public ReadOnlyMemory<float> DefinitionEmbedding { get; set; }
After:
[VectorStoreRecordVector(Dimensions: 1536)]
public ReadOnlyMemory<float> DefinitionEmbedding { get; set; }
Changes to VectorStoreRecordVectorProperty
Similarly, when defining a vector property programmatically using VectorStoreRecordVectorProperty
, the dimensions
parameter is now required.
Before:
var vectorProperty = new VectorStoreRecordVectorProperty("DefinitionEmbedding", typeof(ReadOnlyMemory<float>));
After:
var vectorProperty = new VectorStoreRecordVectorProperty("DefinitionEmbedding", typeof(ReadOnlyMemory<float>), dimensions: 1536);
All collections require the key type to be passed as a generic type parameter
When constructing a collection directly, it is now required to provide the TKey
generic type parameter.
Previously, where some databases allowed only one key type, it was now a required parameter, but to allow using
collections with a Dictionary<string, object?>
type and an object
key type, TKey
must now always be provided.
Not Applicable
These changes are currently only applicable in C#
Not Applicable
These changes are currently only applicable in C#