Integrated data chunking and embedding in Azure AI Search

Important

This feature is in public preview under Supplemental Terms of Use. The 2023-10-01-Preview REST API supports this feature.

Integrated vectorization adds data chunking and text-to-vector embedding to skills in indexer-based indexing. It also adds text-to-vector conversions to queries.

This capability is preview-only. In the generally available version of vector search and in previous preview versions, data chunking and vectorization rely on external components, and your application code must handle and coordinate each step. In this preview, chunking and vectorization are built into indexing through skills and indexers. You can set up a skillset that chunks data using the Text Split skill, and then calls an embedding model using either the AzureOpenAIEmbedding skill or a custom skill. Any vectorizers used during indexing can also be called on queries to convert text to vectors.
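
As a rough illustration of the skillset described above, the following Python sketch builds a skillset body as a plain dictionary. The property names reflect the 2023-10-01-Preview REST API as understood here and should be checked against the skill reference. The skillset name, resource URI, deployment name, chunk sizes, and enrichment paths are placeholder assumptions, not values from this article.

```python
import json


def build_skillset(name, openai_uri, deployment_id, api_key):
    """Return a skillset body: Text Split for chunking, then one embedding call per chunk."""
    split_skill = {
        "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
        "name": "split",
        "context": "/document",
        "textSplitMode": "pages",
        "maximumPageLength": 2000,  # assumed chunk size, in characters
        "pageOverlapLength": 500,   # assumed overlap between adjacent chunks
        "inputs": [{"name": "text", "source": "/document/content"}],
        "outputs": [{"name": "textItems", "targetName": "pages"}],
    }
    embedding_skill = {
        "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
        "name": "embed",
        "context": "/document/pages/*",  # run once for each chunk
        "resourceUri": openai_uri,
        "deploymentId": deployment_id,
        "apiKey": api_key,
        "inputs": [{"name": "text", "source": "/document/pages/*"}],
        "outputs": [{"name": "embedding", "targetName": "vector"}],
    }
    return {"name": name, "skills": [split_skill, embedding_skill]}


# Hypothetical names and endpoint, for illustration only.
skillset = build_skillset(
    "my-skillset",
    "https://my-openai.openai.azure.com",
    "my-embedding-deployment",
    "<api-key>",
)
print(json.dumps(skillset, indent=2))
```

The embedding skill's context is set to each chunk so that every chunk produced by the Text Split skill gets its own vector.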

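For indexing, integrated vectorization requires:

  • An indexer retrieving data from a supported data source.
  • A skillset that calls the Text Split skill to chunk the data, and either the AzureOpenAIEmbedding skill or a custom skill to vectorize the chunks.
  • One or more indexes to receive the chunked and vectorized content.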

For queries:

  • A vectorizer defined in the index schema, assigned to a vector field, and used automatically at query time to convert a text query to a vector.

Vector conversions are one-way: text-to-vector. There's no vector-to-text conversion for queries or results (for example, you can't convert a vector result to a human-readable string).

Component diagram

The following diagram shows the components of integrated vectorization.

Diagram of components in an integrated vectorization workflow.

Here's a checklist of the components responsible for integrated vectorization:

  • A supported data source for indexer-based indexing.
  • An index that specifies vector fields, and a vectorizer definition assigned to vector fields.
  • A skillset providing a Text Split skill for data chunking, and a skill for vectorization (either the AzureOpenAIEmbedding skill or a custom skill pointing to an external embedding model).
  • Optionally, index projections (also defined in a skillset) to push chunked data to a secondary index.
  • An embedding model, deployed on Azure OpenAI or available through an HTTP endpoint.
  • An indexer for driving the process end-to-end. An indexer also specifies a schedule, field mappings, and properties for change detection.

This checklist focuses on integrated vectorization, but your solution isn't limited to this list. You can add more skills for AI enrichment, create a knowledge store, add semantic ranking, add relevance tuning, and other query features.

Availability and pricing

Integrated vectorization availability is based on the embedding model. If you're using Azure OpenAI, check regional availability.

If you're using a custom skill and an Azure hosting mechanism (such as an Azure function app, Azure Web App, or Azure Kubernetes Service), check the product by region page for feature availability.

Data chunking (Text Split skill) is free and available on all Azure AI services in all regions.

Note

Some older search services created before January 1, 2019 are deployed on infrastructure that doesn't support vector workloads. If you try to add a vector field to a schema and get an error, it's a result of outdated services. In this situation, you must create a new search service to try out the vector feature.

What scenarios can integrated vectorization support?

  • Subdivide large documents into chunks, useful for vector and non-vector scenarios. For vectors, chunks help you meet the input constraints of embedding models. For non-vector scenarios, you might have a chat-style search app where GPT is assembling responses from indexed chunks. You can use vectorized or non-vectorized chunks for chat-style search.

  • Build a vector store where all of the fields are vector fields, and the document ID (required for a search index) is the only string field. Query the vector store to retrieve document IDs, and then send the document's vector fields to another model.

  • Combine vector and text fields for hybrid search, with or without semantic ranking. Integrated vectorization simplifies all of the scenarios supported by vector search.

When to use integrated vectorization

We recommend using the built-in vectorization support of Azure AI Studio. If this approach doesn't meet your needs, you can create indexers and skillsets that invoke integrated vectorization using the programmatic interfaces of Azure AI Search.

How to use integrated vectorization

For query-only vectorization:

  1. Add a vectorizer to an index. It should use the same embedding model that generated the vectors in the index.
  2. Assign the vectorizer to a vector profile, and then assign the vector profile to the vector field.
  3. Formulate a vector query that specifies the text string to vectorize.
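
The three steps above can be sketched as two request bodies: an index definition that declares a vectorizer and links it to a vector field through a profile, and a query that passes text instead of a vector. This is a minimal sketch assuming the 2023-10-01-Preview property names. The index, field, profile, and vectorizer names are hypothetical, and 1536 dimensions is an assumption that must match your embedding model.

```python
import json


def build_index(name, openai_uri, deployment_id, api_key, dimensions=1536):
    """Return an index body with a vectorizer wired to a vector field via a profile."""
    return {
        "name": name,
        "fields": [
            {"name": "id", "type": "Edm.String", "key": True},
            {"name": "chunk", "type": "Edm.String", "searchable": True},
            {
                "name": "vector",
                "type": "Collection(Edm.Single)",
                "searchable": True,
                "dimensions": dimensions,  # must match the embedding model output
                "vectorSearchProfile": "my-profile",
            },
        ],
        "vectorSearch": {
            "algorithms": [{"name": "my-hnsw", "kind": "hnsw"}],
            "vectorizers": [
                {
                    "name": "my-vectorizer",
                    "kind": "azureOpenAI",
                    "azureOpenAIParameters": {
                        "resourceUri": openai_uri,
                        "deploymentId": deployment_id,
                        "apiKey": api_key,
                    },
                }
            ],
            # The profile ties the algorithm and the vectorizer together.
            "profiles": [
                {"name": "my-profile", "algorithm": "my-hnsw", "vectorizer": "my-vectorizer"}
            ],
        },
    }


def build_text_vector_query(text, k=5):
    """Return a query body that asks the service to vectorize `text` at query time."""
    return {
        "select": "id, chunk",
        "vectorQueries": [{"kind": "text", "text": text, "fields": "vector", "k": k}],
    }


index = build_index(
    "my-index", "https://my-openai.openai.azure.com", "my-embedding-deployment", "<api-key>"
)
query = build_text_vector_query("how do I rotate keys?")
print(json.dumps(query, indent=2))
```

Because the query carries only text, the application doesn't need to call the embedding model itself at query time.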

For the more common scenario of data chunking and vectorization during indexing:

  1. Create a data source connection to a supported data source for indexer-based indexing.
  2. Create a skillset that calls the Text Split skill for chunking, and the AzureOpenAIEmbedding skill or a custom skill to vectorize the chunks.
  3. Create an index that specifies a vectorizer for query time, and assign it to vector fields.
  4. Create an indexer to drive everything, from data retrieval, to skillset execution, through indexing.
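
The order of these steps matters because the indexer refers to the other three objects by name. The sketch below builds, but doesn't send, the four PUT requests in that order using only the Python standard library. It assumes Azure Blob Storage as the data source, an admin API key for authentication, and the 2023-10-01-Preview API version. All object names, the endpoint, and the two-hour schedule are hypothetical, and the skillset and index bodies are reduced to placeholders.

```python
import json
import urllib.request

API_VERSION = "2023-10-01-Preview"


def put_request(endpoint, collection, body, admin_key):
    """Build (but do not send) a PUT request that creates or updates one object."""
    url = f"{endpoint}/{collection}/{body['name']}?api-version={API_VERSION}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": admin_key},
        method="PUT",
    )


def build_data_source(name, connection_string, container):
    """Data source body, assuming Azure Blob Storage."""
    return {
        "name": name,
        "type": "azureblob",
        "credentials": {"connectionString": connection_string},
        "container": {"name": container},
    }


def build_indexer(name, data_source, skillset, index, interval="PT2H"):
    """Indexer body that links the other three objects by name."""
    return {
        "name": name,
        "dataSourceName": data_source,
        "skillsetName": skillset,
        "targetIndexName": index,
        "schedule": {"interval": interval},  # assumed schedule: every two hours
    }


endpoint = "https://my-search.search.windows.net"  # hypothetical service
bodies = [
    ("datasources", build_data_source("my-datasource", "<connection-string>", "docs")),
    ("skillsets", {"name": "my-skillset", "skills": []}),  # placeholder body
    ("indexes", {"name": "my-index", "fields": []}),       # placeholder body
    ("indexers", build_indexer("my-indexer", "my-datasource", "my-skillset", "my-index")),
]
plan = [put_request(endpoint, coll, body, "<admin-key>") for coll, body in bodies]
for req in plan:
    print(req.get_method(), req.full_url)
```

To run the plan for real, each request could be passed to `urllib.request.urlopen` once the placeholder bodies are replaced with complete definitions.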

Optionally, create secondary indexes for advanced scenarios where chunked content is in one index, and non-chunked in another index. Chunked indexes (or secondary indexes) are useful for RAG apps.

Tip

Try the new Import and vectorize data wizard in the Azure portal to explore integrated vectorization before writing any code.

Or, configure a Jupyter notebook to run the same workflow, cell by cell, to see how each step works.

Limitations

Make sure you know the Azure OpenAI quotas and limits for embedding models. Azure AI Search has retry policies, but if the quota is exhausted, retries fail.

Azure OpenAI token-per-minute limits are per model, per subscription. Keep this in mind if you're using an embedding model for both query and indexing workloads. If possible, follow best practices: have an embedding model for each workload, and try to deploy them in different subscriptions.
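
To see why a shared quota matters, a back-of-the-envelope estimate can help. The sketch below computes a lower bound on indexing time when part of a token-per-minute quota is held back for queries. The function name and all of the numbers are hypothetical; real throughput also depends on retries, request limits, and service tier.

```python
import math


def minutes_to_embed(chunk_count, tokens_per_chunk, tpm_quota, query_share=0.0):
    """Lower-bound minutes needed to embed all chunks under a token-per-minute quota.

    query_share is the fraction of the quota assumed to be consumed by queries.
    """
    if not 0.0 <= query_share < 1.0:
        raise ValueError("query_share must be in [0, 1)")
    indexing_tpm = tpm_quota * (1.0 - query_share)
    total_tokens = chunk_count * tokens_per_chunk
    return math.ceil(total_tokens / indexing_tpm)


# Hypothetical workload: 10,000 chunks of roughly 500 tokens, 240,000 TPM quota.
dedicated = minutes_to_embed(10_000, 500, 240_000)
shared = minutes_to_embed(10_000, 500, 240_000, query_share=0.25)
print(dedicated, shared)
```

With these assumed numbers, sharing a quarter of the quota with queries stretches the estimate from 21 to 28 minutes, which is one reason to give each workload its own deployment.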

On Azure AI Search, remember there are service limits by tier and workloads.

Finally, the following features aren't currently supported:

Benefits of integrated vectorization

Here are some of the key benefits of integrated vectorization:

  • No separate data chunking and vectorization pipeline. Code is simpler to write and maintain.

  • Automate indexing end-to-end. When data changes in the source (such as in Azure Storage, Azure SQL, or Cosmos DB), the indexer can move those updates through the entire pipeline, from retrieval, to document cracking, through optional AI-enrichment, data chunking, vectorization, and indexing.

  • Project chunked content to secondary indexes. Secondary indexes are created as you would any search index (a schema with fields and other constructs), but they're populated in tandem with a primary index by an indexer. Content from each source document flows to fields in primary and secondary indexes during the same indexing run.

    Secondary indexes are intended for data chunking and Retrieval Augmented Generation (RAG) apps. Assuming a large PDF as a source document, the primary index might have basic information (title, date, author, description), and a secondary index has the chunks of content. Vectorization at the data chunk level makes it easier to find relevant information (each chunk is searchable) and return a relevant response, especially in a chat-style search app.
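
As an illustration of the last benefit, the sketch below builds an index projections definition of the kind that would sit inside a skillset. It assumes the 2023-10-01-Preview property names and a skillset whose chunks are at `/document/pages/*` with vectors at `/document/pages/*/vector`. The target index name, field names, and the blob-specific title source are hypothetical.

```python
import json


def build_index_projections(target_index, chunk_path="/document/pages/*"):
    """Return an indexProjections body that maps each chunk to a secondary-index document."""
    return {
        "selectors": [
            {
                "targetIndexName": target_index,
                "parentKeyFieldName": "parent_id",  # links a chunk back to its source document
                "sourceContext": chunk_path,        # one projected document per chunk
                "mappings": [
                    {"name": "chunk", "source": chunk_path},
                    {"name": "vector", "source": f"{chunk_path}/vector"},
                    # Assumed blob metadata field, repeated on every chunk.
                    {"name": "title", "source": "/document/metadata_storage_name"},
                ],
            }
        ],
        # Keep indexing parent documents into the primary index as well.
        "parameters": {"projectionMode": "includeIndexingParentDocuments"},
    }


projections = build_index_projections("my-chunk-index")
print(json.dumps(projections, indent=2))
```

In this arrangement, the indexer's target index acts as the primary index, and the index named in the selector receives one document per chunk during the same run.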

Chunked indexes

Chunking is the process of dividing content into smaller, manageable parts (chunks) that can be processed independently. Chunking is required if source documents are too large for the maximum input size of embedding or large language models. Even when it isn't required, you might find that it gives you a better index structure for RAG patterns and chat-style search.
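
To make the idea concrete, here is a simplified, standalone sketch of fixed-size chunking with overlap. It is not the Text Split skill's algorithm, which runs inside the service; it is a plain character-window approach, and the function name and default sizes are assumptions chosen for illustration.

```python
def chunk_text(text, max_len=2000, overlap=500):
    """Split text into windows of at most max_len characters.

    Adjacent chunks share `overlap` characters so that content cut at a
    boundary still appears whole in at least one chunk.
    """
    if max_len <= 0 or not 0 <= overlap < max_len:
        raise ValueError("require max_len > 0 and 0 <= overlap < max_len")
    step = max_len - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_len])
        if start + max_len >= len(text):
            break  # the last window already reached the end of the text
        start += step
    return chunks


print(chunk_text("abcdefghij", max_len=4, overlap=1))
# ['abcd', 'defg', 'ghij']
```

Each chunk can then be embedded and indexed on its own, which is what keeps individual inputs within the limits of the embedding model.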

The following diagram shows the components of chunked indexing.

Diagram of chunking and vectorization workflow.

Next steps