Features of Azure AI Search

Article
09/04/2024

Azure AI Search provides information retrieval and uses optional AI integration to extract more value from text and vector content.

The following tables summarize features by category. For more information about how Azure AI Search compares with other search technologies, see Compare search options.

There's feature parity in all Azure public, private, and sovereign clouds, but some features aren't supported in specific regions. For more information, see Choose a region.

Note

Looking for preview features? See the preview features list.

Indexing features

Category	Features
Data sources	Search indexes can accept text from any source, provided it's submitted as a JSON document. Indexers automate data import from supported data sources, extracting searchable content from primary data stores. Indexers handle JSON serialization for you, and most support some form of change and deletion detection. You can connect to a variety of data sources, including OneLake, Azure SQL Database, Azure Cosmos DB, or Azure Blob storage.
Hierarchical and nested data structures	Complex types and collections allow you to model virtually any type of JSON structure within a search index. One-to-many and many-to-many cardinality can be expressed natively through collections, complex types, and collections of complex types.
Linguistic analysis	Analyzers are components used for text processing during indexing and search operations. By default, you can use the general-purpose Standard Lucene analyzer, or override the default with a language analyzer, a custom analyzer that you configure, or another predefined analyzer that produces tokens in the format you require. Language analyzers from Lucene or Microsoft are used to intelligently handle language-specific linguistics including verb tenses, gender, irregular plural nouns (for example, 'mouse' vs. 'mice'), word decompounding, word-breaking (for languages with no spaces), and more. Custom lexical analyzers are used for complex query forms such as phonetic matching and regular expressions.
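
As a rough illustration of the complex types and analyzer settings described above, the following sketch builds an index definition as a Python dictionary in the shape accepted by the REST API. The index name, field names, and the choice of the en.microsoft analyzer are assumptions made for this example. The field_paths helper is hypothetical and only shows how nested fields are addressed with slash-separated paths.

```python
import json

# Hypothetical index schema; every index and field name here is illustrative.
index_definition = {
    "name": "hotels-sketch",
    "fields": [
        {"name": "hotelId", "type": "Edm.String", "key": True, "filterable": True},
        # Language analyzer that overrides the default Standard Lucene analyzer.
        {"name": "description", "type": "Edm.String", "searchable": True,
         "analyzer": "en.microsoft"},
        # Complex type: a single nested object.
        {"name": "address", "type": "Edm.ComplexType", "fields": [
            {"name": "city", "type": "Edm.String", "searchable": True, "filterable": True},
            {"name": "postalCode", "type": "Edm.String", "filterable": True},
        ]},
        # Collection of complex types: one-to-many (one hotel, many rooms).
        {"name": "rooms", "type": "Collection(Edm.ComplexType)", "fields": [
            {"name": "type", "type": "Edm.String", "searchable": True},
            {"name": "baseRate", "type": "Edm.Double", "filterable": True},
            {"name": "tags", "type": "Collection(Edm.String)", "searchable": True},
        ]},
    ],
}


def field_paths(fields, prefix=""):
    """Flatten nested fields into slash-separated paths such as 'rooms/baseRate'."""
    paths = []
    for field in fields:
        path = prefix + field["name"]
        if "fields" in field:
            paths.extend(field_paths(field["fields"], path + "/"))
        else:
            paths.append(path)
    return paths


print(json.dumps(index_definition, indent=2))
print(field_paths(index_definition["fields"]))
```

The slash-separated paths are the form used to reference subfields in filter expressions.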

Vector and hybrid search

Category	Features
Vector indexing	Within a search index, add vector fields to support vector search scenarios. Vector fields can coexist with nonvector fields in the same search document.
Vector queries	Formulate single and multiple vector queries.
Vector search algorithms	Use Hierarchical Navigable Small World (HNSW) or exhaustive K-Nearest Neighbors (KNN) to find similar vectors in a search index.
Vector filters	Apply filters before or after query execution for greater precision during information retrieval.
Hybrid information retrieval	Search for concepts and keywords in a single hybrid query request. Hybrid search consolidates vector and text search, with optional semantic ranking and relevance tuning for best results.
Integrated data chunking and vectorization	Native data chunking through Text Split skill. Native vectorization through vectorizers and embedding skills such as AzureOpenAIEmbeddingModel, Azure AI Vision multimodal, and the AML skill that you can use to connect to endpoints in the Azure AI Studio model catalog. Integrated vectorization provides an end-to-end indexing pipeline from source files to queries.
Integrated vector compression and quantization	Use built-in scalar and binary quantization to reduce vector index size in memory and on disk. You can also forgo storage of vectors you don't need, or assign narrow data types to vector fields for reduced storage requirements.
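
The vector features above could be expressed along the following lines. This is a minimal sketch that only builds JSON payloads; it doesn't call a service. It assumes the 2023-11-01 REST API shape, a hypothetical 1,536-dimension field named contentVector, and hypothetical algorithm and profile names. The embedding is a placeholder list; in practice it would come from an embedding model.

```python
import json

# Vector field plus the algorithm configuration it refers to (names are illustrative).
vector_field = {
    "name": "contentVector",
    "type": "Collection(Edm.Single)",
    "searchable": True,
    "dimensions": 1536,
    "vectorSearchProfile": "vector-profile-sketch",
}

vector_search = {
    "algorithms": [
        # Approximate nearest neighbor search with HNSW.
        {"name": "hnsw-sketch", "kind": "hnsw",
         "hnswParameters": {"m": 4, "efConstruction": 400,
                            "efSearch": 500, "metric": "cosine"}},
        # Exhaustive KNN scans every vector in the field.
        {"name": "eknn-sketch", "kind": "exhaustiveKnn",
         "exhaustiveKnnParameters": {"metric": "cosine"}},
    ],
    "profiles": [{"name": "vector-profile-sketch", "algorithm": "hnsw-sketch"}],
}


def build_hybrid_query(text, embedding, k=5, odata_filter=None, prefilter=True):
    """Build a hybrid request body: keyword search plus one vector query."""
    body = {
        "search": text,
        "vectorQueries": [
            {"kind": "vector", "vector": list(embedding),
             "fields": vector_field["name"], "k": k}
        ],
        "top": k,
    }
    if odata_filter:
        body["filter"] = odata_filter
        # Apply the filter before or after the vector query runs.
        body["vectorFilterMode"] = "preFilter" if prefilter else "postFilter"
    return body


placeholder_embedding = [0.0] * vector_field["dimensions"]
hybrid_query = build_hybrid_query(
    "quiet hotel near the beach", placeholder_embedding,
    k=3, odata_filter="rating ge 4",
)
print(json.dumps({**hybrid_query, "vectorQueries": "(omitted for brevity)"}, indent=2))
```

Omitting the search text leaves a pure vector query, and adding more entries to vectorQueries gives a multiple-vector query.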

Applied AI and knowledge mining

Category	Features
AI processing during indexing	AI enrichment refers to embedded image and natural language processing in an indexer pipeline that extracts text and information from content that can't otherwise be indexed for full text search. AI processing is achieved by adding and combining skills in a skillset, which is then attached to an indexer. AI can be either built-in skills from Microsoft, such as text translation or Optical Character Recognition (OCR), or custom skills that you provide.
Storing enriched content for analysis and consumption in non-search scenarios	Knowledge store is persistent storage of enriched content, intended for non-search scenarios like knowledge mining and data science processing. A knowledge store is defined in a skillset, but created in Azure Storage as objects or tabular rowsets.
Cached enrichments	Enrichment caching (preview) refers to cached enrichments that can be reused during skillset execution. Caching is particularly valuable in skillsets that include OCR and image analysis, which are expensive to process.
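
To make the relationship between skills, skillsets, and indexers more concrete, here is a hedged sketch of the two definitions as Python dictionaries. It assumes the built-in OCR and text translation skills and an indexer that generates normalized images. All object names, the imageText and translatedText target names, and the source paths are assumptions for illustration only.

```python
import json

# Hypothetical skillset combining two built-in skills.
skillset = {
    "name": "enrichment-sketch",
    "skills": [
        {
            # OCR runs once per image extracted from each document.
            "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
            "context": "/document/normalized_images/*",
            "inputs": [{"name": "image", "source": "/document/normalized_images/*"}],
            "outputs": [{"name": "text", "targetName": "text"}],
        },
        {
            # Translation runs once per document on its text content.
            "@odata.type": "#Microsoft.Skills.Text.TranslationSkill",
            "context": "/document",
            "defaultToLanguageCode": "en",
            "inputs": [{"name": "text", "source": "/document/content"}],
            "outputs": [{"name": "translatedText", "targetName": "translatedText"}],
        },
    ],
}

# The skillset is attached to an indexer by name.
indexer = {
    "name": "indexer-sketch",
    "dataSourceName": "datasource-sketch",
    "targetIndexName": "index-sketch",
    "skillsetName": skillset["name"],
    "parameters": {
        "configuration": {
            "dataToExtract": "contentAndMetadata",
            # Required so that images are available to the OCR skill.
            "imageAction": "generateNormalizedImages",
        }
    },
    # Map enriched output into index fields (target field names are illustrative).
    "outputFieldMappings": [
        {"sourceFieldName": "/document/normalized_images/*/text",
         "targetFieldName": "imageText"},
        {"sourceFieldName": "/document/translatedText",
         "targetFieldName": "translatedText"},
    ],
}

print(json.dumps(skillset, indent=2))
print(json.dumps(indexer, indent=2))
```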

Full text and other query forms

Category	Features
Free-form text search	Full-text search is a primary use case for most search-based apps. Queries can be formulated using a supported syntax. Simple query syntax provides logical operators, phrase search operators, suffix operators, and precedence operators. Full Lucene query syntax includes all operations in simple syntax, with extensions for fuzzy search, proximity search, term boosting, and regular expressions.
Relevance	Simple scoring is a key benefit of Azure AI Search. Scoring profiles are used to model relevance as a function of values in the documents themselves. For example, you might want newer products or discounted products to appear higher in the search results. You can also build scoring profiles using tags for personalized scoring based on customer search preferences you've tracked and stored separately. Semantic ranker is a premium feature that reranks results based on semantic relevance to the query. Depending on your content and scenario, it can significantly improve search relevance with minimal configuration or effort.
Geospatial search	Geospatial functions filter over and match on geographic coordinates. You can match on distance or by inclusion in a polygon shape.
Filters and facets	Faceted navigation is enabled through a single query parameter. Azure AI Search returns a faceted navigation structure you can use as the code behind a categories list, for self-directed filtering (for example, to filter catalog items by price range or brand). Filters can be used to incorporate faceted navigation into your application's UI, enhance query formulation, and filter based on user- or developer-specified criteria. Create filters using the OData syntax.
User experience	Autocomplete can be enabled for type-ahead queries in a search bar. Search suggestions also work from partial text inputs in a search bar, but the results are actual documents in your index rather than query terms. Synonyms associate equivalent terms that implicitly expand the scope of a query, without the user having to provide the alternate terms. Hit highlighting applies text formatting to a matching keyword in search results. You can choose which fields return highlighted snippets. Sorting is enabled for multiple fields in the index schema and then toggled at query time with a single search parameter. Paging and throttling of search results are controlled through query parameters.
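
Several of the query features in this table map to parameters of a single search request. The sketch below assembles such a request body. It assumes an index with hypothetical fields named location (a geography point), rating, brand, price, and description. The query string combines fuzzy, proximity, and term-boosting operators from the full Lucene syntax.

```python
import json


def build_search_request(terms, lon, lat, max_km, page=0, page_size=10):
    """Build a request body with a filter, facets, highlighting, sorting, and paging."""
    point = f"geography'POINT({lon} {lat})'"
    return {
        "search": terms,
        "queryType": "full",       # full Lucene query syntax
        "searchMode": "all",
        # OData filter: distance in kilometers combined with a numeric comparison.
        "filter": f"geo.distance(location, {point}) le {max_km} and rating ge 4",
        # One facet parameter per facetable field; interval buckets numeric values.
        "facets": ["brand", "price,interval:50"],
        "highlight": "description",
        "orderby": "rating desc",
        "count": True,
        "top": page_size,
        "skip": page * page_size,
    }


request_body = build_search_request(
    'beach~1 AND "ocean view"~3 pool^2',  # fuzzy, proximity, and boosted terms
    lon=-122.13, lat=47.67, max_km=10, page=2,
)
print(json.dumps(request_body, indent=2))
```

Paging here is plain arithmetic: page 2 with 10 results per page skips the first 20 matches.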

Security features

Category	Features
Data encryption	Microsoft-managed encryption-at-rest is built into the internal storage layer and is irrevocable. Customer-managed encryption keys that you create and manage in Azure Key Vault can be used for supplemental encryption of indexes and synonym maps. For services created after August 1, 2020, CMK encryption extends to data on temporary disks, for full double encryption of indexed content.
Endpoint protection	IP rules for inbound firewall support allow you to set up IP ranges from which the search service accepts requests. Create a private endpoint using Azure Private Link to force all requests through a virtual network.
Inbound access	Role-based access control assigns roles to users and groups in Microsoft Entra ID for controlled access to search content and operations. You can also use key-based authentication if you don't want to use role assignments.
Outbound security (indexers)	Data access through private endpoints allows an indexer to connect to Azure resources that are protected through Azure Private Link. Data access using a trusted identity means that connection strings to external data sources can omit user names and passwords. When an indexer connects to the data source, the resource allows the connection if the search service was previously registered as a trusted service.
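
As an illustration of a connection string that omits user names and passwords, the following sketch builds a data source definition for Azure Blob storage that identifies the storage account by resource ID only. It assumes the search service has a managed identity with read access to that account. The subscription ID, resource group, account, and container values are placeholders.

```python
import json


def build_blob_data_source(name, subscription_id, resource_group,
                           storage_account, container):
    """Build a data source definition that carries no account key or password."""
    resource_id = (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account}"
    )
    return {
        "name": name,
        "type": "azureblob",
        # Only the resource ID is supplied; access relies on the service's identity.
        "credentials": {"connectionString": f"ResourceId={resource_id};"},
        "container": {"name": container},
    }


data_source = build_blob_data_source(
    "datasource-sketch",
    subscription_id="00000000-0000-0000-0000-000000000000",  # placeholder
    resource_group="rg-sketch",
    storage_account="storagesketch",
    container="documents",
)
print(json.dumps(data_source, indent=2))
```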

Portal features

Category	Features
Tools for prototyping and inspection	Add index is an index designer in the portal that you can use to create a basic schema consisting of attributed fields and a few other settings. After saving the index, you can populate it using an SDK or the REST API to provide the data. Import data wizard creates indexes, indexers, skillsets, and data source definitions. If your data exists in Azure, this wizard can save you significant time and effort, especially for proof-of-concept investigation and exploration. Import and vectorize data creates a full indexing pipeline that includes data chunking and vectorization. The wizard creates all of the objects and configuration settings. Search explorer is used to test queries and refine scoring profiles. Create demo app is used to generate an HTML page that can be used to test the search experience. Debug Sessions is a visual editor that lets you debug a skillset interactively. It shows you dependencies, output, and transformations.
Monitoring and diagnostics	Enable monitoring features to go beyond the metrics-at-a-glance that are always visible in the portal. Metrics on queries per second, latency, and throttling are captured and reported in portal pages with no extra configuration required.

Programmability

Category	Features
REST	Service REST API is for data plane operations, including all operations related to indexing, queries, and AI enrichment. You can also use this API to retrieve system information and statistics. Management REST API is for service creation and provisioning through Azure Resource Manager. You can also use this API to manage keys and capacity.
Azure SDK for .NET	Azure.Search.Documents is for data plane operations, including all operations related to indexing, queries, and AI enrichment. You can also use this client library to retrieve system information and statistics. Microsoft.Azure.Management.Search is for service creation and provisioning through Azure Resource Manager. You can also use this API to manage keys and capacity.
Azure SDK for Java	com.azure.search.documents is for data plane operations, including all operations related to indexing, queries, and AI enrichment. You can also use this client library to retrieve system information and statistics. com.microsoft.azure.management.search is for service creation and provisioning through Azure Resource Manager. You can also use this API to manage keys and capacity.
Azure SDK for Python	azure-search-documents is for data plane operations, including all operations related to indexing, queries, and AI enrichment. You can also use this client library to retrieve system information and statistics. azure-mgmt-search is for service creation and provisioning through Azure Resource Manager. You can also use this API to manage keys and capacity.
Azure SDK for JavaScript/TypeScript	@azure/search-documents is for data plane operations, including all operations related to indexing, queries, and AI enrichment. You can also use this client library to retrieve system information and statistics. @azure/arm-search is for service creation and provisioning through Azure Resource Manager. You can also use this API to manage keys and capacity.
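
For orientation, a data plane call through the Service REST API might be prepared as in the sketch below, which uses only the Python standard library. The service name, index name, and key are placeholders, and the api-version value of 2023-11-01 is an assumption; check which versions your service supports. The request is built but deliberately not sent.

```python
import json
import urllib.request
from urllib.parse import quote

API_VERSION = "2023-11-01"  # assumed version for this sketch


def build_search_call(service, index, body, api_key):
    """Prepare (but do not send) a POST request to the Search Documents endpoint."""
    url = (
        f"https://{service}.search.windows.net"
        f"/indexes/{quote(index)}/docs/search?api-version={API_VERSION}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


search_call = build_search_call(
    service="my-service-sketch",       # placeholder service name
    index="hotels-sketch",             # placeholder index name
    body={"search": "beach", "top": 5},
    api_key="<query-key>",             # placeholder, never hard-code real keys
)
print(search_call.get_method(), search_call.full_url)

# To run the query against a real service, the prepared request could be sent with:
# with urllib.request.urlopen(search_call) as response:
#     results = json.load(response)
```

The client libraries listed in this table wrap the same endpoints, so the SDK for your language is usually the more convenient choice in application code.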
