Azure AI Search client library for Java - version 11.7.3
This is the Java client library for Azure AI Search (formerly known as "Azure Cognitive Search"). Azure AI Search service is an AI-powered information retrieval platform that helps developers build rich search experiences and generative AI apps that combine large language models with enterprise data.
Azure AI Search is well suited for the following application scenarios:
- Consolidate varied content types into a single searchable index. To populate an index, you can push JSON documents that contain your content, or if your data is already in Azure, create an indexer to pull in data automatically.
- Attach skillsets to an indexer to create searchable content from images and unstructured documents. A skillset leverages APIs from Azure AI Services for built-in OCR, entity recognition, key phrase extraction, language detection, text translation, and sentiment analysis. You can also add custom skills to integrate external processing of your content during data ingestion.
- In a search client application, implement query logic and user experiences similar to commercial web search engines and chat-style apps.
Use the Azure AI Search client library to:
- Submit queries using vector, keyword, and hybrid query forms.
- Implement filtered queries for metadata, geospatial search, faceted navigation, or to narrow results based on filter criteria.
- Create and manage search indexes.
- Upload and update documents in the search index.
- Create and manage indexers that pull data from Azure into an index.
- Create and manage skillsets that add AI enrichment to data ingestion.
- Create and manage analyzers for advanced text analysis or multi-lingual content.
- Optimize results through semantic ranking and scoring profiles to factor in business logic or freshness.
Source code | Package (Maven) | API reference documentation | Product documentation | Samples
Getting started
Include the package
Include the BOM file
Include the azure-sdk-bom in your project to take a dependency on the General Availability (GA) version of the library. In the following snippet, replace the {bom_version_to_target} placeholder with the version number. To learn more about the BOM, see the AZURE SDK BOM README.
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>com.azure</groupId>
            <artifactId>azure-sdk-bom</artifactId>
            <version>{bom_version_to_target}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
Then include the direct dependency in the dependencies section without the version tag.
<dependencies>
    <dependency>
        <groupId>com.azure</groupId>
        <artifactId>azure-search-documents</artifactId>
    </dependency>
</dependencies>
Include direct dependency
If you want to take a dependency on a particular version of the library that is not present in the BOM, add the direct dependency to your project as follows.
<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-search-documents</artifactId>
    <version>11.7.3</version>
</dependency>
Prerequisites
- Java Development Kit (JDK) with version 8 or above
- Here are details about Java 8 client compatibility with Azure Certificate Authority.
- Azure subscription
- Azure AI Search service
- To create a new search service, you can use the Azure portal, Azure PowerShell, or the Azure CLI. Here's an example using the Azure CLI to create a free instance for getting started:
az search service create --name <mysearch> --resource-group <mysearch-rg> --sku free --location westus
See choosing a pricing tier for more information about available options.
Authenticate the client
To interact with the Search service, you'll need to create an instance of the appropriate client class: SearchClient for searching indexed documents, SearchIndexClient for managing indexes, or SearchIndexerClient for crawling data sources and loading search documents into an index. To instantiate a client object, you'll need an endpoint and either Azure roles or an API key. Refer to the documentation for more information on the authentication approaches that the Search service supports.
Get an API Key
An API key can be an easier approach to start with because it doesn't require pre-existing role assignments.
You can get the endpoint and an API key from the search service in the Azure portal. Please refer to the documentation for instructions on how to get an API key.
Alternatively, you can use the following Azure CLI command to retrieve the API key from the search service:
az search admin-key show --service-name <mysearch> --resource-group <mysearch-rg>
Note:
- The example Azure CLI snippet above retrieves an admin key. This allows for easier access when exploring APIs, but it should be managed carefully.
- There are two types of keys used to access your search service: admin (read-write) and query (read-only) keys. Restricting access and operations in client apps is essential to safeguarding the search assets on your service. Always use a query key rather than an admin key for any query originating from a client app.
The SDK provides three clients:
- SearchIndexClient for CRUD operations on indexes and synonym maps.
- SearchIndexerClient for CRUD operations on indexers, data sources, and skillsets.
- SearchClient for all document operations.
Create a SearchIndexClient
To create a SearchIndexClient/SearchIndexAsyncClient, you will need the values of the Azure AI Search service URL endpoint and admin key.
SearchIndexClient searchIndexClient = new SearchIndexClientBuilder()
    .endpoint(ENDPOINT)
    .credential(new AzureKeyCredential(API_KEY))
    .buildClient();
or
SearchIndexAsyncClient searchIndexAsyncClient = new SearchIndexClientBuilder()
    .endpoint(ENDPOINT)
    .credential(new AzureKeyCredential(API_KEY))
    .buildAsyncClient();
Create a SearchIndexerClient
To create a SearchIndexerClient/SearchIndexerAsyncClient, you will need the values of the Azure AI Search service URL endpoint and admin key.
SearchIndexerClient searchIndexerClient = new SearchIndexerClientBuilder()
    .endpoint(ENDPOINT)
    .credential(new AzureKeyCredential(API_KEY))
    .buildClient();
or
SearchIndexerAsyncClient searchIndexerAsyncClient = new SearchIndexerClientBuilder()
    .endpoint(ENDPOINT)
    .credential(new AzureKeyCredential(API_KEY))
    .buildAsyncClient();
Create a SearchClient
Once you have the values of the Azure AI Search service URL endpoint and admin key, you can create the SearchClient/SearchAsyncClient with an existing index name:
SearchClient searchClient = new SearchClientBuilder()
    .endpoint(ENDPOINT)
    .credential(new AzureKeyCredential(ADMIN_KEY))
    .indexName(INDEX_NAME)
    .buildClient();
or
SearchAsyncClient searchAsyncClient = new SearchClientBuilder()
    .endpoint(ENDPOINT)
    .credential(new AzureKeyCredential(ADMIN_KEY))
    .indexName(INDEX_NAME)
    .buildAsyncClient();
Create a client using Microsoft Entra ID authentication
You can also create a SearchClient, SearchIndexClient, or SearchIndexerClient using Microsoft Entra ID authentication. Your user or service principal must be assigned the "Search Index Data Reader" role.
Using DefaultAzureCredential, you can authenticate a service using Managed Identity or a service principal, authenticate as a developer working on an application, and more, all without changing code. Please refer to the documentation for instructions on how to connect to Azure AI Search using Azure role-based access control (Azure RBAC).
Before you can use DefaultAzureCredential, or any credential type from azure-identity, you'll first need to add the azure-identity package as a dependency.
To use DefaultAzureCredential with a client ID and secret, you'll need to set the AZURE_TENANT_ID, AZURE_CLIENT_ID, and AZURE_CLIENT_SECRET environment variables; alternatively, you can pass those values to the ClientSecretCredential, also in azure-identity.
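When the client ID and secret route is used, a quick preflight check can catch a missing variable before the credential is built. The sketch below is a hypothetical helper (CredentialEnvCheck is not part of azure-identity) that uses only the JDK and reports which of the three variables named above are unset. Other DefaultAzureCredential mechanisms, such as Managed Identity, do not need these variables, so treat a non-empty result as a hint rather than an error.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: reports which service principal variables are unset. */
final class CredentialEnvCheck {
    // Variable names as listed in the text above.
    static final List<String> REQUIRED = Arrays.asList(
        "AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET");

    private CredentialEnvCheck() { }

    /** Returns the required names that are absent or blank in the given environment. */
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> missing = missing(System.getenv());
        if (missing.isEmpty()) {
            System.out.println("All service principal variables are set.");
        } else {
            System.out.println("Missing environment variables: " + missing);
        }
    }
}
```

Taking the environment as a Map keeps the check easy to exercise without touching the real process environment.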
Make sure you import DefaultAzureCredential and its builder at the top of your source file:
import com.azure.identity.DefaultAzureCredential;
import com.azure.identity.DefaultAzureCredentialBuilder;
Then you can create an instance of DefaultAzureCredential and pass it to a new instance of your client:
String indexName = "nycjobs";
// Get the service endpoint from the environment
String endpoint = Configuration.getGlobalConfiguration().get("SEARCH_ENDPOINT");
DefaultAzureCredential credential = new DefaultAzureCredentialBuilder().build();
// Create a client
SearchClient client = new SearchClientBuilder()
    .endpoint(endpoint)
    .indexName(indexName)
    .credential(credential)
    .buildClient();
Send your first search query
To get running with Azure AI Search, first create an index following this guide. With an index created, you can use the following samples to begin using the SDK.
Key concepts
An Azure AI Search service contains one or more indexes that provide persistent storage of searchable data in the form of JSON documents. (If you're new to search, you can make a very rough analogy between indexes and database tables.) The azure-search-documents client library exposes operations on these resources through three main client types.
SearchClient helps with:
- Searching your indexed documents using vector queries, keyword queries, and hybrid queries
- Vector query filters and text query filters
- Semantic ranking and scoring profiles for boosting relevance
- Autocompleting partially typed search terms based on documents in the index
- Suggesting the most likely matching text in documents as a user types
- Adding, updating, or deleting documents from an index
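SearchIndexClient allows you to:
- Create, update, and delete indexes and synonym maps
SearchIndexerClient allows you to:
- Create, update, and delete indexers, data sources, and skillsets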
Azure AI Search provides two powerful features:
Semantic ranking
Semantic ranking enhances the quality of search results for text-based queries. By enabling semantic ranking on your search service, you can improve the relevance of search results in two ways:
- It applies secondary ranking to the initial result set, promoting the most semantically relevant results to the top.
- It extracts and returns captions and answers in the response, which can be displayed on a search page to enhance the user's search experience.
To learn more about semantic ranking, you can refer to the documentation.
Vector Search
Vector search is an information retrieval technique that uses numeric representations of searchable documents and query strings. By searching for numeric representations of content that are most similar to the numeric query, vector search can find relevant matches, even if the exact terms of the query are not present in the index. Moreover, vector search can be applied to various types of content, including images and videos and translated text, not just same-language text.
To learn how to index vector fields and perform vector search, refer to the sample.
For more comprehensive information about vector search, including its concepts and usage, refer to the documentation.
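To make the idea of finding the "most similar numeric representation" concrete, here is a minimal brute-force sketch that ranks a few hand-written vectors by cosine similarity to a query vector. It uses only the JDK. The class name, document ids, and three-dimensional vectors are made up for illustration, and the code does not reflect how the service stores or searches vectors.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: brute-force nearest-neighbor ranking by cosine similarity. */
final class VectorSearchSketch {
    private VectorSearchSketch() { }

    /** Cosine similarity of two equal-length, non-zero vectors. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0;
        double normA = 0;
        double normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the ids of the k documents most similar to the query, best first. */
    static List<String> nearest(Map<String, double[]> docs, double[] query, int k) {
        List<String> ids = new ArrayList<>(docs.keySet());
        // Negate the score so that the highest similarity sorts first.
        ids.sort(Comparator.comparingDouble((String id) -> -cosine(docs.get(id), query)));
        return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
    }

    public static void main(String[] args) {
        Map<String, double[]> docs = new LinkedHashMap<>();
        docs.put("beach-resort", new double[] {0.9, 0.1, 0.0});
        docs.put("city-hostel", new double[] {0.1, 0.9, 0.2});
        docs.put("seaside-villa", new double[] {0.8, 0.2, 0.1});
        double[] query = {1.0, 0.0, 0.0};
        System.out.println(nearest(docs, query, 2)); // prints [beach-resort, seaside-villa]
    }
}
```

Note that no query term has to appear in a document for it to rank highly; only the closeness of the vectors matters.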
Examples
The following examples all use a simple Hotel data set that you can import into your own index from the Azure portal. These are just a few of the basics - please check out our Samples for much more.
- Querying
- Creating an index
- Adding documents to your index
- Retrieving a specific document from your index
- Async APIs
- Create a client that can authenticate in a national cloud
Querying
There are two ways to interact with the data returned from a search query.
Let's explore them with a search for a "luxury" hotel.
Use SearchDocument like a dictionary for search results
SearchDocument is the default type returned from queries when you don't provide your own. Here we perform the search, enumerate over the results, and extract data using SearchDocument's map-style get method.
for (SearchResult searchResult : SEARCH_CLIENT.search("luxury")) {
    SearchDocument doc = searchResult.getDocument(SearchDocument.class);
    String id = (String) doc.get("hotelId");
    String name = (String) doc.get("hotelName");
    System.out.printf("This is hotelId %s, and this is hotel name %s.%n", id, name);
}
Use Java model class for search results
Define a Hotel class.
public static class Hotel {
    @SimpleField(isKey = true, isFilterable = true, isSortable = true)
    private String id;

    @SearchableField(isFilterable = true, isSortable = true)
    private String name;

    public String getId() {
        return id;
    }

    public Hotel setId(String id) {
        this.id = id;
        return this;
    }

    public String getName() {
        return name;
    }

    public Hotel setName(String name) {
        this.name = name;
        return this;
    }
}
Use it in place of SearchDocument when querying.
for (SearchResult searchResult : SEARCH_CLIENT.search("luxury")) {
    Hotel doc = searchResult.getDocument(Hotel.class);
    String id = doc.getId();
    String name = doc.getName();
    System.out.printf("This is hotelId %s, and this is hotel name %s.%n", id, name);
}
It is recommended, when you know the schema of the search index, to create a Java model class.
Search Options
SearchOptions provides powerful control over the behavior of our queries.
Let's search for the top 5 luxury hotels with a good rating.
SearchOptions options = new SearchOptions()
    .setFilter("rating ge 4")
    .setOrderBy("rating desc")
    .setTop(5);
SearchPagedIterable searchResultsIterable = SEARCH_CLIENT.search("luxury", options, Context.NONE);
// ...
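Filter expressions such as "rating ge 4" use OData syntax, where a single quote inside a string literal is escaped by doubling it. The following is a small sketch of helpers for composing such filter strings from user-supplied values. FilterSketch and its methods are hypothetical and not part of the SDK; the field names are the ones used in the examples above.

```java
/** Hypothetical helpers for composing OData filter strings. */
final class FilterSketch {
    private FilterSketch() { }

    /** Quotes a string as an OData literal, doubling embedded single quotes. */
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Builds "<field> eq '<value>'" with the value safely quoted. */
    static String equalsString(String field, String value) {
        return field + " eq " + quote(value);
    }

    /** Joins two filter clauses with a logical AND, parenthesizing each. */
    static String and(String left, String right) {
        return "(" + left + ") and (" + right + ")";
    }

    public static void main(String[] args) {
        String filter = and("rating ge 4", equalsString("hotelName", "Traveler's Rest"));
        System.out.println(filter);
        // prints: (rating ge 4) and (hotelName eq 'Traveler''s Rest')
    }
}
```

The resulting string can be passed to SearchOptions.setFilter in the same way as the literal filter shown earlier.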
Creating an index
You can use the SearchIndexClient to create a search index. Indexes can also define suggesters, lexical analyzers, and more.
There are multiple ways of preparing search fields for a search index. For basic needs, we provide a static helper method buildSearchFields in SearchIndexClient and SearchIndexAsyncClient, which can convert a Java POJO class into a List<SearchField>. There are three annotations, SimpleField, SearchableField, and FieldBuilderIgnore, to configure the fields of the model class.
List<SearchField> searchFields = SearchIndexClient.buildSearchFields(Hotel.class, null);
SEARCH_INDEX_CLIENT.createIndex(new SearchIndex("index", searchFields));
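For readers curious about how an annotated POJO can be turned into field definitions, the following sketch shows the general reflection technique using only the JDK. It is not the library's implementation: the Key, FullText, and Skip annotations, the Room class, and the describe method are all hypothetical stand-ins invented for this illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: derives simple field descriptions from annotations. */
final class FieldBuilderSketch {
    private FieldBuilderSketch() { }

    /** Hypothetical marker annotations; RUNTIME retention makes them visible to reflection. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Key { }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface FullText { }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Skip { }

    /** Hypothetical model class. */
    static class Room {
        @Key private String id;
        @FullText private String description;
        @Skip private String internalNote;
    }

    /** Maps each non-skipped field name to a short description of its attributes. */
    static Map<String, String> describe(Class<?> model) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Field field : model.getDeclaredFields()) {
            if (field.isSynthetic() || field.isAnnotationPresent(Skip.class)) {
                continue; // ignore compiler-generated and explicitly excluded fields
            }
            StringBuilder attrs = new StringBuilder(field.getType().getSimpleName());
            if (field.isAnnotationPresent(Key.class)) {
                attrs.append(" key");
            }
            if (field.isAnnotationPresent(FullText.class)) {
                attrs.append(" searchable");
            }
            result.put(field.getName(), attrs.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        describe(Room.class).forEach((name, attrs) -> System.out.println(name + ": " + attrs));
    }
}
```

The design point is that the model class stays the single source of truth for the schema: adding a field to the class is enough to add it to the generated definitions.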
For advanced scenarios, we can build search fields using SearchField directly.
List<SearchField> searchFieldList = new ArrayList<>();
searchFieldList.add(new SearchField("hotelId", SearchFieldDataType.STRING)
    .setKey(true)
    .setFilterable(true)
    .setSortable(true));
searchFieldList.add(new SearchField("hotelName", SearchFieldDataType.STRING)
    .setSearchable(true)
    .setFilterable(true)
    .setSortable(true));
searchFieldList.add(new SearchField("description", SearchFieldDataType.STRING)
    .setSearchable(true)
    .setAnalyzerName(LexicalAnalyzerName.EU_LUCENE));
searchFieldList.add(new SearchField("tags", SearchFieldDataType.collection(SearchFieldDataType.STRING))
    .setSearchable(true)
    .setFilterable(true)
    .setFacetable(true));
searchFieldList.add(new SearchField("address", SearchFieldDataType.COMPLEX)
    .setFields(new SearchField("streetAddress", SearchFieldDataType.STRING).setSearchable(true),
        new SearchField("city", SearchFieldDataType.STRING)
            .setSearchable(true)
            .setFilterable(true)
            .setFacetable(true)
            .setSortable(true),
        new SearchField("stateProvince", SearchFieldDataType.STRING)
            .setSearchable(true)
            .setFilterable(true)
            .setFacetable(true)
            .setSortable(true),
        new SearchField("country", SearchFieldDataType.STRING)
            .setSearchable(true)
            .setFilterable(true)
            .setFacetable(true)
            .setSortable(true),
        new SearchField("postalCode", SearchFieldDataType.STRING)
            .setSearchable(true)
            .setFilterable(true)
            .setFacetable(true)
            .setSortable(true)
    ));
// Prepare suggester.
SearchSuggester suggester = new SearchSuggester("sg", Collections.singletonList("hotelName"));
// Prepare SearchIndex with index name and search fields.
SearchIndex index = new SearchIndex("hotels").setFields(searchFieldList).setSuggesters(suggester);
// Create an index
SEARCH_INDEX_CLIENT.createIndex(index);
Retrieving a specific document from your index
In addition to querying for documents using keywords and optional filters, you can retrieve a specific document from your index if you already know the key. You could get the key from a query, for example, and want to show more information about it or navigate your customer to that document.
Hotel hotel = SEARCH_CLIENT.getDocument("1", Hotel.class);
System.out.printf("This is hotelId %s, and this is hotel name %s.%n", hotel.getId(), hotel.getName());
Adding documents to your index
You can Upload, Merge, MergeOrUpload, and Delete multiple documents from an index in a single batched request. There are a few special rules for merging to be aware of.
IndexDocumentsBatch<Hotel> batch = new IndexDocumentsBatch<>();
batch.addUploadActions(Collections.singletonList(new Hotel().setId("783").setName("Upload Inn")));
batch.addMergeActions(Collections.singletonList(new Hotel().setId("12").setName("Renovated Ranch")));
SEARCH_CLIENT.indexDocuments(batch);
The request will throw IndexBatchException by default if any of the individual actions fail, and you can use findFailedActionsToRetry to retry on failed documents. There's also a throwOnAnyError option, and you can set it to false to get a successful response with an IndexDocumentsResult for inspection.
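The difference between the four action types is easier to see in a toy model. The sketch below simulates them against an in-memory map using only the JDK. It is an illustration of the commonly described semantics (upload replaces the whole document, merge changes only the supplied fields and fails when the key is unknown), not the service's implementation; IndexActionSketch and its methods are hypothetical, and the finer merge rules referenced above are not modeled.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a toy in-memory model of the four indexing actions. */
final class IndexActionSketch {
    /** Hypothetical in-memory "index": document key -> field map. */
    private final Map<String, Map<String, Object>> docs = new HashMap<>();

    /** Upload: insert the document, or replace it entirely if the key exists. */
    void upload(String key, Map<String, Object> fields) {
        docs.put(key, new HashMap<>(fields));
    }

    /** Merge: overwrite only the supplied fields; returns false if the key is unknown. */
    boolean merge(String key, Map<String, Object> fields) {
        Map<String, Object> existing = docs.get(key);
        if (existing == null) {
            return false;
        }
        existing.putAll(fields);
        return true;
    }

    /** MergeOrUpload: merge when the key exists, otherwise upload. */
    void mergeOrUpload(String key, Map<String, Object> fields) {
        if (!merge(key, fields)) {
            upload(key, fields);
        }
    }

    /** Delete: remove the document with this key, if any. */
    void delete(String key) {
        docs.remove(key);
    }

    /** Returns the stored fields for a key, or null if there is no such document. */
    Map<String, Object> get(String key) {
        return docs.get(key);
    }

    public static void main(String[] args) {
        IndexActionSketch index = new IndexActionSketch();
        Map<String, Object> hotel = new HashMap<>();
        hotel.put("hotelName", "Old Ranch");
        hotel.put("rating", 3);
        index.upload("12", hotel);

        Map<String, Object> patch = new HashMap<>();
        patch.put("hotelName", "Renovated Ranch");
        index.merge("12", patch); // rating is kept, name changes
        System.out.println(index.get("12").get("hotelName") + " / " + index.get("12").get("rating"));
        System.out.println(index.merge("999", patch)); // false: nothing to merge into
    }
}
```

In this model, a failed merge corresponds to the kind of per-action failure that the batch result or exception reports for an individual document.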
Async APIs
The examples so far have been using synchronous APIs, but we provide full support for async APIs as well. You'll need to use SearchAsyncClient.
SEARCH_ASYNC_CLIENT.search("luxury")
    .subscribe(result -> {
        Hotel hotel = result.getDocument(Hotel.class);
        System.out.printf("This is hotelId %s, and this is hotel name %s.%n", hotel.getId(), hotel.getName());
    });
Authenticate in a National Cloud
To authenticate in a National Cloud, you will need to make the following additions to your client configuration:
- Set the AuthorityHost in the credential options or via the AZURE_AUTHORITY_HOST environment variable
- Set the audience in SearchClientBuilder, SearchIndexClientBuilder, or SearchIndexerClientBuilder
// Create a SearchClient that will authenticate through AAD in the China national cloud.
SearchClient searchClient = new SearchClientBuilder()
    .endpoint(ENDPOINT)
    .indexName(INDEX_NAME)
    .credential(new DefaultAzureCredentialBuilder()
        .authorityHost(AzureAuthorityHosts.AZURE_CHINA)
        .build())
    .audience(SearchAudience.AZURE_CHINA)
    .buildClient();
Troubleshooting
See our troubleshooting guide for details on how to diagnose various failure scenarios.
General
When you interact with Azure AI Search using this Java client library, errors returned by the service correspond
to the same HTTP status codes returned for REST API requests. For example, the service will return a 404
error if you try to retrieve a document that doesn't exist in your index.
Handling Search Error Response
Any Search API operation that fails will throw an HttpResponseException with helpful status codes. Many of these errors are recoverable.
try {
    Iterable<SearchResult> results = SEARCH_CLIENT.search("hotel");
} catch (HttpResponseException ex) {
    // The exception contains the HTTP status code and the detailed message
    // returned from the search service
    HttpResponse response = ex.getResponse();
    System.out.println("Status Code: " + response.getStatusCode());
    System.out.println("Message: " + ex.getMessage());
}
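One way to act on "recoverable" errors in application code is to classify the status code and wait before trying again. The sketch below shows that idea with only the JDK. The set of status codes treated as transient and the backoff parameters are assumptions chosen for illustration, not a policy documented by the service, and RetrySketch is a hypothetical class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: status classification plus capped exponential backoff. */
final class RetrySketch {
    // Assumption: status codes treated as transient in this sketch.
    private static final Set<Integer> TRANSIENT =
        new HashSet<>(Arrays.asList(408, 429, 502, 503, 504));

    private RetrySketch() { }

    /** True if a failed request with this status is worth retrying. */
    static boolean isRetryable(int statusCode) {
        return TRANSIENT.contains(statusCode);
    }

    /** Exponential backoff: baseMillis * 2^attempt, capped at maxMillis (attempt starts at 0). */
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        if (attempt < 0) {
            throw new IllegalArgumentException("attempt must be >= 0");
        }
        // Clamp the exponent to keep the multiplier bounded.
        long factor = 1L << Math.min(attempt, 20);
        return Math.min(maxMillis, baseMillis * factor);
    }

    public static void main(String[] args) {
        int[] statuses = {404, 429, 503};
        for (int status : statuses) {
            System.out.println(status + " retryable: " + isRetryable(status));
        }
        for (int attempt = 0; attempt < 4; attempt++) {
            System.out.println("attempt " + attempt + " -> wait "
                + backoffMillis(attempt, 500, 8000) + " ms");
        }
    }
}
```

In a catch block like the one above, the value of response.getStatusCode() could be passed to isRetryable to decide whether to repeat the call after the computed delay.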
You can also easily enable console logging if you want to dig deeper into the requests you're making against the service.
Enabling Logging
Azure SDKs for Java provide a consistent logging story to help troubleshoot application errors and expedite their resolution. The logs produced will capture the flow of an application before reaching the terminal state to help locate the root issue. View the logging wiki for guidance about enabling logging.
Default HTTP Client
By default, a Netty based HTTP client will be used. The HTTP clients wiki provides more information on configuring or changing the HTTP client.
Next steps
- Samples are explained in detail here.
- Read more about the Azure AI Search service
Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.