Data Factory to Azure Search Index Mapping Issue - StringCollection Fields Not Visible

RobPi-8032 0 Reputation points
2024-02-12T00:57:04.87+00:00

I am encountering an issue while setting up a pipeline in Azure Data Factory to transfer data from a managed PostgreSQL database to Azure AI Search service.
In Azure Data Factory, during the mapping configuration step, fields that are defined as StringCollection in the Azure Search Index schema are not being displayed for mapping (see image).

Steps to Reproduce:

  1. Created an index in Azure AI Search service with fields defined as per the schema requirements.
  2. Configured the source (managed PostgreSQL on Azure) and sink (Azure AI Search service) in a data pipeline within Azure Data Factory.
  3. Attempted to map the columns from PostgreSQL to the corresponding fields in the AI Search index.

Encountered Error:

  • Manually entering the field name (either by typing the column name in the textbox or by editing the pipeline JSON) results in a pipeline error during the sink operation.
  • Error message received: "Failure happened on 'Sink' side. ErrorCode=UserErrorAzureSearchOperation, 'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException, Message=Error happened when writing data to Azure Search Index 'index-name-here'. CloudException RequestId: 'xxx'. Status Code: 'BadRequest'. Details: An unexpected 'PrimitiveValue' node was found when reading from the JSON reader. A 'StartArray' node was expected."

Troubleshooting Attempts:

  • Provided the source data in both string and array formats.
  • Ensured the Type on the Search Index side is set to 'StringCollection' - Collection(Edm.String). Noticed that if the data type on the Search Index side is set to 'Edm.String' without collection, the data can be mapped without issues.
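For illustration, the error seems to come down to how the sink serializes the field: a Collection(Edm.String) field must arrive at the service as a JSON array node ('StartArray'), while a plain string serializes as a primitive value. A minimal Python sketch of the two payload shapes (the field name "tags" and the values are hypothetical):

```python
import json

# Valid: a Collection(Edm.String) field must serialize as a JSON array.
valid_doc = {"id": "1", "tags": ["tag1", "tag2", "tag3"]}

# Invalid: the same field as a plain string serializes as a
# 'PrimitiveValue' node where 'StartArray' is expected -> BadRequest.
invalid_doc = {"id": "1", "tags": "tag1,tag2,tag3"}

print(json.dumps(valid_doc))
print(json.dumps(invalid_doc))
```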

Request: I seek guidance on how to correctly map fields from PostgreSQL to fields defined as StringCollection in the Azure Search Index. Any step-by-step instructions or insights into resolving the 'BadRequest' error would be greatly appreciated.

Azure AI Search
Azure Data Factory
Azure Database for PostgreSQL

1 answer

  1. brtrach-MSFT 15,261 Reputation points Microsoft Employee
    2024-02-15T03:03:45.77+00:00

    @RobPi-8032 Based on the error message you received, it seems like there might be an issue with the format of the data being transferred from PostgreSQL to Azure AI Search service.

    To correctly map fields from PostgreSQL to fields defined as StringCollection in the Azure Search Index, you need to ensure that the data being transferred is in the correct format. Specifically, the data needs to be in a JSON array format. Here are some steps you can follow to ensure that the data is in the correct format:

    1. Ensure that the data type on the Search Index side is set to 'StringCollection' - Collection(Edm.String).
    2. Ensure that the data being transferred from PostgreSQL is in a JSON array format. For example, if you have a field called "tags" that is defined as a StringCollection in the Azure Search Index schema, the data being transferred should look something like this: "tags": ["tag1", "tag2", "tag3"]
    3. When mapping the columns from PostgreSQL to the corresponding fields in the AI Search index, enter the column name manually in the textbox or by editing the pipeline JSON.
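    As a minimal sketch of step 2 (assuming the PostgreSQL value arrives as a delimited string; the column name "tags" and the comma delimiter are hypothetical), the value can be reshaped into the list that a Collection(Edm.String) field expects before it is pushed to the index:

    ```python
    def to_string_collection(value, delimiter=","):
        """Convert a delimited string from PostgreSQL into the list
        (JSON array) shape a Collection(Edm.String) field expects."""
        if value is None:
            return []
        value = value.strip()
        # Also tolerate a PostgreSQL array literal such as '{tag1,tag2}'.
        if value.startswith("{") and value.endswith("}"):
            value = value[1:-1]
        return [part.strip() for part in value.split(delimiter) if part.strip()]

    # Example: a row value from PostgreSQL becomes an index-ready document.
    doc = {"id": "1", "tags": to_string_collection("tag1, tag2, tag3")}
    print(doc["tags"])  # ['tag1', 'tag2', 'tag3']
    ```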

    If you have followed these steps and are still encountering the 'BadRequest' error, it might be helpful to check the Azure Search service logs for more information on the error.
