Azure OpenAI Service: Japanese characters are converted to Unicode when indexing Japanese files in Studio

大宮 僚馬 70 Reputation points
2023-12-08T01:50:52.2033333+00:00

Previously, when I indexed Japanese files, the indexed text stayed in Japanese. When I tried again recently, the characters were converted to Unicode.

On investigation, I found that the API used in the Azure Cognitive Search skillset has changed, and I believe this may be the cause.

Before the change

https://XXX.openai.azure.com/openai/chunks?api-version=2023-03-31-preview

After the change

https://XXX.openai.azure.com/openai/preprocessing-jobs?api-version=2023-03-31-preview

How can I create the index in Japanese as it was before the change?
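
For readers trying to recognize the same symptom, here is a minimal sketch of what "converted to Unicode" may look like. It assumes the index now holds \uXXXX escape sequences instead of the literal Japanese characters, which the post does not confirm. It uses only Python's standard json module and a made-up sample string; it does not call the Azure endpoints above and is not the fix reported in this thread.

```python
import json

# Hypothetical sample text; not taken from the original files.
original = "日本語のテキスト"

# By default json.dumps escapes every non-ASCII character as \uXXXX.
escaped = json.dumps(original)

# With ensure_ascii=False the characters are written as-is.
readable = json.dumps(original, ensure_ascii=False)

# Parsing the escaped form as JSON restores the original text.
restored = json.loads(escaped)

print(escaped)
print(readable)
print(restored == original)
```

If the indexed content looks like the first printed line, the text is probably intact and only escaped, so a JSON-aware reader would display it as Japanese again.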


Accepted answer
  1. AshokPeddakotla-MSFT 35,971 Reputation points Moderator
    2023-12-15T03:11:41+00:00

    大宮 僚馬, I'm glad that your issue is resolved, and thank you for posting your solution so that others experiencing the same thing can easily reference it.

    Because Microsoft Q&A policy does not allow question authors to accept their own answers, only answers from others, I'll repost your solution in case you'd like to accept it.

    Issue:

    Previously, when I indexed Japanese files, the indexed text stayed in Japanese. When I tried again recently, the characters were converted to Unicode.

    On investigation, I found that the API used in the Azure Cognitive Search skillset has changed, and I believe this may be the cause.

    Before the change

    https://XXX.openai.azure.com/openai/chunks?api-version=2023-03-31-preview

    After the change

    https://XXX.openai.azure.com/openai/preprocessing-jobs?api-version=2023-03-31-preview

    How can I create the index in Japanese as it was before the change?

    Solution:

    I ran it last night and it fixed the problem.

    If you have any other questions, please let me know. Thank you again for your time and patience throughout this issue.

2 additional answers

  1. 大宮 僚馬 70 Reputation points
    2023-12-15T00:05:25.8666667+00:00

    I ran it last night and it fixed the problem.

  2. Mads Olsgaard 0 Reputation points
    2024-01-22T06:00:09.3733333+00:00

    I have also noticed this API being called by Azure Cognitive Search skillsets created via OpenAI Studio or Azure AI Studio. As far as I can tell, this API is completely undocumented: https://XXX.openai.azure.com/openai/preprocessing-jobs?api-version=2023-03-31-preview. Where is its documentation published? One would expect to find it at https://learn.microsoft.com/en-us/azure/ai-services/openai/reference, but it is not there.
