"Add your data" file formats in Chat Playground

Christine Kim 40 Reputation points
2023-07-19T21:50:59.6966667+00:00

In the "Add your data" feature within Azure OpenAI Chat playground, are these formats listed in the documentation -- .txt, .md, .html, Word files, PowerPoint files, and PDFs -- the only ones available for use as a data source?

If we want to use either an image file (.jpg, .png) or a spreadsheet file (.xlsx, .csv) as our data source instead, would we first have to convert these files to one of the accepted formats listed above?

Thank you!

Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

Accepted answer
  1. AshokPeddakotla-MSFT 35,971 Reputation points Moderator
    2023-07-20T10:24:24.11+00:00

    Christine Kim, greetings and welcome to the Microsoft Q&A forum!

    In the "Add your data" feature within Azure OpenAI Chat playground, are these formats listed in the documentation -- .txt, .md, .html, Word files, PowerPoint files, and PDFs -- the only ones available for use as a data source?

    I believe you have already checked the documentation section "Data formats and file types".

    Your understanding is correct. Azure OpenAI on your data supports the following file types:

    • .txt
    • .md
    • .html
    • Microsoft Word files
    • Microsoft PowerPoint files
    • PDF
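
A quick way to see whether a file needs converting before upload is to compare its extension with the list above. This is a minimal sketch only: the function name is hypothetical, and `.docx` and `.pptx` are assumed stand-ins for "Microsoft Word files" and "Microsoft PowerPoint files", since the answer does not name exact extensions.

```python
from pathlib import Path

# Assumption: .docx and .pptx stand in for "Microsoft Word files" and
# "Microsoft PowerPoint files"; the answer does not list exact extensions.
SUPPORTED_EXTENSIONS = {".txt", ".md", ".html", ".docx", ".pptx", ".pdf"}


def needs_conversion(filename: str) -> bool:
    """Return True if the file's extension is not in the supported set."""
    return Path(filename).suffix.lower() not in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ["notes.md", "report.xlsx", "photo.png", "slides.pptx"]:
        status = "convert first" if needs_conversion(name) else "supported"
        print(f"{name}: {status}")
```

The check looks only at the file name, so it says nothing about whether the file's contents will extract cleanly.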

    If we want to use either an image file (.jpg, .png) or a spreadsheet file (.xlsx, .csv) as our data source instead, would we first have to convert these files to one of the accepted formats listed above?

    Yes. To use an image file (.jpg, .png) or a spreadsheet file (.xlsx, .csv) as your data source, you need to convert it to one of the supported formats listed above.

    There are some caveats about document structure and how it might affect the quality of responses from the model:

    • The model provides the best citation titles from markdown (.md) files.

    • If a document is a PDF file, the text contents are extracted as a preprocessing step (unless you're connecting your own Azure Cognitive Search index). If your document contains images, graphs, or other visual content, the model's response quality depends on the quality of the text that can be extracted from them.

    • If you're converting data from an unsupported format into a supported format, make sure the conversion:
      • Doesn't lead to significant data loss.
      • Doesn't add unexpected noise to your data.

      Data loss or added noise will affect the quality of the Azure Cognitive Search results and of the model's response.
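
As an illustration of a conversion that tries to avoid both data loss and noise, here is a minimal sketch that turns CSV text into a markdown (.md) table using only the Python standard library. The function name and the handling choices (padding short rows, rejecting rows longer than the header, skipping blank lines) are assumptions made for this example and are not part of the Azure OpenAI documentation.

```python
import csv
import io


def csv_to_markdown(csv_text: str) -> str:
    """Convert CSV text into a markdown table; the first row is the header."""
    # Skip completely blank lines so they do not become empty table rows.
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        return ""
    header, *body = rows
    width = len(header)

    def format_row(row):
        if len(row) > width:
            # Fail loudly rather than silently dropping cells (data loss).
            raise ValueError(f"row has more cells than the header: {row}")
        cells = row + [""] * (width - len(row))  # pad short rows
        # Escape pipes and flatten newlines so cells cannot break the table.
        cells = [c.strip().replace("|", "\\|").replace("\n", " ") for c in cells]
        return "| " + " | ".join(cells) + " |"

    lines = [format_row(header), "| " + " | ".join(["---"] * width) + " |"]
    lines.extend(format_row(row) for row in body)
    return "\n".join(lines)


if __name__ == "__main__":
    print(csv_to_markdown("name,qty\napple,3\npear,5\n"))
```

For .xlsx files, one option is to export each sheet to CSV first and then apply the same conversion. Whether a markdown table is the best target for your data is something to verify by testing the responses you get.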

    Let us know whether this helps or if you have any other questions.

    If this answers your question, please click Accept Answer and select Yes for "Was this answer helpful?".

