Christine Kim, greetings and welcome to the Microsoft Q&A forum!
In the "Add your data" feature within Azure OpenAI Chat playground, are these formats listed in the documentation -- .txt, .md, .html, Word files, PowerPoint files, and PDFs -- the only ones available for use as a data source?
I believe you have already checked this documentation: Data formats and file types.
Your understanding is correct. Azure OpenAI on your data supports the following file types:
- .txt
- .md
- .html
- Microsoft Word files
- Microsoft PowerPoint files
- PDFs
If we want to input either an image file (.jpg, .png) or an excel file (.xlsx, .csv) as our data source instead, would I have to first convert these files to one of the acceptable formats listed above?
Yes. If you want to use an image file (.jpg, .png) or an Excel file (.xlsx, .csv) as your data source, you need to convert it to one of the supported formats listed above first.
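As a rough illustration of such a conversion, here is a minimal Python sketch that turns CSV text into a Markdown table, which could then be saved as a .md file and uploaded. The function name `csv_to_markdown` is hypothetical and this is not part of any Azure tooling. It assumes the first row is the header and that cells contain no line breaks.

```python
import csv
import io


def csv_to_markdown(csv_text: str) -> str:
    """Convert CSV text into a Markdown table (first row is treated as the header)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    # Use the widest row so that short rows can be padded with empty cells.
    width = max(len(row) for row in rows)

    def fmt(row):
        # Escape pipes so a cell value cannot break the table layout.
        cells = [c.strip().replace("|", "\\|") for c in row]
        cells += [""] * (width - len(cells))
        return "| " + " | ".join(cells) + " |"

    header, *body = rows
    lines = [fmt(header), "| " + " | ".join(["---"] * width) + " |"]
    lines.extend(fmt(row) for row in body)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(csv_to_markdown("name,qty\napple,3\n"))
```

For .xlsx workbooks you would first export each sheet to CSV (for example from Excel itself) and then apply the same step. Images would need their text extracted by some other means before they are useful as a data source.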
There are some caveats about document structure and how it might affect the quality of responses from the model:
- The model provides the best citation titles from markdown (.md) files.
- If a document is a PDF file, the text contents are extracted as a preprocessing step (unless you're connecting your own Azure Cognitive Search index). If your document contains images, graphs, or other visual content, the model's response quality depends on the quality of the text that can be extracted from them.
- If you're converting data from an unsupported format into a supported format, make sure the conversion:
  - Doesn't lead to significant data loss.
  - Doesn't add unexpected noise to your data.
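One way to sanity-check a conversion against the two points above is sketched below in Python. Both helpers (`missing_cells` and `noise_ratio`) are hypothetical names and only rough heuristics, not an Azure feature. The first lists CSV cell values that no longer appear in the converted text (possible data loss), and the second measures the share of replacement or control characters in the output (possible noise).

```python
import csv
import io
import unicodedata


def missing_cells(csv_text: str, converted_text: str) -> list[str]:
    """Return non-empty CSV cell values that do not appear in the converted text."""
    missing = []
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            value = cell.strip()
            if value and value not in converted_text:
                missing.append(value)
    return missing


def noise_ratio(text: str) -> float:
    """Return the fraction of characters that look like extraction debris."""
    if not text:
        return 0.0
    noisy = sum(
        1
        for ch in text
        # U+FFFD marks undecodable bytes; control characters other than
        # newline, carriage return, and tab rarely belong in document text.
        if ch == "\ufffd" or (unicodedata.category(ch) == "Cc" and ch not in "\n\r\t")
    )
    return noisy / len(text)


if __name__ == "__main__":
    source = "name,qty\napple,3\n"
    converted = "| name | qty |\n| --- | --- |\n| apple | 3 |\n"
    print(missing_cells(source, converted))
    print(noise_ratio(converted))
```

An empty list from `missing_cells` and a ratio near zero from `noise_ratio` suggest the conversion kept the content intact. What counts as an acceptable threshold is a judgment call for your own data.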
Do let us know if that helps or if you have any other queries.
If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".