Data Source to GPT Deployment

Ionut Dutescu 20 Reputation points


I was wondering how the model accesses the data when I add my own data. In the playground chat, it shows me the referenced files from blob storage. However, I only have one file in the storage, and the references split that file into multiple parts. Why is this? It seems like the model cannot read the whole file and is missing details from part of it. How can I avoid this split?


Azure Blob Storage
Azure AI services

Accepted answer
  1. Amira Bedhiafi 14,481 Reputation points

    Azure Blob Storage is a massively scalable object store for large amounts of unstructured data. When you upload a file, it is stored as a "blob," which other services, including Azure AI services, can then read for purposes such as grounding, training, or inference.

    The splitting you see is expected behavior: when you add your own data, the ingestion process breaks each document into smaller chunks so that individual pieces fit within the model's context window, and the playground cites the chunks that were retrieved rather than the whole file. Your issue is likely related to:

    • Data chunking during ingestion
    • Size limits of the API you are using

    If you need to manage how files are split, I recommend the following:

    • Manually split large files into smaller, logically coherent segments before uploading them to Blob Storage, so that chunk boundaries fall in sensible places and details are not cut off mid-section.
    • Where possible, keep your file sizes within the processing limits of the Azure AI service you are using.
    • Some Azure services and third-party tools offer preprocessing capabilities that can help you prepare your data in a form optimized for use with AI models.
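    The manual-splitting suggestion above can be sketched as a small preprocessing step run before upload. This is an illustrative sketch only: the chunk size, overlap, and function name are assumptions, not documented Azure limits, and overlapping chunks are one common way to avoid losing details that span a boundary.

    ```python
    def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
        """Split `text` into chunks of at most `chunk_size` characters,
        repeating `overlap` characters between consecutive chunks so that
        content falling on a boundary appears in both neighbors."""
        if overlap >= chunk_size:
            raise ValueError("overlap must be smaller than chunk_size")
        chunks = []
        start = 0
        while start < len(text):
            chunks.append(text[start:start + chunk_size])
            if start + chunk_size >= len(text):
                break
            # advance by chunk_size minus overlap so chunks share a margin
            start += chunk_size - overlap
        return chunks


    if __name__ == "__main__":
        document = open("large_document.txt", encoding="utf-8").read()
        for i, part in enumerate(chunk_text(document)):
            with open(f"large_document.part{i:03d}.txt", "w", encoding="utf-8") as f:
                f.write(part)
    ```

    Each part could then be uploaded as its own blob (for example with the `azure-storage-blob` SDK's `upload_blob`), so the ingestion step works with segments you chose rather than arbitrary cut points.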
