Data Source to GPT Deployment

Ionut Dutescu 20 Reputation points


I was wondering how the model accesses the data when I add my own data. In the playground chat, it shows me the referenced files from blob storage. However, I only have one file in the storage, and the references split that file into multiple parts. Why is this? It seems like the model cannot read the whole file and is missing details from part of it. How can I avoid this split?


Azure Blob Storage
Azure AI services

Accepted answer
  1. Amira Bedhiafi 14,481 Reputation points

    Azure Blob Storage is a massively scalable object store for large amounts of unstructured data. When you upload a file, it is stored as a "blob," which other services, including Azure AI services, can then read for purposes such as grounding, training, or inference.

    The splitting you see is expected behavior: when you add your own data, the ingestion process breaks each document into smaller chunks so that individual pieces fit within the model's context window, and the playground cites the chunks that were retrieved rather than the whole file. Your issue is likely related to:

    • Data chunking during ingestion
    • Size limits of the API you are using

    If you need to manage how files are split, I recommend the following:

    • Manually split large files into smaller, logically coherent segments before uploading them to Blob Storage, so that chunk boundaries fall in sensible places and details are not cut off mid-section.
    • Where possible, keep your file sizes within the processing limits of the Azure AI service you are using.
    • Some Azure services and third-party tools offer preprocessing capabilities that can help you prepare your data in a form optimized for use with AI models.
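    The manual-splitting suggestion above can be sketched as a small preprocessing step run before upload. This is an illustrative sketch only: the chunk size, overlap, and function name are assumptions, not documented Azure limits, and overlapping chunks are one common way to avoid losing details that span a boundary.

    ```python
    def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
        """Split `text` into chunks of at most `chunk_size` characters,
        repeating `overlap` characters between consecutive chunks so that
        content falling on a boundary appears in both neighbors."""
        if overlap >= chunk_size:
            raise ValueError("overlap must be smaller than chunk_size")
        chunks = []
        start = 0
        while start < len(text):
            chunks.append(text[start:start + chunk_size])
            if start + chunk_size >= len(text):
                break
            # advance by chunk_size minus overlap so chunks share a margin
            start += chunk_size - overlap
        return chunks


    if __name__ == "__main__":
        document = open("large_document.txt", encoding="utf-8").read()
        for i, part in enumerate(chunk_text(document)):
            with open(f"large_document.part{i:03d}.txt", "w", encoding="utf-8") as f:
                f.write(part)
    ```

    Each part could then be uploaded as its own blob (for example with the `azure-storage-blob` SDK's `upload_blob`), so the ingestion step works with segments you chose rather than arbitrary cut points.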
