How to host private images for fine-tuning GPT4o?

Tracy Rohlin 0 Reputation points
2025-03-14T18:00:14.8766667+00:00

I'm trying to fine-tune a gpt-4o model using some internal image data but running into trouble with specifying the path in the train/validation jsonl file. I initially encoded them as base64 images but even a single image causes the training file to go over the file limit size. Then i uploaded them to our blob storage container and changed the img url to those paths but when i try to fine-tune, I get an error saying that the urls must be publicly available.

Status : Training file: Preprocessing Summary: The provided data failed validation. Number of skipped multimodal examples exceed the maximum allowed 200 limit: inaccessible URL (3124). Please visit our docs to learn how to resolve these issues, and try again. Details - Samples of lines per error type: inaccessible URL: Line numbers --> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100

Azure OpenAI Service
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
3,913 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. Prashanth Veeragoni 3,215 Reputation points Microsoft External Staff
    2025-03-17T07:34:02.34+00:00

    Hi Tracy Rohlin,

    The issue here is that Azure OpenAI requires image URLs to be publicly accessible for fine-tuning GPT-4o. Since your images are stored in Azure Blob Storage, they are private by default, which is causing the "inaccessible URL" error.

    Here, Azure OpenAI does not support private image hosting for fine-tuning GPT-4o. The model requires that all image URLs be publicly accessible over the internet.

    It does not have authentication mechanisms for private storage.

    It validates URLs before training and skips images that are not accessible.

    You cannot directly fine-tune GPT-4o with private images, but you can temporarily expose them using SAS tokens or an API gateway. The best solution depends on your security requirements and ease of implementation.

    However, there are ways to temporarily allow access to private images securely:

    Possible Workarounds for Private Image Hosting:

    If you want to fine-tune with private images but not expose them permanently, you have two options:

    Use Temporary SAS (Shared Access Signature) Tokens (Recommended):

    Generate temporary SAS URLs (expiring after fine-tuning).

    Use these URLs in your .jsonl fine-tuning file.

    Once training is done, revoke the SAS token.

    Keeps images private when not in use. Ensures Azure OpenAI can access them during fine-tuning.

    Steps to Generate SAS URLs for Blob Storage:

    Go to Azure Portal → Navigate to your Storage Account → Open Blob Storage.

    Select the container where your images are stored.

    Click on Shared Access Signature (SAS) under Settings.

    Generate a SAS Token with "Read" permissions:

    Allowed services: Blob

    Allowed resource types: Object

    Allowed permissions: Read

    Set an expiry date (e.g., 7 days or more)

    Copy the SAS Token URL and append it to your image URLs in the .jsonl file.

    Example of updated image URL:

    {
      "image_url": "https://yourstorageaccount.blob.core.windows.net/yourcontainer/image1.jpg?<SAS_TOKEN>",
      "prompt": "Describe this image."
    }
    

    Set Up a Secure API Gateway:

    Host images on Azure Functions or an API.

    The API dynamically authenticates and serves images.

    Your .jsonl file will contain API endpoints instead of direct image URLs.

    Maintains full security control over images. Requires extra development effort.

    Hope this helps. Do let us know if you any further queries.  

    ------------- 

    If this answers your query, do click Accept Answer and Yes for was this answer helpful.

    Thank you. 


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.