How long does it typically take to train a dataset for a single PDF when fine-tuning a model with Azure AI?

Question

Paritosh Raval 5

I am planning to fine-tune GPT-3.5 Turbo with Azure AI. I have a 10-page PDF document containing approximately 17,000 characters and 3,200 tokens. How long will it actually take to train the model on this dataset? Given a training cost of approximately $68 per hour, I would also like to calculate the total training cost for this token count.

1 answer

Answer 1

Thanks for the question. Here is the pricing page for GPT-3.5 Turbo: https://azure.microsoft.com/en-us/pricing/details/cognitive-services/openai-service/

Training Data Preparation:

- You have a 10-page PDF document containing approximately 17,000 characters.
- The document contains about 3,200 tokens. Tokens are chunks of text (words, subwords, or characters) that the model processes during training.

Fine-Tuning Duration:

- The time it takes to fine-tune a model depends on factors such as the dataset size and the specific use case.
- For a dataset of your size, fine-tuning may take several hours or even days.
- In one example, a dataset of 5,500 tokens took over 6 hours to fine-tune.

To estimate the total cost, we can scale that example by token count, assuming training time grows roughly in proportion to the number of tokens:

- Tokens in your dataset: 3,200
- Tokens in the example dataset: 5,500 (about 6 hours of training)
- Estimated fine-tuning time: \(\frac{3{,}200}{5{,}500} \times 6\) hours ≈ 3.5 hours
- Estimated total cost: 3.5 hours \(\times\) $68/hour = $238
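
The estimate above can be reproduced with a few lines of Python. This is only a rough sketch: it assumes training time scales linearly with token count, and it takes the 5,500-token / 6-hour example and the $68/hour rate from this thread as given. The function name `estimate_fine_tuning` and its parameters are hypothetical and not part of any Azure SDK.

```python
def estimate_fine_tuning(tokens, ref_tokens=5_500, ref_hours=6.0, rate_per_hour=68.0):
    """Linearly extrapolate training time and cost from one reference run.

    Assumption: time grows in proportion to token count. Real jobs also
    depend on epochs, queueing, and fixed startup overhead.
    """
    if tokens < 0 or ref_tokens <= 0:
        raise ValueError("token counts must be positive")
    hours = tokens / ref_tokens * ref_hours  # scale the reference run
    cost = hours * rate_per_hour             # hourly training rate
    return hours, cost


hours, cost = estimate_fine_tuning(3_200)
print(f"{hours:.2f} hours, ${cost:.2f}")  # prints "3.49 hours, $237.38"
```

Rounding the time up to 3.5 hours before multiplying gives the $238 figure quoted above. Swapping in a different reference run or hourly rate only requires changing the keyword arguments.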
