Hello IS! Great to hear you’re fine-tuning your model with Azure OpenAI Service.
You can split your data into multiple JSONL files.
Each file must follow the required format; you can upload several files to Azure OpenAI as long as each one is properly formatted.
Format Reminder:
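Each file must be in .jsonl (JSON Lines) format, meaning one JSON object per line. For completion-style base models, each line looks like this:
{"prompt": "input text", "completion": "desired response"}
Chat models (e.g., gpt-35-turbo) expect a {"messages": [...]} object per line instead, so check which format your base model requires.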
Multiple Files – Best Practice:
You can and often should organize your training data by theme into different .jsonl files (e.g., customer_service.jsonl, technical_docs.jsonl, sales_pitch.jsonl). This helps you:
- Maintain your data more easily
- Debug or update specific topics later
- Ensure better control over how different data segments influence the model
Then, when creating the fine-tuned model, you can upload and use them together.
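As a rough sketch of using themed files together: the snippet below merges them into one .jsonl (dropping exact duplicates) and then uploads the result and starts a job. The helper names `merge_jsonl` and `upload_and_start_job` are made up for illustration. `client` is assumed to be an `openai.AzureOpenAI` instance from the openai Python package (v1 or later), and `base_model` is whatever base model name your resource supports.

```python
import json
from pathlib import Path


def merge_jsonl(sources, destination):
    """Concatenate several .jsonl files into one, dropping exact duplicate records."""
    seen = set()
    written = 0
    with Path(destination).open("w", encoding="utf-8") as out:
        for source in sources:
            with Path(source).open(encoding="utf-8-sig") as fh:
                for line in fh:
                    if not line.strip():
                        continue  # skip blank lines
                    record = json.loads(line)
                    # Canonical form so key order and spacing do not hide duplicates.
                    key = json.dumps(record, sort_keys=True)
                    if key in seen:
                        continue
                    seen.add(key)
                    out.write(json.dumps(record, ensure_ascii=False) + "\n")
                    written += 1
    return written


def upload_and_start_job(client, merged_path, base_model):
    """Upload the merged file and start a fine-tuning job that trains on it.

    Assumption: `client` is an openai.AzureOpenAI instance (openai>=1.0).
    """
    with open(merged_path, "rb") as fh:
        uploaded = client.files.create(file=fh, purpose="fine-tune")
    job = client.fine_tuning.jobs.create(training_file=uploaded.id, model=base_model)
    return job.id
```

The themed source files stay untouched, so you can still edit one topic and re-run the merge. In practice the uploaded file may need to finish processing before the job can start, so a real script would likely wait on the file status between the two calls.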
Notes:
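- You can combine the files before upload or as part of your upload script, depending on the method you're using (e.g., CLI, API). A fine-tuning job itself takes one training file (plus an optional validation file), so the themed files need to be merged into a single .jsonl at some point.
- Azure enforces size and token limits (e.g., a per-file maximum, quoted here as 100 MB, and a total dataset that stays within token limits). These limits change over time, so check the current quotas for your model.
- Be mindful of balance and duplication across datasets to avoid model bias.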
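To make the size and balance notes concrete, here is a small sketch that reports each file's size, record count, share of the total, and how many records appear in more than one file. The name `dataset_report` is hypothetical, and `MAX_FILE_BYTES` simply encodes the 100 MB figure quoted above, which is an assumption you should replace with the current limit for your model.

```python
import json
from pathlib import Path

# Assumption: the 100 MB per-file figure mentioned in the notes above.
MAX_FILE_BYTES = 100 * 1024 * 1024


def dataset_report(paths, max_bytes=MAX_FILE_BYTES):
    """Summarise size, record count, and cross-file duplicates for themed .jsonl files."""
    per_file = {}
    owners = {}  # canonical record -> set of file names containing it
    for path in map(Path, paths):
        count = 0
        with path.open(encoding="utf-8-sig") as fh:
            for line in fh:
                if not line.strip():
                    continue
                key = json.dumps(json.loads(line), sort_keys=True)
                owners.setdefault(key, set()).add(path.name)
                count += 1
        size = path.stat().st_size
        per_file[path.name] = {
            "records": count,
            "bytes": size,
            "too_large": size > max_bytes,
        }
    total = sum(info["records"] for info in per_file.values())
    for info in per_file.values():
        # Share of all records, a quick signal of imbalance between themes.
        info["share"] = info["records"] / total if total else 0.0
    shared = sum(1 for files in owners.values() if len(files) > 1)
    return {"files": per_file, "records_in_multiple_files": shared}
```

A theme whose share dwarfs the others, or a high `records_in_multiple_files` count, is a hint to rebalance or deduplicate before training. This does not count tokens; that would need a tokenizer matching your base model.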
Would you like help with how to format or combine your files, or a command example for uploading in Azure?