Azure - LLM Model

Srushti Prashant zope 0 Reputation points
2023-06-05T14:53:04.35+00:00

I attempted to train a language model using identical training and validation datasets. Surprisingly, while the training token level accuracy reached around 63%, the validation token level accuracy dropped to negative values. This raises the question of why this occurred despite the datasets being the same. Furthermore, what can be done to fine-tune the model and enhance its relevancy?

Azure Machine Learning
An Azure machine learning service for building and deploying models.
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
Azure AI Language
An Azure service that provides natural language capabilities including sentiment analysis, entity extraction, and automated question answering.

1 answer

  1. YutongTie-MSFT 53,341 Reputation points
    2023-06-06T02:51:26.71+00:00

    Hello @Srushti Prashant zope

    Thanks for reaching out to us. Please let us know how and where you trained your model so that we can provide more specific guidance, since you have tagged three different products.

    Assuming you are training your model with the Azure Language Service, encountering a negative validation token-level accuracy is highly unusual in my experience. Here are some potential causes you may consider (some of them also apply to the other products):

    1. Dataset Mismatch: Double-check the data used for training and validation to ensure they are identical and have the same formatting and preprocessing.
    2. Data Leakage: Confirm that there is no unintended data leakage between the training and validation sets; in a normal setup, the two sets should be completely independent (a quick local check is sketched after this list).
    3. Model Architecture and Hyperparameters: Azure Language Service provides pre-trained models with default settings, but you can experiment with different architectures or hyperparameter configurations to optimize performance.
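
    Here is that quick local check: a minimal sketch assuming both sets are JSON Lines files with a `text` field per example (the file names and the field name are placeholders, not the service's required format):

    ```python
    import json

    def load_texts(path):
        """Load the raw text of every example from a JSON Lines file."""
        with open(path, encoding="utf-8") as f:
            return [json.loads(line)["text"] for line in f if line.strip()]

    train = load_texts("train.jsonl")            # placeholder path
    validation = load_texts("validation.jsonl")  # placeholder path

    train_set, validation_set = set(train), set(validation)
    overlap = train_set & validation_set

    print(f"Training examples:   {len(train)}")
    print(f"Validation examples: {len(validation)}")
    print(f"Examples shared by both sets: {len(overlap)}")

    # If the two files are meant to be identical, this should print True;
    # if they are meant to be independent splits, the overlap should be empty.
    print("Sets contain exactly the same examples:", train_set == validation_set)
    ```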

    Training Process: Review the training process and parameters to ensure they are correctly set. Verify that the model is being trained for an adequate number of epochs without underfitting or overfitting. Check if the learning rate, batch size, and other training settings are appropriate for your specific task.
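
    If you are training the model yourself rather than through the managed service, a per-epoch comparison of training and validation loss is the quickest way to sanity-check those settings. Below is a generic PyTorch sketch (not Azure Language Service code); it assumes the model returns a Hugging Face-style output with a `.loss` attribute and that the data loaders yield keyword-argument batches:

    ```python
    import torch

    def train_and_validate(model, train_loader, val_loader,
                           epochs=5, lr=2e-5, weight_decay=0.01):
        """Fine-tune `model` and report training/validation loss per epoch.

        The hyperparameter defaults are illustrative only; adjust them for your task.
        """
        optimizer = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=weight_decay)

        for epoch in range(epochs):
            model.train()
            train_loss = 0.0
            for batch in train_loader:
                optimizer.zero_grad()
                loss = model(**batch).loss
                loss.backward()
                optimizer.step()
                train_loss += loss.item()

            model.eval()
            val_loss = 0.0
            with torch.no_grad():
                for batch in val_loader:
                    val_loss += model(**batch).loss.item()

            # Validation loss rising while training loss keeps falling is the classic
            # sign of overfitting; both staying high suggests underfitting.
            print(f"epoch {epoch + 1}: "
                  f"train loss {train_loss / len(train_loader):.4f} | "
                  f"val loss {val_loss / len(val_loader):.4f}")
    ```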

    1. Augment Training Data: If possible, increase the size of your training dataset. A larger and more diverse dataset can help the model generalize better and improve its performance on unseen data.
    2. Regularization Techniques: Apply regularization techniques, such as weight decay or dropout, to prevent overfitting and improve generalization.
    3. Error Analysis: Conduct a thorough analysis of the model's errors by examining the incorrectly predicted examples in the validation set (a starting point is sketched after this list).
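
    Here is that starting point for the error analysis: list every misclassified validation example and count errors per label. The data below is a made-up placeholder; replace it with your validation texts, gold labels, and the model's predictions:

    ```python
    from collections import Counter

    # Placeholder data for illustration only.
    texts       = ["book a flight to Paris", "cancel my order", "what is the weather today"]
    gold_labels = ["BookFlight", "CancelOrder", "GetWeather"]
    predictions = ["BookFlight", "GetWeather", "GetWeather"]

    errors = [(t, g, p) for t, g, p in zip(texts, gold_labels, predictions) if g != p]

    print(f"{len(errors)} of {len(texts)} validation examples were misclassified:")
    for text, gold, pred in errors:
        print(f"  text: {text!r}  expected: {gold}  predicted: {pred}")

    # Labels that are confused most often typically point to ambiguous label
    # definitions or under-represented classes in the training data.
    print("Errors per gold label:", Counter(g for _, g, _ in errors))
    ```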

    Fine-tuning with Transfer Learning: Instead of training the model from scratch, consider leveraging transfer learning. Start with a pre-trained model that is already proficient in a related task and fine-tune it on your specific dataset. This approach can save time and potentially yield better results.
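
    As a concrete illustration of that idea outside the managed service, the sketch below loads a pre-trained Hugging Face checkpoint, adds a new classification head, and optionally freezes the encoder so only the head is trained first. The checkpoint name and number of labels are assumptions for the example, not a recommendation for your specific task:

    ```python
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    MODEL_NAME = "bert-base-uncased"  # example checkpoint; pick one suited to your language/domain
    NUM_LABELS = 3                    # assumed label count for illustration

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)  # for preparing your dataset
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=NUM_LABELS)

    # Optional first stage: freeze the pre-trained encoder and train only the new
    # classification head, then unfreeze everything and fine-tune end-to-end with
    # a small learning rate (for example, using the train_and_validate loop sketched above).
    for param in model.base_model.parameters():
        param.requires_grad = False
    ```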

    If you are working with other products, please let us know and we will be happy to help further. Thanks.

    Regards,

    Yutong

    Please kindly accept the answer if you find it helpful, to support the community. Thanks a lot.

