Hello @Srushti Prashant zope
Thanks for reaching out. Since you included tags for three different products, please let us know how and where you trained your model so that we can provide more targeted information.
Assuming you are working with Azure Language Service: in my experience, encountering a negative validation token-level accuracy while training a language model is highly unusual. Here are some potential causes to consider (some of them also apply to other products):
- Dataset Mismatch: Double-check the data used for training and validation to ensure both go through the same formatting and preprocessing steps.
- Data Leakage: Confirm that no examples are shared between the training and validation sets; the two datasets should be completely independent.
- Model Architecture and Hyperparameters: Azure Language Service provides pre-trained models with default settings. However, you can experiment with different architectures or hyperparameter configurations to optimize performance.
- Training Process: Review the training process and parameters to ensure they are correctly set. Verify that the model is trained for an adequate number of epochs without underfitting or overfitting, and check that the learning rate, batch size, and other training settings are appropriate for your specific task.
- Augment Training Data: If possible, increase the size of your training dataset. A larger and more diverse dataset can help the model generalize better and improve its performance on unseen data.
- Regularization Techniques: Apply regularization techniques to prevent overfitting and improve generalization.
- Error Analysis: Conduct a thorough analysis of the model's errors by examining the incorrectly predicted examples in the validation set.
- Fine-tuning with Transfer Learning: Instead of training the model from scratch, consider leveraging transfer learning. Start with a pre-trained model that is already proficient in a related task and fine-tune it on your specific dataset. This approach can save time and potentially yield better results.
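As a quick sanity check for the first two points above (dataset mismatch and data leakage), you can look for examples that appear in both splits. A minimal sketch in Python (the example data and the `(text, label)` tuple format are assumptions; adapt the loading code to however your dataset is actually stored):

```python
# Minimal sketch: detect overlap (potential data leakage) between the
# training and validation sets. Assumes each split is a list of
# (text, label) tuples; hypothetical sample data for illustration.

def find_leakage(train_examples, validation_examples):
    """Return validation examples that also appear in the training set."""
    train_set = set(train_examples)
    return [ex for ex in validation_examples if ex in train_set]

train = [
    ("book a flight to Paris", "BookFlight"),
    ("cancel my reservation", "CancelBooking"),
]
validation = [
    ("book a flight to Paris", "BookFlight"),  # duplicate -> leakage
    ("what's the weather today", "GetWeather"),
]

leaked = find_leakage(train, validation)
print(f"{len(leaked)} overlapping example(s): {leaked}")
```

If this reports any overlap, remove or re-split those examples before retraining, since leaked examples inflate validation metrics and make the reported accuracy unreliable.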
If you are working with other products, please let us know and we will be happy to help further. Thanks.
Regards,
Yutong
Please kindly accept the answer if you found it helpful, to support the community. Thanks a lot.