Hi @Michael Dong,
Thank you for reaching out to the Microsoft Q&A forum!
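You can A/B test your two fine-tuned Qwen2.5 models in Azure AI Services by deploying each one as a separate endpoint. Then split incoming traffic between the two endpoints (for example, 50/50), either with an Azure routing layer that supports weighted distribution, such as Azure API Management or Azure Front Door, or by routing requests in your own application code. Azure Load Balancer works at the network layer in front of VMs, so it is usually not the right fit for splitting requests between model endpoints.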
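If you decide to split traffic in your own application code, a minimal sketch might look like the following. The deployment names `qwen25-ft-a` and `qwen25-ft-b` and the 50/50 weights are placeholders, not values from your setup. Hashing the user ID keeps each user on the same model for the whole test, which makes the feedback easier to compare.

```python
import hashlib

# Hypothetical deployment names and weights; replace with your own.
VARIANTS = [("qwen25-ft-a", 50), ("qwen25-ft-b", 50)]


def assign_variant(user_id: str, variants=VARIANTS) -> str:
    """Deterministically map a user ID to a deployment name, by weight."""
    total = sum(weight for _, weight in variants)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    # Hash the user ID so the same user always lands in the same bucket.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total

    # Walk the cumulative weights until the bucket falls inside a range.
    cumulative = 0
    for name, weight in variants:
        cumulative += weight
        if bucket < cumulative:
            return name
    return variants[-1][0]  # not reached when total > 0


if __name__ == "__main__":
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, "->", assign_variant(uid))
```

Your application would then send the request to whichever endpoint corresponds to the returned name and log that name alongside the request.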
Track performance metrics separately for each model, such as response time and user feedback. Once you have collected enough data for the difference to be meaningful, compare the results to see which model performs better and use that version going forward.
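To judge whether a difference in user feedback is more than noise, one common option is a two-proportion z-test. The sketch below is only an illustration: the function names are made up for this example, and the counts of positive feedback are placeholder numbers, not real results.

```python
import math


def two_proportion_z(success_a: int, n_a: int, success_b: int, n_b: int) -> float:
    """z statistic for the difference between two positive-feedback rates."""
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0  # both rates are identical (all 0% or all 100%)
    return (success_a / n_a - success_b / n_b) / std_err


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a z statistic under the normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Placeholder counts: positive ratings out of total ratings per model.
    z = two_proportion_z(success_a=60, n_a=100, success_b=40, n_b=100)
    print(f"z = {z:.3f}, p = {two_sided_p_value(z):.4f}")
```

A small p-value (0.05 is a common threshold) suggests the gap in feedback is unlikely to be chance alone. For response time, comparing the median or 95th percentile per model is usually more informative than the mean.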
I hope this helps. Thank you.
If this answers your query, please click "Accept Answer" and "Yes" for "Was this answer helpful".