@Jordan C Apologies for the late reply.
I reached out to the product owners, reported this issue, and have now heard back from them. Below is their analysis:
This behavior is by design.
- Different regions may run on different CPUs (AMD or Intel, and possibly different CPU series as well). If the CPU is the same in two regions, the scores should be consistent.
- All locales other than en-US scale to zero. This means that if there are no requests in a region for several hours (possibly 6 hours), the deployment is released and a fallback model is used instead. Once a request for that region is detected, deployment of the default model for that region starts, which takes about 10-15 minutes.
In short, the behavior you are seeing is by design, to save cost.
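To make the scale-to-zero point easier to picture, here is a toy Python sketch of the behavior described above. It is not the service's implementation: the class, field names, and exact thresholds are hypothetical, and the 6-hour and 15-minute values are just the approximate figures from the analysis.

```python
from dataclasses import dataclass
from typing import Optional

IDLE_TIMEOUT_H = 6.0   # assumption: "several hours (maybe 6 hours)"
WARMUP_H = 15 / 60     # assumption: upper end of "about 10-15 minutes"


@dataclass
class RegionDeployment:
    """Toy model of one scale-to-zero locale. All names are hypothetical."""
    last_request_h: Optional[float] = None    # time of the previous request, in hours
    warmup_started_h: Optional[float] = None  # when redeploying the default model began
    default_ready: bool = False

    def handle(self, now_h: float) -> str:
        """Return which model would serve a request arriving at now_h."""
        # Release the default deployment after a long idle period.
        if (self.default_ready and self.last_request_h is not None
                and now_h - self.last_request_h >= IDLE_TIMEOUT_H):
            self.default_ready = False
            self.warmup_started_h = None
        self.last_request_h = now_h

        if self.default_ready:
            return "default"
        if self.warmup_started_h is None:
            # First request seen while released: start the default deployment.
            self.warmup_started_h = now_h
        elif now_h - self.warmup_started_h >= WARMUP_H:
            # Warm-up has finished, so the default model serves from here on.
            self.default_ready = True
            self.warmup_started_h = None
            return "default"
        return "fallback"


region = RegionDeployment()
for t in (0.0, 0.1, 0.5, 1.0, 8.0, 8.3):
    print(f"t={t}h -> {region.handle(t)}")
```

In this sketch, requests that arrive while the default model is still being deployed are served by the fallback model, which is one way scores for the same input could differ over time.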
Unfortunately, there is no public-facing documentation on this.
Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help.
Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you, as this can be beneficial to other community members.