Scalability and Performance
Important
Starting on the 20th of September, 2023 you won’t be able to create new Personalizer resources. The Personalizer service is being retired on the 1st of October, 2026.
High-performance and high-traffic websites and applications have two main factors to consider with Personalizer for scalability and performance:
- Keeping low latency when making Rank API calls
- Making sure training throughput keeps up with event input
Personalization can return a rank rapidly, with most of the call duration dedicated to communication through the REST API. Azure will autoscale the ability to respond to requests rapidly.
Low-latency scenarios
Some applications require low latencies when returning a rank. Low latencies are necessary:
- To keep the user from waiting a noticeable amount of time before displaying ranked content.
- To help a server that is experiencing extreme traffic avoid tying up scarce compute time and network connections.
Scalability and training throughput
Personalizer works by updating a model that is retrained based on messages sent asynchronously by Personalizer after Rank and Reward APIs. These messages are sent using an Azure EventHub for the application.
It's unlikely most applications will reach the maximum joining and training throughput of Personalizer. While reaching this maximum won't slow down the application, it would imply event hub queues are getting filled internally faster than they can be cleaned up.
How to estimate your throughput requirements
- Estimate the average number of bytes per ranking event adding the lengths of the context and action JSON documents.
- Divide 20MB/sec by this estimated average bytes.
For example, if your average payload has 500 features and each is an estimated 20 characters, then each event is approximately 10 kb. With these estimates, 20,000,000 / 10,000 = 2,000 events/sec, which is about 173 million events/day.
If you're reaching these limits, please contact our support team for architecture advice.