Important
Starting on the 20th of September, 2023, you won't be able to create new Personalizer resources. The Personalizer service is being retired on the 1st of October, 2026.
Personalizer uses terminology from reinforcement learning. These terms are used in the Azure portal and the APIs.
Learning Loop: You create a Personalizer resource, called a learning loop, for every part of your application that can benefit from personalization. If you have more than one experience to personalize, create a loop for each.
Model: A Personalizer model captures everything learned about user behavior. Its training data comes from the arguments you send in Rank and Reward calls, and its training behavior is determined by the Learning Policy.
Online mode: The default learning behavior for Personalizer, in which your learning loop uses machine learning to build the model that predicts the top action for your content.
Apprentice mode: A learning behavior that helps warm-start a Personalizer model, training it without impacting the application's outcomes and actions.
Personalizer is configured from the Azure portal.
Rewards: Configure the default values for reward wait time, default reward, and reward aggregation policy.
Exploration: Configure the percentage of Rank calls to use for exploration.
Model update frequency: How often the model is retrained.
Data retention: How many days' worth of data to store. This can impact offline evaluations, which are used to improve your learning loop.
Rank: Given the actions with features and the context features, use explore or exploit to return the top action (content item).
Actions: Actions are the content items, such as products or promotions, to choose from. Personalizer chooses the top action (returned reward action ID) to show to your users via the Rank API.
Context: To provide a more accurate rank, provide information about your context, such as the user's device or the time of day. Your specific application may have different context information.
Features: A unit of information about a content item or a user context. Make sure to only use features that are aggregated. Do not use specific times, user IDs, or other non-aggregated data as features.
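The Rank request combines the actions (each with its features) and the context features described above. The sketch below builds a minimal request body; the feature names (device, timeOfDay, category) and the action IDs are illustrative, not required keys.

```python
import json

# Illustrative sketch of a Rank request body. Feature names and action IDs
# are placeholders; note the features are aggregated values (a time-of-day
# bucket, not an exact timestamp, and no user IDs).
rank_request = {
    "contextFeatures": [
        {"device": "mobile"},
        {"timeOfDay": "morning"},   # aggregated bucket, not a specific time
    ],
    "actions": [
        {"id": "article-1", "features": [{"category": "sports"}]},
        {"id": "article-2", "features": [{"category": "politics"}]},
    ],
    "excludedActions": [],
    "eventId": "75269AD0-BFEE-4598-8196-C57383D38E10",
}

print(json.dumps(rank_request, indent=2))
```

The response's reward action ID identifies the top action to show to the user; the same eventId is later used to send the reward.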
Exploration: The Personalizer service is exploring when, instead of returning the best action, it chooses a different action for the user. By exploring, the Personalizer service avoids drift and stagnation, and can adapt to ongoing user behavior.
Learned Best Action: The Personalizer service uses the current model to decide the best action based on past data.
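The explore/exploit trade-off can be sketched as a simple epsilon-greedy choice. This is illustrative only; Personalizer's actual algorithm is a contextual-bandit learner, not plain epsilon-greedy, but the sketch shows how the configured exploration percentage shifts behavior between the learned best action and other actions.

```python
import random

def choose_action(ranked_actions, exploration_percentage=0.2, rng=random):
    """Illustrative epsilon-greedy sketch: with probability equal to the
    configured exploration percentage, pick a non-best action (explore);
    otherwise return the learned best action (exploit)."""
    if len(ranked_actions) > 1 and rng.random() < exploration_percentage:
        return rng.choice(ranked_actions[1:])  # explore a different action
    return ranked_actions[0]                   # exploit the learned best action

# With exploration set to 0, the learned best action is always returned.
print(choose_action(["best", "alt1", "alt2"], exploration_percentage=0.0))
```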
Experiment Duration: The amount of time the Personalizer service waits for a reward, starting from the moment the Rank call happened for that event.
Inactive Events: An inactive event is one where you called Rank, but you're not sure the user will ever see the result, due to client application decisions. Inactive events allow you to create and store personalization results, then decide to discard them later without impacting the machine learning model.
Reward: A measure of how the user responded to the Rank API's returned reward action ID, as a score between 0 and 1. The value is set by your business logic, based on how the choice helped achieve your business goals of personalization. The learning loop doesn't store this reward as individual user history.
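A reward is sent back for the same event ID the Rank call returned. A minimal sketch of building that call, assuming a placeholder endpoint; the reward value comes from your business logic and must fall between 0 and 1:

```python
def build_reward_request(endpoint, event_id, reward_value):
    """Sketch of the Reward call that closes the loop for a Rank event.
    Returns the URL and JSON body; sending it (with your resource key in
    the request headers) is left out for brevity."""
    if not 0.0 <= reward_value <= 1.0:
        raise ValueError("Reward must be between 0 and 1")
    url = f"{endpoint}/personalizer/v1.0/events/{event_id}/reward"
    body = {"value": reward_value}
    return url, body

url, body = build_reward_request(
    "https://<your-resource>.cognitiveservices.azure.com",  # placeholder
    "75269AD0-BFEE-4598-8196-C57383D38E10",
    0.8,  # e.g. the user clicked and engaged with the chosen item
)
print(url)
```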
Evaluation: An offline evaluation determines the best learning policy for your loop based on your application's data.
Learning Policy: How Personalizer trains a model on every event is determined by parameters that affect how the machine learning algorithm works. A new learning loop starts with a default learning policy, which can yield moderate performance. When you run evaluations, Personalizer creates new learning policies optimized specifically for the use cases of your loop, and a loop performs significantly better with a policy optimized for it during evaluation. In the Azure portal, the learning policy is named learning settings on the Model and learning settings page for the Personalizer resource.
Apprentice mode provides evaluation metrics that compare the rewards earned by Personalizer's model against the rewards earned by your application's default (baseline) behavior.