Important
Starting on the 20th of September, 2023 you won’t be able to create new Personalizer resources. The Personalizer service is being retired on the 1st of October, 2026.
Offline evaluation is a method that allows you to test and assess the effectiveness of the Personalizer Service without changing your code or affecting user experience. Offline evaluation uses past data, sent from your application to the Rank and Reward APIs, to compare how different ranks have performed.
Offline evaluation is performed on a date range. The range can end as late as the current time, and its beginning can't reach further back than the number of days specified in your data retention setting.
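To make the date-range constraint concrete, here is a minimal sketch that starts an offline evaluation over the retention window using the Personalizer Evaluations REST endpoint. The resource name, key, retention value, and the exact request shape (API version and field names) are assumptions to verify against the API reference for your resource.

```python
from datetime import datetime, timedelta, timezone
import requests

# Assumed values -- replace with your own resource details.
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
API_KEY = "<your-personalizer-key>"
DATA_RETENTION_DAYS = 90  # must match the retention configured on the resource

# The evaluation range can end as late as "now", but its start can't reach
# further back than the data-retention window.
end_time = datetime.now(timezone.utc)
start_time = end_time - timedelta(days=DATA_RETENTION_DAYS)

# Request body for creating an offline evaluation (shape assumed from the
# public Evaluations API; verify against your API version).
body = {
    "name": "retention-window-evaluation",
    "startTime": start_time.isoformat(),
    "endTime": end_time.isoformat(),
    "enableOfflineExperimentation": True,  # also search for an optimized policy
    "policies": [],  # optionally add custom learning policies to compare
}

response = requests.post(
    f"{ENDPOINT}/personalizer/v1.0/evaluations",
    headers={"Ocp-Apim-Subscription-Key": API_KEY},
    json=body,
    timeout=30,
)
response.raise_for_status()
print("Evaluation created:", response.json())
```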
Offline evaluation can help you answer questions such as how effective Personalizer's ranks are compared to your application's default behavior, and how different learning policies would have performed on your data.
In addition, Offline Evaluation can be used to discover more optimized learning policies that Personalizer can use to improve results in the future.
Offline evaluations do not provide guidance as to the percentage of events to use for exploration.
For the offline evaluation to be representative, make sure it covers enough events over a long enough time period; otherwise, the reward estimates won't be reliable (see the note on confidence bounds below).
Personalizer can use the offline evaluation process to discover a more optimal learning policy automatically.
After the offline evaluation finishes, you can compare how Personalizer would perform with that new policy against the current online policy. To apply the new learning policy and make it effective immediately, download it and upload it in the Models and Policy panel. You can also download it for future analysis or use.
Current policies included in the evaluation:
| Learning settings | Purpose |
|---|---|
| Online Policy | The current learning policy used in Personalizer |
| Baseline | The application's default (as determined by the first action sent in Rank calls) |
| Random Policy | An imaginary Rank behavior that always returns a random choice of actions from the supplied ones |
| Custom Policies | Additional learning policies uploaded when starting the evaluation |
| Optimized Policy | If the evaluation was started with the option to discover an optimized policy, it is also compared, and you can download it or make it the online learning policy, replacing the current one |
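As a rough sketch of how you might read the comparison once the evaluation completes, the snippet below fetches the evaluation and prints an aggregate reward per policy. The response field names (`policyResults`, `totalSummary`, `averageReward`) are assumptions about the payload shape; check the Evaluations API reference for your API version.

```python
import requests

# Assumed resource details (same as in the creation sketch above).
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
API_KEY = "<your-personalizer-key>"
EVALUATION_ID = "<evaluation-id-returned-at-creation>"

response = requests.get(
    f"{ENDPOINT}/personalizer/v1.0/evaluations/{EVALUATION_ID}",
    headers={"Ocp-Apim-Subscription-Key": API_KEY},
    timeout=30,
)
response.raise_for_status()
evaluation = response.json()

# Field names below are assumptions about the response payload; adjust them
# to match the schema documented for your API version.
for policy in evaluation.get("policyResults", []):
    summary = policy.get("totalSummary", {})
    print(policy.get("name"), "average reward:", summary.get("averageReward"))
```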
When you run an offline evaluation, it is very important to analyze confidence bounds of the results. If they are wide, it means your application hasn’t received enough data for the reward estimates to be precise or significant. As the system accumulates more data, and you run offline evaluations over longer periods, the confidence intervals become narrower.
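As a self-contained illustration of that point (this is not part of the Personalizer API), the sketch below computes a normal-approximation confidence interval for an average reward. Its width shrinks roughly with the square root of the number of events, which is why evaluations over longer periods give narrower bounds.

```python
import math

def reward_confidence_interval(rewards, z=1.96):
    """95% normal-approximation confidence interval for the mean reward."""
    n = len(rewards)
    mean = sum(rewards) / n
    variance = sum((r - mean) ** 2 for r in rewards) / max(n - 1, 1)
    half_width = z * math.sqrt(variance / n)
    return mean - half_width, mean + half_width

# The same reward distribution observed over more events yields a much
# narrower interval around the same mean.
few = [0.0, 1.0, 0.5, 0.8] * 25      # 100 events
many = [0.0, 1.0, 0.5, 0.8] * 2500   # 10,000 events
print(reward_confidence_interval(few))
print(reward_confidence_interval(many))
```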
Offline Evaluations are done using a method called Counterfactual Evaluation.
Personalizer is built on the assumption that users' behavior (and thus rewards) is impossible to predict retrospectively: Personalizer can't know what would have happened if the user had been shown something different from what they did see, so it can only learn from rewards that were actually measured.
This is the conceptual process used for evaluations:
```
For a given learning policy (the online learning policy, an uploaded learning policy, or an optimized candidate policy):
{
    Initialize a virtual instance of Personalizer with that policy and a blank model.

    For every chronological event in the logs:
    {
        Perform a Rank call.
        Compare the chosen action against the logged user behavior.
        If they match, train the model on the reward observed in the logs.
        If they don't match, what the user would have done is unknown, so the event is discarded and not used for training or measurement.
    }

    Add up the rewards and statistics that were predicted, do some aggregation to aid visualizations, and save the results.
}
```
The offline evaluation only uses observed user behavior. This process discards large volumes of data, especially if your application does Rank calls with large numbers of actions.
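The following is a simplified, self-contained sketch of that loop, not Personalizer's actual implementation: it replays logged events against a toy candidate policy, measures and trains only on events where the policy's choice matches the logged action, and reports how much of the log was usable.

```python
from dataclasses import dataclass, field

@dataclass
class LoggedEvent:
    """One logged Rank call: context features, the candidate actions, the
    action that was actually shown, and the reward that was observed."""
    context: dict
    actions: list
    chosen_action: str
    reward: float

@dataclass
class SimplePolicy:
    """Toy stand-in for a learning policy: keeps a running average reward
    per action and picks the action with the best estimate."""
    estimates: dict = field(default_factory=dict)
    counts: dict = field(default_factory=dict)

    def rank(self, context, actions):
        # This toy policy ignores the context; a real policy would use it.
        return max(actions, key=lambda a: self.estimates.get(a, 0.0))

    def train(self, action, reward):
        n = self.counts.get(action, 0) + 1
        avg = self.estimates.get(action, 0.0)
        self.counts[action] = n
        self.estimates[action] = avg + (reward - avg) / n

def counterfactual_evaluation(policy, logged_events):
    """Replay events in chronological order; only events where the policy's
    choice matches what the user actually saw are measured and trained on."""
    matched, total_reward = 0, 0.0
    for event in logged_events:
        choice = policy.rank(event.context, event.actions)
        if choice == event.chosen_action:
            policy.train(choice, event.reward)
            matched += 1
            total_reward += event.reward
        # Non-matching events are discarded: there is no way to know what
        # the user would have done with an action they never saw.
    return {
        "events_used": matched,
        "events_total": len(logged_events),
        "average_reward": total_reward / matched if matched else 0.0,
    }

# Toy usage: two logged events, only the first matches the policy's choice.
events = [
    LoggedEvent({"time": "evening"}, ["news", "sports"], "news", 1.0),
    LoggedEvent({"time": "morning"}, ["news", "sports"], "sports", 0.0),
]
print(counterfactual_evaluation(SimplePolicy(), events))
```

Personalizer's real counterfactual estimator is more sophisticated (for example, it accounts for the exploration probabilities recorded at Rank time), but the match-and-discard structure above is what explains why so much data can be discarded.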
Offline evaluations can also show how much specific features of the actions or context contribute to higher rewards. This information is computed from the evaluation over the given time period and data, and may vary over time.
We recommend looking at the feature evaluations to understand which features are contributing to rewards; a rough illustration of that kind of analysis follows.
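Below is a minimal, hedged sketch of that kind of analysis (it is not how the Personalizer service computes feature importance): it aggregates the average observed reward by the presence of each context feature/value pair in the logged events.

```python
from collections import defaultdict

def average_reward_by_feature(logged_events):
    """Average observed reward for events where a given feature/value pair
    was present, computed from the same Rank/Reward logs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event in logged_events:
        for feature, value in event["context"].items():
            key = f"{feature}={value}"
            totals[key] += event["reward"]
            counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}

# Toy events; real logs come from your application's Rank and Reward calls.
events = [
    {"context": {"device": "mobile", "daypart": "evening"}, "reward": 1.0},
    {"context": {"device": "desktop", "daypart": "evening"}, "reward": 0.0},
    {"context": {"device": "mobile", "daypart": "morning"}, "reward": 0.5},
]
print(average_reward_by_feature(events))
```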
- Configure Personalizer
- Run Offline Evaluations
- Understand How Personalizer Works