Formerly known as Azure AI Services or Azure Cognitive Services is a unified collection of prebuilt AI capabilities within the Microsoft Foundry platform
Hi Oliver Su (Artech Consulting LLC)
Example of how to define the input and reference for custom grader in the multi grader scenario please
{
"model": "gpt-5-mini-2025-08-07",
"method": {
"type": "reinforcement",
"reinforcement": {
"hyperparameters": {
"n_epochs": 3,
"batch_size": 8,
"eval_interval": 1,
"eval_samples": 5
},
"grader": {
"name": "summary_quality_multigrader",
"type": "multi",
"graders": {
"text_sim": {
"name": "text_similarity_grader",
"type": "text_similarity",
// Model Output → Reference mapping
"input": "{{sample.output_json.response}}",
"reference": "{{item.reference.answer}}",
"evaluation_metric": "fuzzy_match"
},
"custom_quality": {
"name": "custom_summary_quality",
"type": "python",
// Model Output → Reference mapping
"input": "{{sample.output_json.response}}",
"reference": "{{item.reference.answer}}",
"source": "def grade(sample_text: str, reference_text: str) -> float:\n"
" # Reward mention of 'AI Foundry' and brevity (< 20 words)\n"
" if not sample_text:\n"
" return 0.0\n"
" score = 0.0\n"
" if 'AI Foundry' in sample_text:\n"
" score += 0.5\n"
" if len(sample_text.split()) < 20:\n"
" score += 0.5\n"
" return min(score, 1.0)"
}
},
// Weighted aggregation across graders
"calculate_output": "0.6 * text_sim + 0.4 * custom_quality",
"invalid_grade": 0.0
}
}
}
}
Dataset: JSONL with clear split of input (what the model sees) and reference (what graders use).
Bindings: In each grader, set "input" to model output path and "reference" to ground truth path.
Custom grader: Python function returning a score in [0,1] (optionally a dict with score/reason if supported).
Aggregation: Use a weighted expression like "0.6 * text_sim + 0.4 * custom_quality".
Validation: Provide an invalid_grade fallback for edge cases.
I Hope this helps.
Thank you!