Will we be able to use Direct Preference Optimization (DPO) on supervised fine tuned models?

Kayla Farivar 75 Reputation points
2024-12-19T02:27:23.95+00:00

I saw on the open ai api that they recommend using supervised fine tuning before using DPO. I noticed on the azure open ai portal that the option to use DPO doesn't show when a fine tuned model is selected. Will this be updated any time soon or does this work through the api?

Azure OpenAI Service
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
3,546 questions
0 comments No comments
{count} votes

Accepted answer
  1. AshokPeddakotla-MSFT 35,766 Reputation points
    2024-12-19T06:03:51.5833333+00:00

    Kayla Farivar Greetings & Welcome to Microsoft Q&A forum!

    I saw on the open ai api that they recommend using supervised fine tuning before using DPO. I noticed on the azure open ai portal that the option to use DPO doesn't show when a fine tuned model is selected. Will this be updated any time soon or does this work through the api?

    In case if you haven't checked earlier, Azure OpenAI Service is introducing several new fine-tuning features.

    Also, see Announcing Public Preview of Direct Preference Optimization Capabilities with Azure OpenAI Service and What's new in Azure OpenAI Service for more details.

    DPO is supported for the GPT-4o model. GPT-4o-mini support for DPO will follow soon. Users can preference fine-tune the base model of GPT-4o or supervised fine-tuned models of GPT-4o through this functionality.

    I just tried using gpt-4o-2024-08-06 model with public preview of DPO in Azure OpenAI Service and able to see an option. Please see below for more clarity.

    User's image

    Do let me know if that helps or have any other queries.

    If the response helped, please do click Accept Answer and Yes for was this answer helpful.

    Doing so would help other community members with similar issue identify the solution. I highly appreciate your contribution to the community.

    1 person found this answer helpful.
    0 comments No comments

0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.