Hosting Costs? (Cognitive Services - gpt-4o-0806-FT-Hstng-regnl Unit )

Jake Truong-Jones 0 Reputation points
2025-01-21T22:35:31.6833333+00:00

We are incurring what appears to be Hosting Costs for a fine-tuning model. I am looking for some clarity on how / and when this cost is incurred. It appears to be coming from the $1.7/hr cost to host the 4o model. Screenshot 2025-01-21 at 2.26.43 PM

When is it considered "hosted"? We do not have any VM's deployed, however the API is active. Screenshot 2025-01-21 at 2.32.00 PM

Is there anyway to pause this while we are not using it so we do not incur the $1.7 per hour cost?

Azure OpenAI Service
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
4,124 questions
0 comments No comments
{count} votes

2 answers

Sort by: Most helpful
  1. SriLakshmi C 6,250 Reputation points Microsoft External Staff Moderator
    2025-01-22T21:36:41.07+00:00

    Hello Jake Truong-Jones,

    Greetings and Welcome to Microsoft Q&A! Thanks for posting the question.

    I understand that you are dealing with hosting costs for Azure's Cognitive Services, particularly for the fine-tuning of models like GPT-4o, it's essential to understand how these costs are incurred and managed,

    When is it considered "hosted"? We do not have any VM's deployed, however the API is active.

    Hosting costs are incurred whenever the fine-tuned model is deployed and available for inference, regardless of whether virtual machines (VMs) are deployed. Charges apply as long as the API endpoint is active, as resources are maintained to ensure low-latency responses and availability for API requests.

    Is there any way to pause this while we are not using it so we do not incur the $1.7 per hour cost?Unfortunately, Azure OpenAI does not provide a direct "pause" feature for fine-tuned model deployments. Hosting charges are incurred as long as the model deployment is active, even when not in use.

    To manage costs, Delete the Deployment when the model is not required and Re-deploy When Needed to make it available again during active usage periods.

    Implementing Budgeting and Alerts in Azure can help you monitor spending by setting thresholds and receiving notifications about resource usage.

    Regularly Review Usage patterns to identify inefficiencies and optimize deployment strategies, ensuring cost-effective utilization of Azure Cognitive Services.

    Kindly refer this Plan to manage costs for Azure OpenAI Service and Plan and manage costs for Azure AI Foundry,

    I Hope this helps. Do let me know if you have any further queries.


    If this answers your query, do click Accept Answer and Yes for was this answer helpful.

    Thank you!


  2. Lach, Alexander 0 Reputation points
    2025-02-08T15:43:02.4566667+00:00

    This is a major downside compared to using OpenAI directly. With OpenAI, there's no flat cost for hosting a fine-tuned model.

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.