Edit

Claude models in Microsoft Foundry (preview)

Anthropic's Claude models bring advanced conversational AI capabilities to Microsoft Foundry, providing state-of-the-art language understanding and generation for intelligent applications. Claude models excel at complex reasoning, code generation, and multimodal tasks including image analysis. This article describes the available Claude models, how they're hosted and billed, their API surface, capabilities, quotas, and best practices.

To deploy and call a Claude model, see Deploy and use Claude models in Microsoft Foundry.

Available Claude models

Claude models in Foundry include:

Model family Models
Claude Mythos claude-mythos-51 (gated research preview), claude-mythos-preview1 (gated research preview)
Claude Fable claude-fable-5 (preview)
Claude Opus claude-opus-4-82 (preview), claude-opus-4-7 (preview), claude-opus-4-6 (preview), claude-opus-4-5 (preview), claude-opus-4-1 (preview)
Claude Sonnet claude-sonnet-4-6 (preview), claude-sonnet-4-5 (preview)
Claude Haiku claude-haiku-4-5 (preview)

1 Claude Mythos 5 and Claude Mythos Preview are only available as gated research preview. Access to the models is granted solely at Anthropic's discretion and prioritized for defensive cybersecurity use cases. See the Claude Mythos Preview system card and Claude Mythos 5 system card for responsible use guidance.

2 Follow the Migration guide to migrate Messages API code from Claude Opus 4.7 to Claude Opus 4.8.

For more details about the model capabilities, see capabilities of Claude models.

API surface

Use the Anthropic SDKs and the following Claude APIs:

  • Messages API: Send a structured list of input messages with text or image content. The model generates the next message in the conversation.
  • Token Count API: Count the number of tokens in a message.
  • Files API: Upload and manage files for use with the Claude API without re-uploading content with each request.
  • Skills API: Create custom skills for Claude AI.

You can call the Messages API from the anthropic Python package, the @anthropic-ai/foundry-sdk JavaScript package, or directly through REST. The deployment endpoint follows the shape https://<resource-name>.services.ai.azure.com/anthropic/v1/messages, and REST and JavaScript clients use the anthropic-version: 2023-06-01 header.

Comparison of Claude models

Foundry supports Claude models through global standard deployment. Use the following table to compare models, then see Capabilities for details on the features referenced in the table.

Warning

1M context beta on Claude Sonnet 4.5 were retired on April 30, 2026.

Starting May 1, 2026:

  • Requests greater than 200K tokens that include the context-1m-2025-08-07 beta header on Sonnet 4.5 return an error.
  • Requests 200K tokens or fewer remain unaffected, even with the header present.

To migrate, remove the context-1m-2025-08-07 beta header from your requests. For workloads that require 1M context, migrate to Claude Sonnet 4.6 (where 1M context is generally available) or to Claude Opus 4.6 or Claude Opus 4.7 for higher-intelligence workloads.

Model Context window / Max output Key capabilities Best for
claude-mythos-51 (gated research preview) 1M / 128K
  • Adaptive thinking
  • Image and text input
  • Microsoft Entra ID authentication only
  • Biology and life sciences
  • Cybersecurity (defensive use cases prioritized): vulnerability discovery, attack-surface auditing, red teaming, threat intelligence
  • Autonomous coding
  • Long-running agents
claude-fable-5 (preview) 1M / 128K
  • Adaptive thinking
  • Reasoning over entire codebases and multi-day project context
  • Longer independent work than any prior Claude model
  • Self verification
  • Sub-agent orchestration
  • Refusal stop_reason on dual-use safeguard policies2
  • Cybersecurity
  • Autonomous coding
  • Long-running agents
  • Coding and agents, with deeper reasoning for enterprise workflows
claude-mythos-preview1 (gated research preview) 1M / 128K
  • Adaptive thinking
  • Image and text input
  • Microsoft Entra ID authentication only
  • Cybersecurity (defensive use cases prioritized)
  • Autonomous coding
  • Long-running agents
claude-opus-4-83 (preview) 1M / 128K
  • Adaptive thinking with xhigh effort level
  • Reasoning over entire codebases and multi-day project context
  • High-resolution image input (up to 2576px / 3.75MP)
  • Coding
  • Long-running agents
  • Financial analysis
  • Cybersecurity
  • Computer use
claude-opus-4-7 (preview) 1M / 128K
  • Adaptive thinking
  • Reasoning over entire codebases
  • High-resolution image input (up to 2576px / 3.75MP)
  • Coding
  • Enterprise workflows
  • Long-running agents
  • Multimodal reasoning
  • Financial analysis
  • Cybersecurity
claude-opus-4-6 (preview) 1M / 128K
  • Adaptive thinking
  • Image and text input
  • Computer use
  • Advanced tool use (search, programmatic calling, examples)
  • Coding
  • Enterprise agents
claude-opus-4-5 (preview) 200K / 64K
  • Extended thinking
  • Image and text input
  • Computer use
  • Advanced tool use (search, programmatic calling, examples)
  • Coding
  • Agents
  • Computer use
  • Enterprise workflows
claude-opus-4-1 (preview) 200K / 32K
  • Extended thinking
  • Image and text input
  • Coding
  • Long-running tasks
claude-sonnet-4-6 (preview) 1M / 128K
  • Adaptive thinking
  • Image and text input
  • Computer use
  • Advanced tool use (search, programmatic calling, examples)
  • Coding
  • Agents
  • Enterprise workflows
claude-sonnet-4-5 (preview) 200K / 64K
  • Extended thinking
  • Image and text input
  • Computer use
  • Agents and complex, long-horizon tasks
  • High-volume workloads
claude-haiku-4-5 (preview) 200K / 64K
  • Extended thinking
  • Image and text input
  • Coding
  • Agents

1 Claude Mythos 5 and Claude Mythos Preview are only available as gated research preview. Access to the models is granted solely at Anthropic's discretion and prioritized for defensive cybersecurity use cases. See the Claude Mythos Preview system card and Claude Mythos 5 system card for responsible use guidance.

2 Claude Fable 5 applies additional input/output classifiers that may refuse requests whose content triggers dual-use safeguard policies. When a refusal occurs, the request returns a successful (200) response with a refusal indicator stop_reason: "refusal" instead of model-generated content. You're not billed for input tokens that are refused.

3 Follow the Migration guide to migrate Messages API code from Claude Opus 4.7 to Claude Opus 4.8.

Capabilities

Claude models in Foundry expose two kinds of capabilities: core capabilities for processing, analyzing, and generating content, and tools that let Claude interact with external systems.

Core capabilities

Core capabilities enhance Claude's fundamental abilities for processing, analyzing, and generating content. Foundry supports the following core capabilities for Claude:

  • Large context window: An extended context window that processes larger documents and longer conversations.

  • Image and text input: Strong vision for analyzing charts, graphs, technical diagrams, reports, and other visual assets.

  • Code generation: Advanced code generation, analysis, and debugging.

  • Agent skills: Extend Claude's capabilities with skills.

  • Citations: Ground Claude's responses in source documents.

  • PDF support: Process and analyze text and visual content from PDF documents.

  • Context editing: Automatically manage conversation context with configurable strategies.

  • Extended thinking: Enhanced reasoning for complex tasks, available with all Claude models. The following table shows which thinking parameter types each model supports. The adaptive type allows the model to decide whether to think, based on query complexity and effort level.

    Model adaptive enabled disabled
    claude-mythos-5 Yes No No
    claude-fable-5 Yes No No
    claude-mythos-preview Yes Yes No
    claude-opus-4-8 Yes No Yes
    claude-opus-4-7 Yes No Yes
    claude-opus-4-6 Yes Yes Yes
    claude-sonnet-4-6 Yes Yes Yes
  • Effort: Ability to control the quality/cost tradeoff for responses. Use this parameter with or without enabling thinking. The following table shows which effort levels each model supports. The xhigh level produces the same result as max.

    Model low medium high max xhigh
    claude-mythos-5 Yes Yes Yes No Yes
    claude-fable-5 Yes Yes Yes No Yes
    claude-opus-4-8 Yes Yes Yes Yes Yes
    claude-opus-4-7 Yes Yes Yes Yes Yes
    claude-opus-4-6 Yes Yes Yes Yes No
    claude-sonnet-4-6 Yes Yes Yes Yes No

Tools

Tools enable Claude to interact with external systems, execute code, and perform automated tasks. Foundry supports the following tools for Claude:

  • MCP connector: Connect to remote MCP servers directly from the Messages API without a separate MCP client.
  • Memory: Store and retrieve information across conversations. Build knowledge bases over time, maintain project context, and learn from past interactions.
  • Web fetch: Retrieve full content from specified web pages and PDF documents for in-depth analysis.

For a full list of supported capabilities and tools, see Claude's features overview.

Agent support

How Claude models are hosted and billed

Claude is offered through Foundry Models from partners and community. Models from partners and community that aren't sold by Azure are Non-Microsoft Products under the Product Terms.

Deploying a Claude model requires an Azure Marketplace subscription. Ensure that you have the permissions required to subscribe to model offerings before you deploy.

Quotas, rate limits, and regions

Claude models are available for Global Standard deployment in the following regions:

  • East US2
  • Sweden Central

Rate limits for Claude models in Foundry are measured in Tokens Per Minute (TPM) and Requests Per Minute (RPM). The values are different depending on your subscription type, as listed in the following table. To increase your quota beyond the default limits, submit a request through the quota increase request form.

Pay-as-you-go

Model Deployment type RPM TPM
claude-fable-5 Global Standard 0 0
claude-opus-4-8 Global Standard 40 40,000
claude-opus-4-7 Global Standard 40 40,000
claude-opus-4-6 Global Standard 40 40,000
claude-opus-4-5 Global Standard 40 40,000
claude-opus-4-1 Global Standard 40 40,000
claude-sonnet-4-6 Global Standard 80 80,000
claude-sonnet-4-5 Global Standard 80 80,000
claude-haiku-4-5 Global Standard 80 80,000

Responsible AI considerations

When using Claude models in Foundry, consider these responsible AI practices:

Best practices

Follow these best practices when working with Claude models in Foundry:

Prompt engineering

  • Clear instructions: Provide specific and detailed prompts.
  • Context management: Use the available context window effectively.
  • Role definitions: Use system messages to define the assistant's role and behavior.
  • Structured prompts: Use consistent formatting for better results.

Cost optimization

To optimize your usage and avoid rate limiting:

  • Implement retry logic: Handle 429 responses with exponential backoff.
  • Batch requests: Combine multiple prompts when possible.
  • Monitor token usage: Track your token consumption and request patterns.
  • Use appropriate models: Use the most cost-effective model for your use case. See Available Claude models.