
# Azure AI Foundry / OpenAI

## Azure AI Foundry & Azure OpenAI — PTU commitment brief

> **Data sourced**: March 2026. Verify current figures at the [Azure Pricing Calculator](https://azure.microsoft.com/en-us/pricing/calculator/).

### Scope note

Azure AI Foundry and Azure OpenAI PTU commitments are unified under a single reservation product: **Microsoft Foundry Provisioned Throughput Reservations**. There is no separate Azure OpenAI reservation; **a single PTU reservation covers any mix of Foundry Models sold directly by Microsoft, including GPT-4o, o1, DeepSeek, Llama, and others.** Reservations are model-independent: you commit to a PTU count and deployment type, not to a specific model.

### Coverage summary

PTU reservations apply to provisioned throughput deployments only — Global Provisioned, Data Zone Provisioned, and Regional Provisioned. The hourly billing model (no commitment) is available for short-term or testing scenarios. Term discounts of up to \~64% (1-month) and \~70% (1-year) vs. the hourly rate are available via Azure Reservations. Reservations and deployments are decoupled — purchasing a reservation does not create a deployment and does not guarantee capacity.

**What is covered**: PTU-billed deployments across Global, Data Zone, and Regional Provisioned deployment types. All Foundry Models sold directly by Microsoft. Partial-hour deployments are pro-rated by the minute.
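As a rough illustration of per-minute pro-rating, the sketch below computes the hourly-billed cost of a deployment that ran for part of an hour. The function name and the rate of 1.00 per PTU-hour are hypothetical placeholders, not Azure prices; substitute the figure from the Azure Pricing Calculator.

```python
def prorated_hourly_cost(ptus: int, minutes_deployed: int,
                         hourly_rate_per_ptu: float) -> float:
    """Hourly-billed cost of a provisioned deployment, pro-rated by the minute.

    hourly_rate_per_ptu is a hypothetical placeholder; look up the real rate
    for your model, region, and deployment type.
    """
    if ptus < 0 or minutes_deployed < 0 or hourly_rate_per_ptu < 0:
        raise ValueError("inputs must be non-negative")
    return ptus * hourly_rate_per_ptu * (minutes_deployed / 60)


# 100 PTUs deployed for 45 minutes at an assumed 1.00 per PTU-hour.
cost = prorated_hourly_cost(ptus=100, minutes_deployed=45,
                            hourly_rate_per_ptu=1.00)
print(cost)  # 75.0
```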

**What is not covered**: Standard token-based pay-as-you-go deployments, fine-tuning hosting, Batch API usage, storage, networking. The legacy Commitment model (PTU-C / Provisioned Classic) is a separate mechanism and is not covered by the new Reservation model. Capacity availability is not guaranteed by purchasing a reservation.

### Azure Hybrid Benefit

> Not applicable. PTUs are a model-independent compute unit with no SQL Server or Windows Server licensing component.

### Deployment type / SKU coverage

| Deployment type           | Reservation | Savings plan | 1-month discount | 1-year discount | Notes                                                                                        |
| ------------------------- | ----------- | ------------ | ---------------- | --------------- | -------------------------------------------------------------------------------------------- |
| Global Provisioned        | ✅ Yes       | ❌ No         | \~64%            | \~70%           | Routes globally. Not interchangeable with Data Zone or Regional reservations.                |
| Data Zone Provisioned     | ✅ Yes       | ❌ No         | \~64%            | \~70%           | Data stays within a geographic zone (e.g. EU). Reservation must match Data Zone type.        |
| Regional Provisioned      | ✅ Yes       | ❌ No         | \~64%            | \~70%           | Strictest data residency. Reservation scoped to region + deployment type.                    |
| Standard (token PAYG)     | ❌ No        | ❌ No         | —                | —               | No commitment instrument. Batch API offers 50% discount for latency-tolerant workloads.      |
| Legacy Commitment (PTU-C) | ⚠️ Legacy   | ❌ No         | varies           | varies          | Not available to new customers or models post-Aug 2024. Migrate to Hourly/Reservation model. |

### 1-month vs. 1-year reservation

**1-month reservation**: Up to \~64% off the hourly rate. Lowest commitment risk — ideal for workloads that are ramping, models that may change, or where regional capacity availability is uncertain. Renews monthly.

**1-year reservation** *(max savings)*: Up to \~70% off the hourly rate. Best for stable production workloads with a predictable PTU baseline. Because reservations and deployments are decoupled, you can change models freely during the term without re-purchasing. No Compute Savings Plan equivalent exists for PTUs — the Reservation is the only commitment instrument.
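The trade-off between hourly billing and a reservation can be sketched as a break-even calculation. The example assumes a reservation is charged for every hour of the term at `(1 - discount)` times the hourly rate, and uses the approximate "up to" discounts quoted above. The function names, the 730-hour month, and the rate of 1.00 per PTU-hour are illustrative assumptions only.

```python
def break_even_utilization(discount: float) -> float:
    """Fraction of the term a deployment must run before a reservation
    becomes cheaper than hourly billing, assuming the reservation is
    charged for the whole term at (1 - discount) x the hourly rate."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return 1.0 - discount


def term_costs(ptus: int, hours_in_term: float, hours_deployed: float,
               hourly_rate_per_ptu: float, discount: float):
    """Return (hourly_cost, reserved_cost) for one term."""
    hourly_cost = ptus * hours_deployed * hourly_rate_per_ptu
    reserved_cost = ptus * hours_in_term * hourly_rate_per_ptu * (1.0 - discount)
    return hourly_cost, reserved_cost


# Hypothetical workload: 100 PTUs, deployed for half of a 730-hour month,
# at an assumed 1.00 per PTU-hour and the ~64% 1-month discount.
hourly, reserved = term_costs(ptus=100, hours_in_term=730, hours_deployed=365,
                              hourly_rate_per_ptu=1.00, discount=0.64)
print(f"hourly: {hourly:.2f}, reserved: {reserved:.2f}")
print(f"1-month break-even: {break_even_utilization(0.64):.0%} of the term")
print(f"1-year break-even:  {break_even_utilization(0.70):.0%} of the term")
```

Under these assumptions, a deployment that runs for more than roughly a third of the term is cheaper on a reservation. Because actual discounts vary by model, region, and deployment type, rerun the calculation with your own rates.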

### Critical: deploy first, reserve second

PTU capacity availability is dynamic and not guaranteed by quota. **Always create your deployments before purchasing a reservation.** If you buy a reservation for more PTUs than you can deploy due to regional capacity constraints, you are paying for committed units you cannot use. Reservations are not interchangeable across deployment types: Global, Data Zone, and Regional each require a separate reservation purchase.
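One way to apply the deploy-first rule is to size each reservation from the PTUs you have already deployed, grouped by deployment type and location. The sketch below is a minimal illustration: the `Deployment` record, the scope labels, and the sample numbers are all hypothetical, and it does not call any Azure API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

# Hypothetical scope key: (deployment_type, location).
Scope = Tuple[str, str]


@dataclass(frozen=True)
class Deployment:
    name: str
    deployment_type: str  # "Global", "DataZone", or "Regional"
    location: str         # e.g. "global", "EU", or a region name
    ptus: int


def deployed_ptus_by_scope(deployments: Iterable[Deployment]) -> Dict[Scope, int]:
    """Sum the PTUs that are actually deployed in each scope."""
    totals: Dict[Scope, int] = defaultdict(int)
    for d in deployments:
        totals[(d.deployment_type, d.location)] += d.ptus
    return dict(totals)


def reservation_gap(deployed: Dict[Scope, int],
                    reserved: Dict[Scope, int]) -> Dict[Scope, int]:
    """Per scope: positive = deployed PTUs with no reservation (billed hourly),
    negative = reserved PTUs with nothing deployed against them."""
    scopes = set(deployed) | set(reserved)
    return {s: deployed.get(s, 0) - reserved.get(s, 0) for s in scopes}


deployments = [
    Deployment("chat-prod", "Global", "global", 200),
    Deployment("chat-eu", "DataZone", "EU", 100),
    Deployment("batch-eu", "DataZone", "EU", 50),
]
deployed = deployed_ptus_by_scope(deployments)
gap = reservation_gap(deployed, reserved={("Global", "global"): 250})
for scope, delta in sorted(gap.items()):
    print(scope, delta)
```

In this example the Global reservation is 50 PTUs larger than what is deployed, while 150 Data Zone PTUs remain on hourly billing. The surplus in one scope does not offset the shortfall in the other, which reflects the rule that reservations are not interchangeable across deployment types.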

### Legacy commitment migration

Customers on the old Commitment model (PTU-C / Provisioned Classic) can continue using it alongside the new Hourly/Reservation model, but it is not available for new customers and does not support models introduced after August 2024. Microsoft recommends migrating to the Reservation model for all new deployments.

***

**Sources**

* [Microsoft Learn — Save costs with Microsoft Foundry Provisioned Throughput Reservations](https://learn.microsoft.com/en-us/azure/cost-management-billing/reservations/microsoft-foundry)
* [Microsoft Learn — PTU costs and billing (Microsoft Foundry)](https://learn.microsoft.com/en-us/azure/foundry/openai/how-to/provisioned-throughput-onboarding)
* [Microsoft Learn — What is provisioned throughput?](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/provisioned-throughput)
* [Microsoft Tech Community — Unveiling Azure OpenAI PTU reservations and hourly pricing](https://techcommunity.microsoft.com/blog/azure-ai-foundry-blog/unveiling-azure-openai-service-provisioned-reservations-and-hourly-pricing/4214560)

⚠️ Discount percentages (\~64% monthly, \~70% annual) are based on GPT-4o Global Provisioned pricing as of January 2025. Actual savings vary by model, region, and deployment type. Always verify with the [Azure Pricing Calculator](https://azure.microsoft.com/en-us/pricing/calculator/).

