Azure AI Foundry & Azure OpenAI — PTU commitment brief

Data sourced: March 2026. Verify current figures at the Azure Pricing Calculator.

Scope note

Azure AI Foundry and Azure OpenAI PTU commitments are unified under a single reservation product: Microsoft Foundry Provisioned Throughput Reservations. There is no separate Azure OpenAI reservation — a single PTU reservation covers any mix of Foundry Models sold directly by Microsoft, including GPT-4o, o1, DeepSeek, Llama, and others. Reservations are model-independent: you commit to PTU count and deployment type, not to a specific model.

Coverage summary

PTU reservations apply only to provisioned throughput deployments: Global Provisioned, Data Zone Provisioned, and Regional Provisioned. The hourly billing model (no commitment) remains available for short-term or testing scenarios. Azure Reservations offer term discounts of up to ~64% (1-month) or ~70% (1-year) off the hourly rate. Reservations and deployments are decoupled: purchasing a reservation does not create a deployment and does not guarantee capacity.

What is covered: PTU-billed deployments across Global, Data Zone, and Regional Provisioned deployment types. All Foundry Models sold directly by Microsoft. Partial-hour deployments are pro-rated by the minute.
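
The per-minute pro-rating can be illustrated with a short calculation. This is a minimal sketch, not Azure's billing logic: it assumes the pro-rating applies linearly to the hourly rate, and the function name, the PTU count, and the $1.00 per PTU-hour rate are hypothetical placeholders rather than published prices.

```python
def prorated_hourly_cost(ptus: int, minutes_deployed: int, hourly_rate_per_ptu: float) -> float:
    """Cost of an hourly-billed provisioned deployment, pro-rated by the minute.

    Assumes simple linear pro-rating: each minute costs 1/60 of the hourly rate.
    """
    if ptus < 0 or minutes_deployed < 0 or hourly_rate_per_ptu < 0:
        raise ValueError("inputs must be non-negative")
    return ptus * hourly_rate_per_ptu * minutes_deployed / 60


# Hypothetical example: 100 PTUs deployed for 90 minutes at a placeholder rate.
cost = prorated_hourly_cost(ptus=100, minutes_deployed=90, hourly_rate_per_ptu=1.00)
print(f"${cost:.2f}")  # prints $150.00
```

Substitute the real per-PTU hourly rate for your model and deployment type from the Azure Pricing Calculator before relying on any figure.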

What is not covered: Standard token-based pay-as-you-go deployments, fine-tuning hosting, Batch API usage, storage, and networking. The legacy Commitment model (PTU-C / Provisioned Classic) is a separate mechanism that the new Reservation model does not cover. A reservation also does not guarantee capacity availability.

Azure Hybrid Benefit

Not applicable. PTUs are a model-independent compute unit with no SQL Server or Windows Server licensing component.

Deployment type / SKU coverage

| Deployment type | Reservation | Savings plan | 1-month discount | 1-year discount | Notes |
| --- | --- | --- | --- | --- | --- |
| Global Provisioned | ✅ Yes | ❌ No | ~64% | ~70% | Routes globally. Not interchangeable with Data Zone or Regional reservations. |
| Data Zone Provisioned | ✅ Yes | ❌ No | ~64% | ~70% | Data stays within a geographic zone (e.g. EU). Reservation must match Data Zone type. |
| Regional Provisioned | ✅ Yes | ❌ No | ~64% | ~70% | Strictest data residency. Reservation scoped to region + deployment type. |
| Standard (token PAYG) | ❌ No | ❌ No | n/a | n/a | No commitment instrument. Batch API offers 50% discount for latency-tolerant workloads. |
| Legacy Commitment (PTU-C) | ⚠️ Legacy | ❌ No | varies | varies | Not available to new customers or models post-Aug 2024. Migrate to Hourly/Reservation model. |

1-month vs. 1-year reservation

1-month reservation: Up to ~64% off the hourly rate. Lowest commitment risk — ideal for workloads that are ramping, models that may change, or where regional capacity availability is uncertain. Renews monthly.

1-year reservation (max savings): Up to ~70% off the hourly rate. Best for stable production workloads with a predictable PTU baseline. Because reservations and deployments are decoupled, you can change models freely during the term without re-purchasing. No Compute Savings Plan equivalent exists for PTUs — the Reservation is the only commitment instrument.
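
The trade-off between hourly billing and a reservation comes down to utilization: a reservation is paid for every hour of the term, while hourly billing is paid only while a deployment exists. The sketch below works through that arithmetic using the approximate "up to" discounts quoted in this brief. The function names, the 730-hour month, the PTU count, and the $1.00 per PTU-hour rate are illustrative assumptions, not published figures.

```python
HOURS_PER_MONTH = 730  # common billing approximation: 8,760 hours / 12

# Approximate "up to" discounts quoted in this brief; actual values vary.
DISCOUNTS = {"1-month": 0.64, "1-year": 0.70}


def hourly_monthly_cost(ptus: int, hourly_rate: float, utilization: float) -> float:
    """Hourly billing: pay only for the fraction of the month the deployment exists."""
    return ptus * hourly_rate * HOURS_PER_MONTH * utilization


def reserved_monthly_cost(ptus: int, hourly_rate: float, discount: float) -> float:
    """Reservation: pay the discounted rate for every hour, deployed or not."""
    return ptus * hourly_rate * HOURS_PER_MONTH * (1 - discount)


def break_even_utilization(discount: float) -> float:
    """Fraction of the month a deployment must run before reserving is cheaper."""
    return 1 - discount


ptus, rate = 100, 1.00  # hypothetical PTU count and placeholder $/PTU-hour
for term, discount in DISCOUNTS.items():
    reserved = reserved_monthly_cost(ptus, rate, discount)
    threshold = break_even_utilization(discount)
    print(f"{term}: ${reserved:,.0f}/month reserved, "
          f"cheaper than hourly above {threshold:.0%} utilization")
```

Under these assumptions, a deployment that runs for less than roughly a third of the month is cheaper on hourly billing, which is why the 1-month term suits ramping workloads and the 1-year term suits a stable baseline.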

Critical: deploy first, reserve second

PTU capacity availability is dynamic and not guaranteed by quota. Always create your deployments before purchasing a reservation. If you buy a reservation for more PTUs than you can deploy due to regional capacity constraints, you are paying for committed units you cannot use. Reservations are not interchangeable across deployment types — Global, Data Zone, and Regional each require a separate reservation purchase.
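
Because reservations do not transfer across deployment types, it helps to compare deployed and reserved PTUs per type before purchasing. The following is a minimal sketch of that check; it is plain bookkeeping, not an Azure API call. The `Coverage` class, the `reservation_coverage` function, and the sample PTU counts are hypothetical, and the sketch assumes that deployed PTUs beyond the reservation are billed at the hourly rate.

```python
from dataclasses import dataclass

DEPLOYMENT_TYPES = ("Global", "Data Zone", "Regional")


@dataclass(frozen=True)
class Coverage:
    covered: int  # deployed PTUs matched by a reservation of the same type
    hourly: int   # deployed PTUs beyond the reservation (assumed billed hourly)
    unused: int   # reserved PTUs with no matching deployment (paid for, idle)


def reservation_coverage(deployed: dict, reserved: dict) -> dict:
    """Match reserved PTUs to deployed PTUs, one deployment type at a time.

    A surplus in one type never offsets a shortfall in another.
    """
    result = {}
    for dtype in DEPLOYMENT_TYPES:
        d = deployed.get(dtype, 0)
        r = reserved.get(dtype, 0)
        covered = min(d, r)
        result[dtype] = Coverage(covered=covered, hourly=d - covered, unused=r - covered)
    return result


# Hypothetical inventory: PTUs actually deployed vs. PTUs reserved, by type.
deployed = {"Global": 200, "Regional": 50}
reserved = {"Global": 300, "Data Zone": 100}

for dtype, cov in reservation_coverage(deployed, reserved).items():
    print(f"{dtype}: covered={cov.covered}, hourly={cov.hourly}, unused={cov.unused}")
```

In this example the 100 surplus Global PTUs and the 100 Data Zone PTUs sit idle while the 50 Regional PTUs still bill hourly, which is the outcome that deploying first and reserving second is meant to avoid.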

Legacy commitment migration

Customers on the old Commitment model (PTU-C / Provisioned Classic) can continue using it alongside the new Hourly/Reservation model, but it is not available for new customers and does not support models introduced after August 2024. Microsoft recommends migrating to the Reservation model for all new deployments.


Sources

⚠️ Discount percentages (~64% monthly, ~70% annual) are based on GPT-4o Global Provisioned pricing as of January 2025. Actual savings vary by model, region, and deployment type. Always verify with the Azure Pricing Calculator.
